Disaster recovery reference architecture: Skytap region-to-region – cold recovery
The following diagram shows a high-level overview of a disaster recovery workflow in Skytap.
In this example, a second Skytap region is used as a cold disaster recovery location for an application running in Skytap. For an overview of the various disaster recovery scenarios, see Disaster recovery reference architectures.
This disaster recovery workflow uses several Skytap features.
- Create a VPN connection between your Skytap regions.
  For example, run a firewall appliance in each Skytap region, and use a Skytap VPN to create a secure connection between the regions.
  The VPN connection can be used to send updated application files to your recovery environment or to sync the databases in the two environments.
  For more information, see Managing VPNs and Private Network Connections.
- Create a template of your application stack in a second Skytap region.
  - Use our copy-to-region feature to easily create a copy of your virtual environment in another Skytap region.
    For more information, see Copying an environment to another region.
  - Create a template of the environment.
    This becomes your golden template for creating a new, working environment in a disaster recovery scenario. For more information, see Saving an environment as a template.
- Share the template with any Skytap users who will need access to it during disaster recovery.
  - Add the virtual environment to a project, and give access to any users who need to view, use, or edit the environment. Users can have different levels of access, depending on their project role.
    Give the project an easily identifiable name, such as Disaster recovery resources.
    For more information, see Sharing resources with projects.
- If you use a Private Network Connection for secure access between your corporate network and your primary Skytap region, set up a Private Network Connection between your corporate network and your recovery region. For more information, see Managing VPNs and Private Network Connections.
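As an illustration only, the setup steps above can be tracked as a simple readiness checklist. The sketch below is plain Python and does not call Skytap; the class, field, and function names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RecoverySetup:
    """Hypothetical checklist mirroring the setup steps above (not a Skytap API)."""
    vpn_between_regions: bool = False
    environment_copied_to_recovery_region: bool = False
    golden_template_created: bool = False
    template_shared_via_project: bool = False
    # Only relevant if the primary region uses a Private Network Connection.
    private_network_connection_required: bool = False
    private_network_connection_ready: bool = False


def missing_steps(setup: RecoverySetup) -> List[str]:
    """Return the names of the setup steps that are still outstanding."""
    required = [
        "vpn_between_regions",
        "environment_copied_to_recovery_region",
        "golden_template_created",
        "template_shared_via_project",
    ]
    if setup.private_network_connection_required:
        required.append("private_network_connection_ready")
    return [name for name in required if not getattr(setup, name)]


if __name__ == "__main__":
    setup = RecoverySetup(vpn_between_regions=True, golden_template_created=True)
    for step in missing_steps(setup):
        print("still to do:", step)
```

A checklist like this could be reviewed as part of a regular disaster recovery audit; it only records what has been done and does not verify anything in Skytap itself.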
Periodically update the copy of your application stack in the recovery region.
On a schedule based on your business requirements:
- Create an environment from your read-only template.
- Connect the new environment to the VPN, and deploy operating system patches, application updates, and other changes as needed. Use your standard configuration management tooling (Ansible, Puppet, etc.) to bring the previously suspended VMs up to date with the latest operating system and application updates.
  Alternatively, if you have a regular maintenance period for your application, you can suspend the primary environment and copy it to your recovery region.
- Attach a public IP address to the network appliance in the environment and perform a “smoke test” of the recovery environment. Use your standard test process to validate that the application works as expected and that it functions properly as a disaster recovery environment.
- If your disaster recovery plan includes recovering data from a separate, more frequently updated data store, remove any unneeded data from the Skytap virtual machines. This helps reduce the storage space consumed by the Skytap template.
- Save the updated environment as a new template.
  Give the template a unique and discoverable name, such as ApplicationName_DR_2018_04_01.
- Add the new template to the project you created earlier. Validate that the project members have the appropriate level of access to create environments from the template.
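One way to keep template names unique and discoverable is to generate them from the application name and the date the template was saved. The following is a minimal sketch that assumes the ApplicationName_DR_YYYY_MM_DD pattern shown in the example name above; the function name is hypothetical, and the code does not call Skytap.

```python
from datetime import date


def dr_template_name(app_name: str, created: date) -> str:
    """Build a template name such as ApplicationName_DR_2018_04_01."""
    # The date format spec zero-pads month and day, so names sort chronologically.
    return f"{app_name}_DR_{created:%Y_%m_%d}"


if __name__ == "__main__":
    print(dr_template_name("ApplicationName", date(2018, 4, 1)))
    # prints ApplicationName_DR_2018_04_01
```

Because the year, month, and day are zero-padded and ordered from largest to smallest unit, sorting the names alphabetically also sorts them by date.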
During disaster recovery cutover
Follow your organization’s documented disaster recovery procedures to:
- Create an environment from the most recent template.
- Deploy operating system patches, application updates, and other changes as needed. Restore data, if needed.
- Attach a public IP address to the network appliance in the environment.
- Update your DNS service to direct traffic to the Skytap public IP address.
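Two of the cutover steps above lend themselves to small helper scripts: picking the most recent template and confirming that DNS now points at the recovery environment. The sketch below assumes templates are named ApplicationName_DR_YYYY_MM_DD, as in the earlier naming example; the function names, the hostname, and the IP addresses are placeholders, and the code does not call Skytap. The DNS check reflects only what the local resolver returns, so cached records may delay the result until their TTL expires.

```python
import re
import socket
from datetime import date
from typing import Callable, Iterable, Optional


def latest_dr_template(names: Iterable[str], app_name: str) -> Optional[str]:
    """Pick the newest template name of the form <app_name>_DR_YYYY_MM_DD."""
    pattern = re.compile(
        rf"^{re.escape(app_name)}_DR_(\d{{4}})_(\d{{2}})_(\d{{2}})$"
    )
    dated = []
    for name in names:
        match = pattern.match(name)
        if not match:
            continue  # ignore templates that do not follow the convention
        try:
            created = date(*map(int, match.groups()))
        except ValueError:
            continue  # ignore impossible dates such as 2018_13_40
        dated.append((created, name))
    return max(dated)[1] if dated else None


def dns_points_to(
    hostname: str,
    expected_ip: str,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> bool:
    """Return True if hostname currently resolves to expected_ip (IPv4)."""
    try:
        return resolve(hostname) == expected_ip
    except OSError:
        return False  # the lookup failed, so the record is not usable yet


if __name__ == "__main__":
    templates = ["ApplicationName_DR_2018_01_15", "ApplicationName_DR_2018_04_01"]
    print(latest_dr_template(templates, "ApplicationName"))
    # Placeholder hostname and documentation-range IP; a stub resolver
    # stands in for a real DNS lookup here.
    print(dns_points_to("app.example.com", "203.0.113.10",
                        resolve=lambda host: "203.0.113.10"))
```

Passing the resolver as a parameter keeps the check easy to test without network access; in real use, the default `socket.gethostbyname` performs the lookup.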
When you fail over to an alternate region, application response times may change. Users who are farther from the recovery region may experience slower response times because of network latency.
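To get a rough sense of the latency effect, you can estimate a lower bound on round-trip time from distance alone. The sketch below assumes signals travel through optical fiber at about 200,000 km/s (roughly two-thirds of the speed of light in a vacuum). Real round-trip times are higher because of routing, queuing, and processing delays, and the distance used here is illustrative rather than a measurement between actual Skytap regions.

```python
FIBER_SPEED_KM_PER_S = 200_000.0  # approximate signal speed in optical fiber


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time, in milliseconds, for a given distance."""
    if distance_km < 0:
        raise ValueError("distance_km must be non-negative")
    # Out and back (factor of 2), converted from seconds to milliseconds.
    return 2_000 * distance_km / FIBER_SPEED_KM_PER_S


if __name__ == "__main__":
    # A user 4,000 km from the recovery region: at least 40 ms per round trip.
    print(f"{min_round_trip_ms(4_000):.1f} ms")
```

An application that makes many sequential round trips per page load multiplies this delay, which is why distant users may notice the difference after failover.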
For other disaster recovery examples, see Disaster recovery reference architectures.