Improving Application Environments in the Cloud

Running mission-critical applications in the cloud is possible with the right strategy and architecture.

April 18, 2022

Many organizations that manage their own infrastructure have tried to create high availability (HA) application environments, in which applications are accessible at least 99.99% of the time, by using multiple servers or virtual machines (VMs) configured as a failover cluster. If the cluster node running a mission-critical application goes down, a secondary node in the failover cluster can take over within seconds and pick up where the other node left off.
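To put the 99.99% figure in concrete terms, the short sketch below converts an availability target into a yearly downtime budget. The function name and the 365-day year are illustrative assumptions, not part of any vendor's SLA definition.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability_pct: float) -> float:
    """Minutes per year a service may be down and still meet the target."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {downtime_budget_minutes(target):.1f} minutes/year")
```

At 99.99%, the budget works out to about 52.6 minutes of downtime per year, which is why failover has to complete in seconds rather than minutes.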

Such failover clusters typically rely on a storage area network (SAN) for shared data storage. But a shared SAN itself constitutes a single point of failure that can compromise high availability. If the SAN goes down, the SQL Server or Oracle database supporting your mission-critical systems is unavailable, and it doesn't matter how many nodes in the failover cluster might be poised to interact with it.
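The arithmetic behind that single point of failure can be sketched as follows. The availability figures are hypothetical, chosen only for illustration, and the model assumes components fail independently, which real infrastructure does not always do.

```python
def redundant_availability(node: float, count: int) -> float:
    """Availability of `count` independent nodes when any one of them suffices."""
    return 1 - (1 - node) ** count


def serial_availability(*components: float) -> float:
    """Availability of a chain in which every component must be up."""
    result = 1.0
    for availability in components:
        result *= availability
    return result


# Hypothetical figures: each node and the shared SAN are up 99.9% of the time.
two_nodes = redundant_availability(0.999, 2)
four_nodes = redundant_availability(0.999, 4)

print(f"two nodes alone:      {two_nodes:.4%}")
print(f"two nodes + one SAN:  {serial_availability(two_nodes, 0.999):.4%}")
print(f"four nodes + one SAN: {serial_availability(four_nodes, 0.999):.4%}")
```

Under these assumptions, adding nodes pushes the compute tier well past 99.99%, but the cluster as a whole stays just under the 99.9% of the shared storage it depends on.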

For organizations considering the cloud for an HA application environment, there’s an even more pressing problem: while some cloud vendors do offer shared storage options, not all of those options guarantee 99.99% availability.

Does that mean you need to abandon the cloud as an option for an HA application environment? No. It just means you need to rethink how you configure a failover cluster.

Understanding High Availability in the Cloud

The first thing to understand about the cloud is that it’s very easy to spin up and configure new VMs. In fact, Azure, AWS, and Google all make it easy to create high availability clusters composed of multiple VMs running in different data centers (also known as zones or availability zones). By configuring your VMs in multiple zones, you eliminate the risk that a single zone-wide catastrophe could take down all your critical infrastructure.

But if you read the service level agreements (SLAs) from the major cloud providers, you’ll notice a critical caveat: If you configure your HA cluster with VMs in multiple zones, the SLAs guarantee that you’ll be able to access at least one of those nodes at least 99.99% of the time. They don’t guarantee that your application will be operational, only that you’ll be able to access one of the VMs.

That’s a critical distinction that harks back to the problem with a SAN: It doesn’t matter how many VMs you can access if your applications can’t access your data.

Share Data, not Storage

This brings us back to the issue of rethinking how you configure a failover cluster in the cloud.

If you expect any VM in your cloud-based failover cluster to be able to take over your production workloads in the event of a failure (and that is why you've deployed an HA solution to begin with), you need to configure each VM in your failover cluster with its own storage. Moreover, you need a mechanism that will actively replicate the data in storage on the active cluster node to the secondary nodes. That way, if the active VM goes offline for any reason, the cluster can fail over to a secondary VM, which has all the data needed to enable your application to come back online in a matter of seconds.

A variety of data replication solutions can provide the services your organization needs to ensure true application high availability in a cloud-based deployment. Look for synchronous, block-level replication services, to start. Synchronous services ensure that any transaction written to storage on the primary system is also written to storage on the secondary systems before the transaction is considered complete. Block-level replication services are also important because they replicate everything written to primary storage, regardless of which application wrote it. If your primary cloud infrastructure supports more than one application, or if you use that storage as a repository for multiple applications, all that data (not just the data associated with your Oracle or SQL Server database) will be replicated to the secondary infrastructure, where it will be available to any applications or users if the secondary infrastructure is unexpectedly called into service.
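The two properties described above can be illustrated with a small in-memory model. This is a minimal sketch, not any vendor's implementation: `BlockDevice`, `SyncReplicatedVolume`, and the node names are hypothetical, and real replication products must also handle network failures, resynchronization, and split-brain conditions that are ignored here.

```python
from typing import Dict, List


class BlockDevice:
    """Hypothetical stand-in for the storage attached to one cluster node."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks: Dict[int, bytes] = {}
        self.online = True

    def write_block(self, index: int, data: bytes) -> None:
        if not self.online:
            raise IOError(f"{self.name} is offline")
        self.blocks[index] = data


class SyncReplicatedVolume:
    """Treats a write as complete only once every node's storage holds it."""

    def __init__(self, primary: BlockDevice, secondaries: List[BlockDevice]) -> None:
        self.primary = primary
        self.secondaries = list(secondaries)

    def write(self, index: int, data: bytes) -> None:
        # Block-level: the volume copies raw blocks and never asks which
        # application (database, file share, anything else) produced them.
        self.primary.write_block(index, data)
        for secondary in self.secondaries:
            # Synchronous: an exception here means the caller never gets
            # an acknowledgement. A real product would also have to
            # reconcile the partially applied write; this sketch does not.
            secondary.write_block(index, data)

    def fail_over(self) -> BlockDevice:
        """Promote the first secondary, which holds every acknowledged write."""
        self.primary = self.secondaries.pop(0)
        return self.primary


node_a = BlockDevice("node-a")
node_b = BlockDevice("node-b")
volume = SyncReplicatedVolume(node_a, [node_b])

volume.write(0, b"database page")
volume.write(1, b"file from another application")

node_a.online = False             # simulate losing the active node
new_primary = volume.fail_over()
print(new_primary.name, new_primary.blocks == node_a.blocks)
```

Running the sketch prints `node-b True`: the promoted node already holds every acknowledged block, including the one that did not come from the database.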

Dave Bermingham

Dave Bermingham is the Senior Technical Evangelist at SIOS Technology. He is recognized within the technology community as a high-availability expert and has been honored to be elected a Microsoft MVP for the past 12 years: six years as a Cluster MVP and six years as a Cloud and Datacenter Management MVP. Dave holds numerous technical certifications and has more than thirty years of IT experience, including in finance, healthcare, and education.
