Operating your business on the public cloud means sharing responsibility with your cloud IaaS provider for keeping your uptime, well, up. Cloud providers typically offer tools, such as failover zones and VM shadows, to help you run services resiliently, but those are tools you must wield properly for optimal effect. For some services, like file server backup and disaster recovery, the path of best practices is well-paved with existing solutions. However, running end-user computing – one of your most mission-critical workloads – with high resiliency in the public cloud comes with important considerations.
What’s keeping CIOs up at night?
Downtime that isn't your fault, yet that your business leaders blame you for. End-user computing has long been an on-premises workload, addressed either with physical PCs and workstations or via do-it-yourself VDI managed by an in-house IT team or a managed service provider (MSP). In the quest for zero downtime, IT organizations try to plan for all possible outages. This is just good business continuity planning – anticipating various disruption scenarios and then determining how each will be addressed. Most organizations have at least a basic plan in place for their on-premises infrastructure, including backup and DR procedures, with processes and playbooks honed and documented by IT.
This was all well and good until 2020, when the pandemic happened, and the definition of mission-critical computing expanded overnight to include end-user computing for a huge percentage of remote workers. Suddenly, downtime and productivity loss became acute and even existential for many businesses, yet IT teams found that the existing playbooks for avoiding business disruption no longer applied.
Some organizations attempted to support their newly remote workforce with a VPN approach, though many ran into performance bottlenecks and security concerns, especially as the realization dawned that supporting remote work was not likely a short-term challenge. Other companies expanded their use of on-premises VDI, with varying degrees of success. And many companies looked to the cloud as a long-term solution to support a flexible “work from anywhere” strategy going forward. However, if you move all of your critical services, including end-user computing, into a single cloud region, then you have simply moved the same outage risk from your on-premises infrastructure to a single cloud infrastructure.
The case for a multi-region strategy for end-user computing
The primary reason to consider a multi-region cloud strategy is resilience. Centralizing desktops in the cloud gives you a degree of control over many potential problems, but one particularly challenging risk remains: an entire public cloud region could experience an outage. Though this is unusual, it can happen – and it has happened – resulting in business disruption. Organizations moving desktop workloads to the public cloud need to plan for this eventuality just as they have planned for other outage scenarios.
So, what happens to end-user computing when an entire cloud region goes down? If you don’t have the right cloud desktop solution, your users can’t access their cloud desktops, and that means the business is on life support until service is restored. For some industries, such as financial services, the cost of this lost productivity can be millions of dollars per hour. This is an essential consideration as you think about the ROI of the solution options you’re evaluating.
For most organizations relying on traditional recovery processes, Recovery Time Objective (RTO) and Recovery Point Objective (RPO) are measured in days. The cloud can offer a far more reliable way to give users resilient access to virtual desktops because of built-in BC/DR capabilities: users can access their desktops from anywhere. In addition, some solutions offer "stand-by" cloud desktops that your IT team can activate in minutes when disaster strikes. And if you've chosen a solution that can also fail over quickly to an alternate cloud region when the primary region goes down, you've achieved the gold standard of resilient end-user computing: an RTO measured in minutes and an RPO of less than 24 hours. In fact, the cost of a solution that affords this recovery time is likely dramatically lower than the productivity losses incurred with an RTO measured in days.
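The ROI argument above comes down to simple arithmetic. A minimal sketch, using purely illustrative numbers (the hourly loss figure and outage durations below are assumptions, not benchmarks from any study):

```python
# Back-of-the-envelope downtime cost comparison.
# COST_PER_HOUR is an assumed productivity loss, not a measured figure.
COST_PER_HOUR = 1_000_000  # dollars of lost productivity per hour of downtime

def downtime_cost(rto_hours: float, cost_per_hour: float = COST_PER_HOUR) -> float:
    """Rough productivity loss for a single outage lasting rto_hours."""
    return rto_hours * cost_per_hour

# Traditional recovery measured in days (assume 2 days = 48 hours)
legacy_cost = downtime_cost(48)

# Multi-region failover measured in minutes (assume 30 minutes)
failover_cost = downtime_cost(0.5)

print(f"RTO of 2 days:     ${legacy_cost:,.0f}")    # $48,000,000
print(f"RTO of 30 minutes: ${failover_cost:,.0f}")  # $500,000
```

Even if a multi-region DaaS subscription costs more per seat, one avoided regional outage can cover the difference many times over.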
Why multi-cloud for resilient end-user computing?
When organizations maintained on-premises data centers, those with high availability requirements would typically employ redundant networking and data centers to reduce risk. Traditionally, end-user computing was not part of that multi-datacenter strategy because VDI doesn't scale well horizontally – it is too expensive, complex, and difficult to maintain across multiple data centers. However, as cloud strategies mature, many organizations now require a multi-cloud approach because they have to consider the possibility of a provider-wide outage, and they need to avoid being overly reliant on a single provider. Fortunately, in the era of the public cloud, deploying cloud desktops across multiple cloud providers – as long as those desktops can be managed from a single console – can dramatically simplify IT's ability to align the goal of flexible end-user computing with broader organizational goals for resiliency.
Getting started with multi-cloud end-user computing
It makes good sense – and is fast becoming the norm – to choose cloud vendors based on an organization's priorities, assigning some workloads to one cloud vendor while deploying others in another vendor's cloud. Solutions such as Google Anthos and Azure Arc are great examples of core technologies that make it easier for organizations to execute their multi-cloud strategies. Multi-cloud end-user computing must be just as simple for IT.
It stands to reason that organizations would improve resilience by deploying cloud desktops across multiple cloud regions or cloud vendors, giving them better RTO and RPO than on-premises solutions. But the last thing IT needs is more complexity, and most virtual desktop solutions are fraught with it. Instead, look for a desktop-as-a-service (DaaS) platform that enables backup to an alternate cloud region, which protects against regional cloud outages. If you want to leverage multiple clouds for even more resilience and to double down on low-latency desktop access, consider solutions that centralize management into a single console. Otherwise, it's complicated to keep track of the global user experience and stay ahead of any issues that may arise. Even the cloud incarnations of legacy VDI, where the broker is hosted in the cloud, can't deliver this kind of simplified management – an underlying architectural deficiency prevents them from achieving it.
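The core of a regional failover capability is a health check plus a decision rule: prefer the primary region, fall back to the standby when the primary is unreachable. A minimal sketch of that logic – the region names and health-endpoint URLs are hypothetical placeholders, and a real DaaS platform would expose its own health and failover APIs:

```python
# Minimal sketch of a region-failover decision for cloud desktops.
# Endpoint URLs and region names below are hypothetical examples.
import urllib.request

REGIONS = {
    "primary": "https://desktops-us-east.example.com/health",
    "standby": "https://desktops-us-west.example.com/health",
}

def region_healthy(url: str, timeout: float = 5.0) -> bool:
    """Probe a health endpoint; treat any error or non-200 as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, timeouts, DNS failures
        return False

def pick_region(healthy=region_healthy) -> str:
    """Prefer the primary region; fail over to standby if it is down."""
    if healthy(REGIONS["primary"]):
        return "primary"
    return "standby"
```

Passing the health-check function as a parameter keeps the failover decision testable without network access, and mirrors the point above: the decision logic belongs in one place (a single console), not scattered per cloud.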
The resilient cloud desktop strategy
Adoption of new technology brings with it costs and benefits. In the case of a multi-cloud strategy, the cost is usually added complexity. While IT leaders normally have time to mull over the pros and cons of new tools and options, the pandemic afforded no such luxury. Remote work had to ramp up quickly, and many organizations found their approach lacked the performance, scalability, flexibility, and security they needed. A multi-region or multi-cloud DaaS approach to cloud desktops can help IT teams deliver reliable service to end users while minimizing complexity and strengthening security – with the ongoing value of flexibly supporting the need to work from anywhere. Use the recommendations above to craft a cloud strategy tailored to your unique needs.
Jimmy Chang is chief product officer at Workspot.