Here are some tips for evaluating your cloud storage options.
Amazon Web Services, Google, and Azure dominate the cloud service provider space, but for some applications it may make sense to choose a smaller provider specializing in your app class and able to deliver a finer-tuned solution. No matter which cloud provider you choose, it pays to look closely at the wide variety of cloud storage services they offer to make sure they will meet your company's requirements.
The big cloud providers offer two major classes of storage: local instance storage bundled with selected instance types, and a selection of network storage options for persistent storage and for sharing data between instances.
As with any storage, performance is a factor in your decision-making process. There are many shared network storage alternatives, including storage tiers ranging from very hot to freezing cold. Within the top tiers, performance and price vary with your choice of replica count, and there are additional charges for copying data to other storage spaces.
The very hot tier is moving to SSD, and even here there are differences between NVMe and SATA SSDs, which cloud tenants typically see as different IOPS levels. For large instances and GPU-based instances, the faster choice is probably better, though this depends on your use case.
At the other extreme, cold and “freezing” storage, the choice is between disk and tape, which affects data retrieval times. With tape, retrieval can take as much as two hours, compared with just seconds for disk.
Data security and vendor reliability are two other key considerations when choosing a cloud provider that will store your enterprise data. Continue on to get tips for your selection process.
Local instance storage
Pricing for local instance storage is a key issue, complicated by a smorgasbord of instance types, sizes and prices. There is a price war going on, which intensifies or wanes depending on cloud providers' infrastructure cost/performance as well as demand. What doesn’t hit the headlines much are the big differences in cloud provider offerings. Local instance storage, bundled with the compute instances, can vary in latency and in performance measured in IOPS or MB/s, and usually has a metered performance limit as well.
Similarly priced instances may deliver considerably different results depending on the cloud provider. The variation can be considerable, perhaps as much as 4:1, so using price as the sole selection arbiter isn't effective. Most instance storage is heading towards SSD rather than HDD, so performance will range from decent to very good for most use cases.
Overall, you may be handling, and paying for, more instances than are optimal. Figuring out the best choice will usually involve some modelling and sandboxing.
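That modelling can start simply. As a hedged sketch of the idea, the snippet below ranks hypothetical instance offerings by cost per unit of delivered storage performance; all provider names, prices and IOPS limits are made-up placeholders, to be replaced with the figures from your own sandbox tests.

```python
# Minimal cost/performance model for instance storage.
# All providers, prices and IOPS limits below are illustrative only.

offerings = [
    # (provider, instance type, $/hour, metered IOPS limit)
    ("ProviderA", "std-4", 0.20, 3_000),
    ("ProviderB", "gen-4", 0.19, 12_000),
    ("ProviderC", "m-4",   0.22, 6_000),
]

def dollars_per_million_iops_hour(price, iops):
    """Dollars per hour to sustain one million IOPS (lower is better)."""
    return price / iops * 1_000_000

ranked = sorted(offerings,
                key=lambda o: dollars_per_million_iops_hour(o[2], o[3]))
for provider, itype, price, iops in ranked:
    rate = dollars_per_million_iops_hour(price, iops)
    print(f"{provider:10s} {itype:6s} ${rate:.2f} per million IOPS-hour")
```

Note how a slightly cheaper instance with a much higher IOPS cap can dominate the ranking, which is exactly why price alone is a poor selection arbiter.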
(Image: Carsten Reisinger/Shutterstock)
The next area to look at is the storage ecosystem. Cloud providers don’t operate “virtual iron” anymore. Rather, they offer many storage services, including database structures and tools for big data, as well as the common block, file and object storage. These services can form the environment for a sandbox test. Note that providers don’t always offer all these services on their cloud; they may offer them through third parties.
Diving deeper, each cloud provider has a set of basic management tools for the likes of scripting, image management, and security and, most importantly, setting up virtual networks and storage. Since these will be a major part of each admin's universe once a choice is made, get the admins to sandbox those tools and rate them for ease of use and completeness.
Hybrid cloud interface
Part of this toolkit should handle the interface to your private cloud, if you are among the majority of users planning hybrid cloud solutions. Here’s where some specialist cloud storage providers shine: they focus on delivering low latency in both the private and public segments of your cloud, something that is difficult to achieve by simply linking cloud storage directly to in-house storage. These smaller cloud service providers may actually use one of the big three as their storage site. That is fine, since they add value with caching, as well as compression and deduplication tools that save a lot of raw space and cost.
The cost of compute
This may seem out of place in an article focused on cloud storage, but the wrong storage choices could push up the instance count or force larger DRAM or vCore configurations. My recommendation is to conduct a test of your actual apps on each shortlisted provider, using instances and storage that seem most relevant and rightsized for your apps, compare the results, then calculate three measurements:
- The run time of the test multiplied by the cost of the instances used
- The cost to store your data and then to access, use or transfer it
- The average latency to get your data with the chosen storage options
In each case, lowest is best. There are other things to take into account, including lifetime data management through different tiers of cloud provider storage, but the three metrics above should get you to a shortlist of two or three suppliers. If you maintain a TCO calculation, build these results into it.
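The three measurements above are easy to tabulate once the sandbox runs are done. As a rough sketch, with entirely made-up provider names and numbers standing in for your own test results:

```python
# Tabulate the three test metrics per shortlisted provider.
# All names and figures are placeholder values for illustration.

tests = {
    "ProviderA": {"runtime_hr": 2.0, "instance_cost_hr": 0.40,
                  "storage_access_cost": 12.0, "avg_latency_ms": 4.5},
    "ProviderB": {"runtime_hr": 1.6, "instance_cost_hr": 0.48,
                  "storage_access_cost": 15.0, "avg_latency_ms": 2.1},
}

for name, t in tests.items():
    # Metric 1: run time of the test multiplied by instance cost
    compute_cost = t["runtime_hr"] * t["instance_cost_hr"]
    # Metrics 2 and 3 come straight from the billing and latency data
    print(f"{name}: compute=${compute_cost:.2f}, "
          f"storage=${t['storage_access_cost']:.2f}, "
          f"latency={t['avg_latency_ms']}ms")
```

With real numbers plugged in, the side-by-side table makes the "lowest is best" comparison, and any trade-offs between the three metrics, immediately visible.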
(Image: Petr Kopka/Shutterstock)
Cloud storage options also vary in performance, latency and cost of data transfer. Cloud providers typically have three or more storage tiers, ranging from fast local instance storage using HDDs or, increasingly, SSDs, to “ice-cold” archival storage using tape or disk. Comparing tape-based AWS Glacier with Google’s disk-based Coldline storage illustrates the tape-versus-disk difference: time to access data can be as much as two hours for AWS versus around 10 seconds for Google. Price is not everything!
Vendors differ when it comes to storage performance and latency. This reflects underlying architectures and things like caching, flash storage, and software. These all interact in a complex way, but the net is that any given app will see differences depending on the provider. Consequently, you should measure storage performance in the sandbox tests, too.
A big selling point with public clouds is that you can build a very resilient solution inexpensively, using multi-zone operations and storage. There have been some notable outages, but post-mortems generally point to admin errors at the cloud provider, and recovery is typically swift for customers who use multi-zone storage.
Multi-zone operation adds some cost, which varies depending on the instance and storage types chosen. It adds some complexity to TCO calculations, but it is absolutely a necessary part of your setup.
(Image: Timofeev Vladimir/Shutterstock)
If you don’t encrypt your data, now is the time to start! Cloud providers are beginning to provide encryption options for data at rest, though they may only offer a version with provider-owned keys, which may not comply with regulations such as HIPAA and SOX. A few have evolved to a compliant storage model supporting user-owned keys.
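The essence of the user-owned-key model is that data is encrypted before it ever reaches the provider, and the key stays under your control. The sketch below illustrates that flow conceptually; the XOR “cipher” is a stand-in for illustration only, and in practice you would use a vetted library (for example, AES-GCM via the `cryptography` package) with keys held in your own KMS or HSM.

```python
# Conceptual client-side encryption with a user-owned key.
# The XOR one-time-pad below is a placeholder cipher for illustration;
# use a vetted library (e.g. AES-GCM) in any real deployment.
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    # One-time-pad-style XOR; the key must be as long as the data.
    return bytes(d ^ k for d, k in zip(data, key))

plaintext = b"quarterly-financials"
key = secrets.token_bytes(len(plaintext))  # stays in YOUR key store

ciphertext = xor_bytes(plaintext, key)     # only this goes to the provider
recovered = xor_bytes(ciphertext, key)     # decrypt on retrieval
print(recovered == plaintext)  # True
```

Because the provider only ever stores ciphertext, a breach on their side exposes nothing readable, which is the property the compliance regimes above are after.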
Cloud providers don’t publish roadmaps for their businesses. Even so, some consistently stand out for innovation and so can be expected to keep moving ahead of the pack over time. The impact is, eventually, lower prices, faster instances and more service features. The leaders will also embrace technologies such as Docker and software-defined infrastructure before most of the industry, using them to further drive down costs.
This is something to take into account in your TCO calculation. Today’s prices could drop within 30 days!