Drilling Down To The Data Problem Inside The Data Center
Legacy data centers have an efficiency problem, with multiple systems processing the same data several times. It's time to rethink that.
August 20, 2015
IT departments are under growing pressure to increase efficiency, which generally means changing the way IT operates -- anything from small course corrections to major initiatives. Storage efficiency has traditionally referred to processes that reduce storage capacity and bandwidth requirements.
Compression, thin provisioning, data deduplication and even storage virtualization have had a huge impact on storage efficiency, IT efficiency and, ultimately, the total cost of ownership (TCO) of enterprise storage. These technologies are pervasive across data center services such as production storage, backup, WAN optimization and archiving.
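To make the capacity side of that concrete, here is a minimal sketch -- in Python, with a fixed 4 KB block size and function names chosen purely for illustration -- of how block-level deduplication and compression cut the bytes that actually land on disk when many similar virtual machine images are stored:

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # illustrative fixed block size; real systems vary

def store(data: bytes, block_store: dict) -> tuple[int, int]:
    """Write data through a toy dedupe+compress pipeline.

    Returns (logical_bytes, physical_bytes) so the caller can compute
    the effective data-reduction ratio.
    """
    logical = len(data)
    physical = 0
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        fingerprint = hashlib.sha256(block).hexdigest()
        if fingerprint not in block_store:                   # dedup: only new blocks cost capacity
            block_store[fingerprint] = zlib.compress(block)  # compression shrinks what remains
            physical += len(block_store[fingerprint])
    return logical, physical

if __name__ == "__main__":
    block_store: dict[str, bytes] = {}
    # Ten "VM images" sharing mostly identical content -- a common virtualization pattern.
    base = b"operating system bits " * 20000
    logical_total = physical_total = 0
    for i in range(10):
        vm_image = base + f"unique config {i}".encode() * 100
        logical, physical = store(vm_image, block_store)
        logical_total += logical
        physical_total += physical
    print(f"logical: {logical_total} bytes, physical: {physical_total} bytes, "
          f"reduction: {logical_total / physical_total:.1f}x")
```

The reduction ratio in the final line is exactly the number these technologies are sold on: the same logical data, a fraction of the physical capacity.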
In today's post-virtualization data center, virtualized workloads with different IO patterns share the physical resources of the host, so their IO streams reach the storage layer as competing random IO. New efficiency requirements emerge as the IOPS needed to service virtual workloads climb. Common band-aids for the IOPS problem include over-provisioning HDDs or investing in SSDs/flash; both drive up the cost per gigabyte of storage allocated to each virtual machine.
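A rough back-of-the-envelope calculation shows why over-provisioning HDDs for IOPS inflates the cost per gigabyte. The per-drive figures below are illustrative assumptions, not vendor numbers; substitute your own specs and prices:

```python
import math

# Illustrative assumptions only -- swap in your own drive specs and prices.
HDD_IOPS = 150            # rough ballpark for a 7.2K RPM SATA drive
HDD_CAPACITY_GB = 4000
HDD_PRICE_USD = 250.0     # hypothetical price per drive

def drives_needed(capacity_gb: float, iops: float) -> int:
    """The spindle count is set by whichever requirement is harder to meet."""
    for_capacity = math.ceil(capacity_gb / HDD_CAPACITY_GB)
    for_iops = math.ceil(iops / HDD_IOPS)
    return max(for_capacity, for_iops)

# Example: 200 VMs, each needing 100 GB of capacity and 50 IOPS.
capacity_gb = 200 * 100   # 20 TB
iops = 200 * 50           # 10,000 IOPS

needed = drives_needed(capacity_gb, iops)
print(f"drives for capacity alone: {math.ceil(capacity_gb / HDD_CAPACITY_GB)}")  # 5
print(f"drives to also meet IOPS:  {needed}")                                    # 67
print(f"cost per provisioned GB:   ${needed * HDD_PRICE_USD / capacity_gb:.2f}")
```

Five drives would satisfy the capacity requirement, but the IOPS target forces dozens of spindles, and every extra spindle is capacity the business is paying for but does not need.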
Data efficiency builds on familiar storage efficiency technologies such as compression and deduplication, but applies them in a way that positively impacts capacity, IOPS and cost in today's data centers. This is one of the key benefits of a hyperconverged infrastructure, which, at the highest level, is a way to deliver cloudlike economics and scale without compromising the performance, reliability and availability expected in a data center.
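The reason inline data efficiency helps IOPS as well as capacity is that a duplicate block never has to be written to the backend at all; only metadata is updated. Here is a minimal sketch of that write path (the class and field names are hypothetical, not any particular product's API):

```python
import hashlib
import zlib

class InlineEfficientStore:
    """Toy write path: dedupe and compress before data ever reaches the backend."""

    def __init__(self):
        self.blocks = {}        # fingerprint -> compressed block (the "backend")
        self.refcounts = {}     # fingerprint -> number of logical references
        self.backend_writes = 0

    def write_block(self, block: bytes) -> str:
        fingerprint = hashlib.sha256(block).hexdigest()
        if fingerprint in self.blocks:
            # Duplicate: update metadata only -- no backend IO, no new capacity.
            self.refcounts[fingerprint] += 1
        else:
            self.blocks[fingerprint] = zlib.compress(block)
            self.refcounts[fingerprint] = 1
            self.backend_writes += 1    # only unique blocks consume a write IOP
        return fingerprint

store = InlineEfficientStore()
for _ in range(1000):
    store.write_block(b"\x00" * 4096)   # 1,000 logical writes of an identical block
print(f"logical writes: 1000, backend writes: {store.backend_writes}")
```

One thousand logical writes become a single backend write, which is how the same technique that saves capacity also relieves pressure on IOPS.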
For example, in a legacy data center, all of the business applications and infrastructure applications run within the same shared x86 resource pool. If each infrastructure application is processing the data separately, then the same data will be processed again and again.
There is nothing efficient about processing the same data nine or more times. It wastes CPU and memory within the infrastructure and likely forces more hosts into the environment, and in the end it does not deliver the expected cost savings. Taking a different approach to infrastructure functionality can allow organizations to solve what is known as "the data problem."
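To put a rough number on that redundancy, imagine -- purely as an illustration; the nine stage names and the sample workload are hypothetical -- nine infrastructure functions that each independently fingerprint and compress the same data, versus a design that processes it once and lets every stage reuse the result:

```python
import hashlib
import time
import zlib

DATA = b"application data " * 500000   # ~8 MB of sample data

# Hypothetical stand-ins for the infrastructure functions touching the same data.
STAGES = ["backup", "replication", "wan_optimization", "archiving", "snapshot",
          "cloning", "tiering", "caching", "dr_copy"]

def process(data: bytes) -> tuple[str, bytes]:
    """The work many infrastructure products repeat independently."""
    return hashlib.sha256(data).hexdigest(), zlib.compress(data)

# Legacy approach: every product processes the data itself.
start = time.perf_counter()
for _ in STAGES:
    process(DATA)
legacy = time.perf_counter() - start

# Data-efficient approach: process once, then every stage reuses the result.
start = time.perf_counter()
shared = process(DATA)
for _ in STAGES:
    pass  # each stage consumes `shared` instead of recomputing it
efficient = time.perf_counter() - start

print(f"nine passes: {legacy:.3f}s, one pass: {efficient:.3f}s")
```

The exact timings will vary by machine, but the shape of the result does not: the CPU cost scales with the number of products that insist on touching the data themselves.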
Consider the flu. One approach to feeling better is to take different medications for each of your symptoms: body aches, cough, congestion, sore throat, headache, etc. In the end, the problem has not actually been addressed. The patient has taken nine or more different medications, but still has the flu.
Instead of treating the "symptoms" within the data center with many different bolt-on technologies, we must treat the flu -- in this case, the data problem. Many data centers use up to a dozen different products, and each of them processes data separately. Not every organization runs a dozen products, but within a legacy data center any given piece of data is still likely processed multiple times over its lifecycle. Reducing that repetition can provide significant cost savings.
The most important thing for IT professionals to consider when weighing the pros and cons of an existing data center is the entire picture. It is no secret that redefining simplicity matters, but data efficiency is an equally vital component of a well-designed infrastructure.