Data Storage Can Become Green

There are many steps you can take to reduce the cost and environmental impact of your growing data stores

April 22, 2009


Part eight in a series. Greg Schulz is the founder of StorageIO and the author of The Green and Virtual Data Center.

After cooling for all IT equipment and server energy usage, external data storage has the next largest impact on power, cooling, floor space, and environmental (PCFE) considerations in most data centers. Storage is already one of the largest users of electrical power and floor space, with corresponding environmental impact, and the amount of data being stored and the size of the data footprint continue to expand. To say that there is no such thing as a data recession, other than from a pricing or revenue pressure perspective, would be an understatement.

As has been reported and discussed on the Byte and Switch message board and in other venues, storage spending may be down year-over-year in conjunction with economic and other budget pressures. However, storage capacity and I/O performance demands continue to grow, resulting in an expanding data footprint impact. Though more data can be stored in the same or smaller physical footprint than in the past, thus requiring less power and cooling, data growth rates necessary to sustain business growth, enhanced IT service delivery, and new applications are placing continued demands on available PCFE resources.

There are many approaches to addressing PCFE issues associated with storage, from using faster, more energy efficient storage that performs more work with less energy, to powering down storage that is supporting inactive data, such as backup or archive data, when it is not in use. While adaptive and intelligent power management techniques are increasingly being found in servers and workstations, power management for storage has lagged behind.

General steps to doing more with your storage-related resources without impeding application service availability, capacity, or performance include:

  • Assess and gain insight as to what you have and how it is being used.

  • Develop a strategy and plan (near-term and long-term) for deployment.

  • Use energy-effective data storage solutions (both hardware and software).

  • Optimize data and storage management functions.

  • Shift usage habits to allocate and use storage more effectively.

  • Reduce your data footprint and the subsequent impact on data protection.

  • Balance performance, availability, capacity, and energy consumption.

  • Change buying habits to focus on effectiveness.

  • Measure, reassess, adjust, and repeat the process.

You can improve storage PCFE efficiency in a number of ways: spin down and power off HDDs when not in use; reduce power consumption by putting HDDs into a slower mode; do more work and store more data with less power; use Flash- and RAM-based solid-state disks (SSDs); consolidate to larger-capacity storage devices and storage systems; use RAID levels and tiered storage to maximize resource usage; leverage management tools and software to balance resource usage; use space-saving snapshots, replication, and thin provisioning; and reduce your data footprint with archiving, compression, and de-duplication (de-dupe).

Yet another approach is simply to remove or mask the problems. For example, one way to address increased energy or cooling costs, emissions or carbon taxes where applicable, and higher facilities costs where floor space is constrained is to outsource to a cloud or managed service provider, or to use co-location or hosting facilities. As is often the case, your specific solution may include different elements and other approaches in various combinations, depending on your business size and environment complexity.

Another approach to address PCFE and storage cost issues is to deploy a comprehensive data footprint reduction strategy that combines various techniques and technologies to address point needs as well as the overall environment, including online data, near-line data for backup, and offline data for archive. Data footprint reduction and space-saving optimization technologies include archiving and data management, on-line and off-line compression, and de-dupe, among others.
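
To make the effect of combining techniques concrete, here is a minimal Python sketch that estimates how much space a data set needs after de-dupe and compression. It assumes fixed-size 4 KB chunks, SHA-256 fingerprints, and zlib compression. Those choices, the function name `reduced_footprint`, and the sample data are illustrative assumptions rather than a description of how any particular product works, and the estimate ignores the index and metadata overhead a real system would carry.

```python
import hashlib
import zlib


def reduced_footprint(data: bytes, chunk_size: int = 4096) -> dict:
    """Estimate bytes stored after fixed-size chunk de-dupe plus compression."""
    unique_chunks = {}
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        # Keep each distinct chunk once, in compressed form.
        if digest not in unique_chunks:
            unique_chunks[digest] = zlib.compress(chunk)
    stored = sum(len(c) for c in unique_chunks.values())
    return {
        "raw_bytes": len(data),
        "unique_chunks": len(unique_chunks),
        "stored_bytes": stored,
        "reduction_ratio": len(data) / stored if stored else 0.0,
    }


# Hypothetical sample: one 4 KB block repeated ten times, plus one distinct block.
sample = (b"A" * 4096) * 10 + bytes(range(256)) * 16
report = reduced_footprint(sample)
print(report)
```

Highly redundant data such as repeated full backups tends to show the largest reduction. Data that is already compressed or encrypted may show little or none.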

Avoiding energy use can be part of an approach to address PCFE challenges, particularly for servers, storage, and networks that do not need to be used or accessible at all times. However, not all applications, data, or workloads can be consolidated or powered down, for reasons of performance, availability, capacity, security, compatibility, politics, and finance, among many others. For those applications, servers, storage, and I/O networks, the trick is to support them in a more efficient and effective manner. Simply put, when work needs to be done, information needs to be stored or retrieved, or data needs to be moved, it should be done in the most energy-efficient manner aligned to a given level of service.

Technology alignment -- aligning the applicable type of storage or server resource and devices to the task at hand to meet application service requirements -- is essential to achieving an optimized and efficient IT environment. For example, high-performance SSD (Flash or RAM) Tier-0 storage is applicable for very I/O-intensive active data, while tier-1 systems based on fast 15K RPM SAS and Fibre Channel drives suit high-I/O active data.

Active and on-line data is where energy efficiency in the form of fast storage comes into play: RAM SSD or Flash SSD (for reads; writes are another story) and, in particular, fast, energy-efficient 15K or 10K RPM FC and SAS disks and their associated storage systems. The focus for active data and storage systems should be on more useful work per unit of energy consumed in a given footprint: for example, more IOPS per watt, more transactions per watt, more bandwidth or video streams per watt, or more files or emails processed per watt.
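
As a rough illustration of work per unit of energy, the following Python sketch computes IOPS per watt and bandwidth per watt and ranks devices by the former. The `StorageDevice` class, the device names, and every figure are placeholders invented for the example, not measurements of real products.

```python
from dataclasses import dataclass


@dataclass
class StorageDevice:
    name: str
    iops: float            # sustained I/O operations per second under load
    bandwidth_mbps: float  # sustained throughput in MB/s under load
    active_watts: float    # power draw while servicing that load

    def iops_per_watt(self) -> float:
        return self.iops / self.active_watts

    def bandwidth_per_watt(self) -> float:
        return self.bandwidth_mbps / self.active_watts


# Placeholder figures for illustration only (assumed, not measured).
devices = [
    StorageDevice("device_a", iops=20000, bandwidth_mbps=200, active_watts=10),
    StorageDevice("device_b", iops=300, bandwidth_mbps=120, active_watts=15),
]

# Rank by useful work per watt for an I/O-intensive workload.
ranked = sorted(devices, key=lambda d: d.iops_per_watt(), reverse=True)
for d in ranked:
    print(d.name, d.iops_per_watt(), d.bandwidth_per_watt())
```

The same pattern applies to transactions, video streams, or files processed per watt: divide the measured work rate by the power drawn while doing that work.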

For low-performance, low-activity applications where the focus is on storing as much data as possible at the lowest cost, including disk-to-disk backup, slower high-capacity SATA-based storage systems are the fit. For long-term bulk storage to meet archiving, data retention, or other retention needs, as well as storing large weekly or monthly full backups, tape is the ticket, with the best combination of performance, availability, capacity, and energy efficiency per footprint.

Using more energy-efficient solutions that are capable of doing more work per unit of energy consumed is similar to improving the energy efficiency of an automobile. Leveraging virtualization techniques and technologies provides management transparency and abstraction across different tiers, categories, and types of storage to meet various application service requirements for active and inactive or idle data. Keep performance, availability, capacity, and energy (PACE) in balance to meet application service requirements, and avoid introducing performance bottlenecks in your quest to reduce use of, or get more from, your existing IT resources, including power and cooling.
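
The technology alignment described above, matching each class of data to a storage tier, can be sketched as a simple lookup. The category keys and the `select_tier` function are hypothetical names chosen for this example. A real placement policy would also weigh availability, capacity, and cost.

```python
# Hypothetical data categories mapped to the tiers discussed in this article.
TIERS = {
    "very_io_intensive": "Tier-0: SSD (Flash or RAM)",
    "high_io_active": "Tier-1: fast SAS or Fibre Channel disk",
    "low_activity": "High-capacity SATA disk",
    "archive": "Tape",
}


def select_tier(category: str) -> str:
    """Return the storage tier suggested for a data category."""
    if category not in TIERS:
        raise ValueError(f"unknown data category: {category!r}")
    return TIERS[category]


for category in TIERS:
    print(category, "->", select_tier(category))
```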

While aggregation and pooling are growing in popularity in terms of deployment, most current storage virtualization solutions are forms of abstraction. Abstraction and technology transparency for enabling business agility include device emulation, interoperability, coexistence, backward compatibility, transition to new technology with transparent data movement and migration, as well as support for high availability and BC/DR. Some other forms of virtualization in the form of abstraction and transparency include heterogeneous data replication or mirroring (local and remote), snapshots, backup, data archiving, security, and compliance.

Virtual tape libraries (VTLs) provide abstraction of underlying physical disk drives while emulating tape drives, tape-handling robotics, and tape cartridges. The benefit is that VTLs remain compatible with existing backup, archive, or data protection software and procedures while improving performance through disk-based technologies. VTLs are available in standalone as well as clustered configurations for availability and failover, as well as scaling for performance and capacity. Interfaces include block-based for tape emulation and NAS for file system-based backups. VTLs also support functions such as compression, de-duplication, encryption, replication, and tiered storage.

Action and takeaway points include the following:

  • Develop a data footprint reduction strategy for online and offline data.

  • Energy avoidance can be accomplished by powering down storage.

  • Energy efficiency can be accomplished by using tiered storage to meet different needs.

  • Measure and compare storage based on idle and active workload conditions.

  • Storage efficiency metrics include IOPS or bandwidth per watt for active data.

  • Storage capacity per watt per footprint and cost is a measure for inactive data.

  • Align the applicable form of virtualization for the task at hand.
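
For inactive data, the capacity-oriented measure in the list above could be computed along these lines. This Python sketch treats "footprint" as rack units and reports capacity per watt, per watt per rack unit, and per $1,000 of cost as separate values. That interpretation, the function name `capacity_efficiency`, and the sample figures are assumptions made for illustration.

```python
def capacity_efficiency(capacity_tb: float, idle_watts: float,
                        rack_units: float, cost_dollars: float) -> dict:
    """Capacity-oriented measures for storage holding inactive data."""
    tb_per_watt = capacity_tb / idle_watts
    return {
        "tb_per_watt": tb_per_watt,
        "tb_per_watt_per_rack_unit": tb_per_watt / rack_units,
        "tb_per_1000_dollars": capacity_tb / (cost_dollars / 1000),
    }


# Placeholder figures (assumed): 48 TB, 240 W idle, 4 rack units, $24,000.
result = capacity_efficiency(48, 240, 4, 24000)
print(result)
```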

    Greg Schulz is the founder of StorageIO, an IT industry research and consulting firm. He has worked as a programmer, systems administrator, disaster recovery consultant, and capacity planner for various IT organizations, and has also held positions with industry vendors. He is the author of the new book "The Green and Virtual Data Center" (CRC) and of the SNIA-endorsed book "Resilient Storage Networks" (Elsevier).

