Power consumption is a major problem for enterprise data centers, impacting both server density and total cost of ownership, and it is driving changes in data center configuration and management. Some components already support power management: server CPUs, for example, can use low-power states and dynamic clock and voltage scaling to reduce consumption significantly during idle periods. Enterprise storage subsystems have no such advanced power management and consume a significant fraction of data center power. An enterprise-grade disk such as the Seagate Cheetah 15K.4 consumes 12W even when idle, whereas a dual-core Intel Xeon processor consumes 24W when idle. Thus, an idle machine with one dual-core processor and two disks already spends as much power on disks as on processors. For comparison, the core servers in our building's data center have more than 13 disks per machine on average.
Simply buying fewer disks is usually not an option, since this would reduce peak performance and/or capacity. The alternative is to spin down disks when they are not in use. The traditional view is that idle periods in server workloads are too short for this to be effective. However, our analysis of real server workloads shows that there is in fact substantial idle time at the storage volume level. We would also expect, and previous work has confirmed, that main-memory caches are effective at absorbing reads but not writes. At the storage level, then, we would expect to see periods where all the traffic is write traffic. Our analysis confirms this: the request stream is write-dominated for a substantial fraction of the time.
This analysis motivated a technique we call write off-loading, which allows blocks written to one volume to be redirected to other storage elsewhere in the data center. During write-dominated periods, the volume's disks are spun down and writes are redirected, causing some of the volume's blocks to be off-loaded. Blocks are off-loaded temporarily, for a few minutes up to a few hours, and are reclaimed lazily in the background after the home volume's disks are spun up.
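The write path and lazy reclaim described above can be sketched as follows. This is an illustrative model only, assuming a simple key-value view of block storage; the class and method names are hypothetical and not from our implementation.

```python
# Hypothetical sketch of write off-loading: while the home disks are
# spun down, writes go to remote persistent storage; a background
# reclaim copies them home after spin-up. Names are illustrative.

class OffloadingVolume:
    def __init__(self):
        self.home = {}          # block number -> data on home disks
        self.remote_log = {}    # stand-in for storage elsewhere
        self.spun_down = False

    def write(self, block, data):
        if self.spun_down:
            # Disks are sleeping: redirect the write elsewhere and
            # record that this block's latest version is off-loaded.
            self.remote_log[block] = data
        else:
            self.home[block] = data

    def reclaim(self):
        # Lazy background reclaim once the home disks spin up:
        # copy off-loaded blocks back, then drop the remote copies.
        self.spun_down = False
        for block, data in list(self.remote_log.items()):
            self.home[block] = data
            del self.remote_log[block]


v = OffloadingVolume()
v.write(1, b"a")          # disks active: normal write to home volume
v.spun_down = True
v.write(2, b"b")          # disks sleeping: write is off-loaded
v.reclaim()
print(sorted(v.home))     # → [1, 2]
```

After `reclaim`, the remote copies are gone and the home volume again holds the latest version of every block.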
Write off-loading modifies the per-volume access patterns, creating idle periods during which all of a volume's disks can be spun down. For our traces this makes volumes idle for 79% of the time on average. The cost is that a read of a non-off-loaded block while the disks are spun down incurs a significant latency while they spin up. However, our results show that this happens rarely.
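The corresponding read path can be sketched in the same style: reads of off-loaded blocks are served from the remote copy without waking the disks, while reads of any other block force a spin-up, which is the latency cost just described. Again, this is a hypothetical model; the names are not from our implementation.

```python
# Hypothetical read path under write off-loading. An off-loaded block
# is served from the remote log; a non-off-loaded block forces the
# home disks to spin up (modeled here by a counter, not a real delay).

class ReadPath:
    def __init__(self, home, remote_log):
        self.home = home            # block number -> data on home disks
        self.remote_log = remote_log
        self.spun_down = True
        self.spin_ups = 0           # counts the rare expensive reads

    def read(self, block):
        if block in self.remote_log:
            return self.remote_log[block]   # served without a spin-up
        if self.spun_down:
            self.spun_down = False          # pay the spin-up latency
            self.spin_ups += 1
        return self.home[block]


rp = ReadPath(home={7: b"x"}, remote_log={9: b"y"})
rp.read(9)    # off-loaded block: no spin-up
rp.read(7)    # non-off-loaded block: disks must spin up
```

The key point is the asymmetry: only the second kind of read pays the spin-up cost, and our traces show such reads are rare during write-dominated periods.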
Write off-loading is implemented at the block level and is transparent to file systems and applications running on the servers. Blocks can be off-loaded from any volume to any available persistent storage in the data center, either on the same machine or on a different one. The storage could be based on disks, NVRAM, or solid-state memory such as flash. Off-loading uses spare capacity and bandwidth on existing volumes and thus does not require provisioning of additional storage. Write off-loading is also applicable to a variety of storage architectures. Our trace analysis and evaluation are based on a Direct Attached Storage (DAS) model, where each server is attached directly to a set of disks, typically configured as one or more RAID arrays. DAS is typical for small data centers such as those serving a single office building. Write off-loading can also be applied to network attached storage (NAS) and storage area networks (SANs).