Green Storage: MAID To Do More Than Just Spin Down

The fundamental Green problem of all data centers is that they cost a fortune to power, and that power in turn produces heat which costs a further fortune to cool. ‘Green Storage’ was set up as a bastion against this vicious circle, and has mostly taken shape in the form of virtualized storage, data deduplication, compression, thin provisioning, DC power and SSDs. Add power conservation techniques such as server virtualization and archiving onto nearline storage, and most companies can claim they are successfully moving towards being ‘Green’. Bring forth the proposition of MAID storage, then, and many users would not see the need for it. Further reluctance stems from the fact that MAID has somewhat tragically become synonymous with being merely a disk spin-down technique, despite having the potential to be far more and to bring greater cost savings with it.


First coined around 2002, MAID (massive array of idle disks) held the promise of utilizing only 25% or less of its hard drives at any one time. As Green technology became fashionable, major storage vendors such as EMC, HDS, Fujitsu and NEC began incorporating MAID technology, promising that drives could power down when they weren't in use, thus extending the life of cheap SATA drives and in turn reducing the cost of running the data center. But as just one of many features that the larger vendors were trumpeting, MAID failed to advance beyond its disk spin-down tag, leaving data center and storage managers oblivious to its full potential. Furthermore, MAID came to be regarded as a solution suited only to backup and archive applications, as users were wary of the longer access times that resulted when disks had to spin up after being idle.


Fast forward to 2010: with government regulations demanding more power savings, and on the back of a year which saw data grow while budgets shrank, users are suddenly looking for further ways to maximize the efficiency of their data storage systems. In a world where persistent data keeps increasing, a hero in the vein of MAID may just be ideal.


To be frank, the detractors did have a point about the original MAID 1.0 concept, which in essence was simply to stop spinning disk drives that were not being accessed. Also, because a MAID LUN had to use an entire RAID group, a user without a large amount of data was left with an awful lot of wasted storage. Add in the scenario where data placed on MAID suddenly requires more user access, and hence constant disk spin, and the overall cost savings became minuscule. Those that did go for MAID therefore ended up using the technology where access requirements and data retrieval were not paramount, i.e. backup and archiving.


What often gets overlooked in retrospect is that even with tier 2 and tier 3 storage, only a fraction of the data is frequently accessed, which leaves MAID a suitable home for the less active data sets. The real crux of the matter is the potential access time overhead incurred as disks have to be started up, which is a given when only one spin-down level is available.


Now with updated ‘MAID 2.0’ technologies such as AutoMAID from Nexsan, varying levels of disk-drive ‘spin down’ are available, which use LUN access history to adjust the MAID level accordingly. Level 0 is hard drive full-spin mode, with full power consumption and the shortest data access time, while Level 1 unloads the disk read/write heads, using 15%-20% less power than Level 0 at the cost of only a fraction of a second more in access time. Level 2 not only unloads the disk heads but also slows the platters by 30-50% from full speed, giving an access time in the region of 15 seconds on the initial I/O before the drive is brought back up to full speed. Similar to MAID 1.0, Level 3 allows the disk platters to stop spinning, bringing power consumption down by 60%-70% with an access time of 30-45 seconds on the initial I/O. In a nutshell, these various levels of MAID open the door for the technology to be a viable option for tier 2, 3 and 4 storage data without the apprehension of delayed access times.
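To make the idea of access-history-driven levels concrete, here is a minimal sketch in Python. It is not Nexsan's implementation: the idle-time thresholds, the names MaidLevel, IDLE_THRESHOLDS_MIN and choose_level are all hypothetical, and only the per-level figures quoted above are taken from the text. No power figure is given above for Level 2, so none is assumed here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MaidLevel:
    level: int
    description: str
    # Power saving range relative to Level 0, in percent.
    # None means the article quotes no figure for this level.
    power_saving_pct: Optional[Tuple[int, int]]
    first_io_delay: str


# Figures below are the ones quoted in the article.
LEVELS = {
    0: MaidLevel(0, "full spin", (0, 0), "shortest"),
    1: MaidLevel(1, "heads unloaded", (15, 20), "a fraction of a second"),
    2: MaidLevel(2, "heads unloaded, platters slowed 30-50%", None, "around 15 seconds"),
    3: MaidLevel(3, "platters stopped", (60, 70), "30-45 seconds"),
}

# ASSUMPTION: idle thresholds in minutes, deepest level first.
# These values are invented for illustration only.
IDLE_THRESHOLDS_MIN = [(3, 120), (2, 30), (1, 10)]


def choose_level(idle_minutes: float) -> MaidLevel:
    """Pick the deepest level whose idle threshold the LUN has reached."""
    for level, threshold in IDLE_THRESHOLDS_MIN:
        if idle_minutes >= threshold:
            return LEVELS[level]
    return LEVELS[0]


if __name__ == "__main__":
    for idle in (0, 15, 45, 600):
        chosen = choose_level(idle)
        print(f"idle {idle:>3} min: Level {chosen.level} ({chosen.description})")
```

A real array would presumably weigh more than a single idle timer, for instance how often a LUN has been woken recently, but the stepped structure is the point: the longer a LUN sits untouched, the deeper the saving and the longer the first I/O takes.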


Some companies have gone even further with the technology by adding the ability to dedupe and replicate data in their libraries. Users thus have the option to isolate some drives from the MAID pool and dedicate them to cache, leaving the cache drives to spin continuously while increasing the payback of deduplication. The possibilities for remote replication as well as policy-based tiering and migration are obvious. An organization with a sound knowledge of its applications could make significant savings by moving data off expensive tier 1 disks to a MAID technology that incorporates both deduplication and replication capabilities, with minimal, if any, performance loss.


Moreover, where data becomes inactive during the night (user directories, CRM databases etc.), disks can easily be spun down when users leave the office, saving on unnecessary spin cost and energy for numerous hours each evening. By also using an automated process for long periods of inactivity such as holiday periods, users would quickly increase energy savings as well as decrease management costs.
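As a rough back-of-the-envelope check on what overnight spin-down might be worth, the sketch below multiplies out drive count, per-drive power draw, idle hours and a saving fraction. The function name and every input value (100 drives, 8 W per drive, 12 idle hours a night) are assumptions chosen for illustration. Only the 60%-70% saving range comes from the figures quoted for a full platter stop, and the energy needed to spin drives back up is ignored.

```python
def annual_savings_kwh(drives: int,
                       watts_per_drive: float,
                       idle_hours_per_day: float,
                       saving_fraction: float,
                       days: int = 365) -> float:
    """Energy saved per year, in kWh, by spinning drives down while idle."""
    watt_hours = drives * watts_per_drive * idle_hours_per_day * days * saving_fraction
    return watt_hours / 1000.0


if __name__ == "__main__":
    # ASSUMPTION: hypothetical array of 100 drives drawing 8 W each,
    # idle for 12 hours every night of the year.
    low = annual_savings_kwh(100, 8.0, 12.0, 0.60)
    high = annual_savings_kwh(100, 8.0, 12.0, 0.70)
    print(f"Estimated saving: {low:.0f} to {high:.0f} kWh per year")
```

The same arithmetic applies to cooling only indirectly, since less power drawn means less heat to remove, so the figure is best read as a floor on the drive side of the bill rather than a full data center saving.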


No doubt, in the current mainstream MAID is still best suited to persistent data that's not static, and it depends largely upon accurate data classification practices. But once MAID 2.0 and its features of variable drive spin-down, deduplication and replication begin to get the attention they deserve, we may well see a ‘Green’ solution which really does bring significant cost and energy savings. With the real ‘Green’ concern of most IT Directors being that of the paper kind with Benjamin Franklin’s face on it, that attention may just arrive sooner than we think.