Data center virtualisation has prompted a wave of virtualisation in storage, and the rise of cloud data centers has spawned new cloud storage models.
According to IDC, cloud infrastructure spending grew 23% in the third quarter of 2015, with public-cloud storage spending growing 26.7% year-on-year.
The landscape of the data center is shifting again as virtualised environments morph into cloud environments. Cloud environments embrace the virtual disk model pioneered in virtualisation and add further models, with the aim of virtualising the entire storage stack. Doing so lets them provide self-service and a clean separation between infrastructure and application.
Cloud environments come in many forms. They can be implemented by enterprises as private clouds using environments like OpenStack, CloudStack, and the VMware vRealize suite. They can also be implemented by service providers as public clouds such as Amazon Web Services, Microsoft Azure, and Rackspace.
Interestingly, the storage models used in cloud environments mirror those seen in physical environments. However, as with virtual disks, these are storage models abstracted away from the multiple storage protocols that can be used to implement them.
1. Instance storage: Virtual disks in the cloud
The virtual disk storage model is the primary (or only) model for storage in conventional virtualised environments. In cloud environments, however, this model is one of three.
Instance storage is a storage model and can be implemented in multiple ways. For example, instance storage is sometimes implemented using direct-attached storage (DAS) on the compute nodes themselves. Implemented this way, it is often called ephemeral storage because the storage is usually not highly reliable.
Instance storage can also be implemented as reliable storage using network-attached storage (NAS) or volume storage, the storage model described next.
2. Volume storage: SAN sans the physical
Instance storage, however, has its limitations. This has led to the development of another type of storage: volume storage, a hybrid of instance storage and a storage area network (SAN). The primary unit of volume storage is the volume rather than the virtual machine (VM). In contrast to instance storage, volume storage is usually assumed to be highly reliable and is often used for user data.
OpenStack's Cinder is an example of a volume store, as is Docker's independent volume abstraction.
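The difference between the two models can be sketched as a toy lifecycle model. The sketch below is illustrative only: the `Instance` and `Volume` classes are hypothetical and do not correspond to the Cinder or Docker APIs, and it assumes, as is common in cloud environments, that instance storage is discarded when its VM is terminated while a volume survives and can be attached to another VM.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Volume:
    """Hypothetical volume: a named block device with its own lifecycle."""
    name: str
    blocks: Dict[int, bytes] = field(default_factory=dict)
    attached_to: Optional[str] = None


class Instance:
    """Hypothetical VM whose instance storage lives and dies with it."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Instance storage is created together with the VM.
        self.instance_disk: Optional[Dict[int, bytes]] = {}
        self.volumes: Dict[str, Volume] = {}

    def attach(self, volume: Volume) -> None:
        if volume.attached_to is not None:
            raise RuntimeError(
                f"{volume.name} is already attached to {volume.attached_to}"
            )
        volume.attached_to = self.name
        self.volumes[volume.name] = volume

    def terminate(self) -> None:
        # Assumption: instance storage is discarded with the VM.
        self.instance_disk = None
        # Volumes are only detached; their contents are untouched.
        for volume in self.volumes.values():
            volume.attached_to = None
        self.volumes.clear()


data = Volume("user-data")
vm1 = Instance("vm1")
vm1.attach(data)
vm1.instance_disk[0] = b"scratch"      # lost when vm1 goes away
data.blocks[0] = b"customer records"   # kept on the volume
vm1.terminate()

vm2 = Instance("vm2")
vm2.attach(data)                       # the same volume, a different VM
```

After `vm1.terminate()`, the scratch data is gone, but `vm2` sees the customer records on the reattached volume. This is the sense in which the volume, not the VM, is the unit of management.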
3. Object storage: Web-scale NAS
Cloud-native applications also need a home for data shared between VMs, but they often need namespaces that can scale to multiple data centers across geographic regions. Object storage provides exactly this kind of storage.
Object storage provides a file-like abstraction called an object, but it provides eventual consistency. This means that while all clients will eventually get the same answers to their requests, they may temporarily receive different answers. This consistency is similar to the consistency provided by Dropbox between two computers; clients may temporarily drift out of sync, but eventually everything will converge.
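Eventual consistency can be illustrated with a small simulation. This is a minimal sketch, not how any particular object store is implemented: the `Replica` and `EventuallyConsistentStore` classes are hypothetical, a write is assumed to be acknowledged after reaching a single replica, and a single global version counter stands in for the timestamps or version vectors a real system would need.

```python
from typing import Dict, List, Optional, Tuple

Versioned = Tuple[int, bytes]  # (version number, object contents)


class Replica:
    """Hypothetical replica holding its own copy of each object."""

    def __init__(self) -> None:
        self.objects: Dict[str, Versioned] = {}

    def get(self, key: str) -> Optional[bytes]:
        entry = self.objects.get(key)
        return entry[1] if entry else None


class EventuallyConsistentStore:
    def __init__(self, replica_count: int) -> None:
        self.replicas: List[Replica] = [Replica() for _ in range(replica_count)]
        self.version = 0  # simplification: one global counter

    def put(self, key: str, value: bytes, replica: int = 0) -> None:
        # The write lands on one replica only; others learn of it later.
        self.version += 1
        self.replicas[replica].objects[key] = (self.version, value)

    def sync(self) -> None:
        # Background replication: every replica adopts the newest
        # version of each key seen on any replica.
        newest: Dict[str, Versioned] = {}
        for r in self.replicas:
            for key, entry in r.objects.items():
                if key not in newest or entry[0] > newest[key][0]:
                    newest[key] = entry
        for r in self.replicas:
            r.objects.update(newest)


store = EventuallyConsistentStore(replica_count=2)
store.put("logo.png", b"v1")
store.sync()
store.put("logo.png", b"v2", replica=0)

before = [r.get("logo.png") for r in store.replicas]  # replicas disagree
store.sync()
after = [r.get("logo.png") for r in store.replicas]   # replicas converge
```

Between the second `put` and the following `sync`, a client reading from the second replica still sees the old contents; once replication runs, both replicas return the new contents.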
This model allows object storage to provide extremely large namespaces across large distances with low cost and good aggregate performance. Many applications designed for cloud environments are written to use object storage in place of NAS, because of its advantageous scale and cost. For example, cloud-native applications will often use object storage to store images, static Web content, backup data, analytic data sets, and customer files.
Many vendors provide object storage implementations, such as OpenStack's Swift, Amazon's S3, Red Hat's Ceph, and Cleversafe. All of these products speak S3, Swift, or both, often in addition to other APIs. Some existing file system vendors, such as EMC with Isilon, provide object interfaces alongside their file interfaces.
All together now
Instance, volume, and object storage together provide a flexible paradigm for the cloud. While not every installation will use all three types of storage, no one type can address all of the necessary use cases on its own.
These new storage models reflect the continued innovation in the modern, virtualised data center. These models also provide new freedom to both data center and application administrators. Application administrators are no longer bound to data center infrastructure decisions; they are free to manage their data in the models that make sense to their applications. In turn, data center administrators can manage storage that is important to their users, while still utilising the best storage implementations suited to their data center demands.
Brandon Salmon has been working in systems and storage for over twelve years. He has a PhD in computer engineering from Carnegie Mellon University and a bachelor's degree in computer science from Stanford. He previously worked at Microsoft, VMware and Intel. As the sixth engineer at Tintri, he designed and built significant portions of both the core Tintri filesystem and integrations with private cloud environments. He now works in the Office of the CTO investigating market changes and new technologies.