If storage were invented today, here’s what it would look like
Wed, 7th Dec 2016

Let us picture the ideal storage platform if the technology were invented today, without limitations, pre-conceived ideas or restrictions.

Absolute reliability: Data represents every company's crown jewels, and its availability is critical to every organisation. Data should be decoupled from storage and free to move across all storage platforms to deliver the performance, capacity and protection essential for business.

In an ideal world, there will be multiple instances of live data in multiple locations. At some stage hardware will fail, so resilience should be built into the software platform, rather than the hardware.

Performance: The needs of an application should dictate where data resides. Data should be totally independent of storage hardware and free to move ubiquitously across any and all of an organisation's storage products, depending upon the application and users' needs.

These performance objectives should be based on either the IOPS or the latency the application requires. Data that is hot, meaning actively being written to or read from, should reside in memory (DRAM). As it cools to warm, it should be moved automatically and in real time to solid-state disks (SSDs); as it cools further, it should move to slower SSDs and finally to slower spinning disks or the cloud.

Eventually all data becomes cold, meaning it is unlikely to be accessed again. That type of data is ideally suited to high-capacity, low-cost 6TB, 8TB or 10TB drives (around A$700 each), or can even be stored in AWS, Azure or Google cloud on a low-cost monthly subscription.

This is known as real-time data tiering. Wherever the data resides at any point in time, it must always be searchable and recoverable, and must be moved back to the hot tier when it is needed.
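
As a rough sketch of how that tiering decision might look in code – the tier names, age thresholds and the idea of keying purely on last-access age are illustrative assumptions, not taken from any particular product:

```python
from datetime import timedelta

def select_tier(last_access_age: timedelta) -> str:
    """Map how recently data was touched to a storage tier.

    Thresholds are illustrative only; a real platform would derive them
    from the IOPS and latency objectives set for each application.
    """
    if last_access_age < timedelta(minutes=10):
        return "dram"           # hot: actively being read or written
    if last_access_age < timedelta(hours=6):
        return "fast_ssd"       # recently hot, now warm
    if last_access_age < timedelta(days=7):
        return "slow_ssd"
    if last_access_age < timedelta(days=90):
        return "spinning_disk"
    return "cloud"              # cold: unlikely to be accessed again
```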

How much high performance storage? In most businesses, typically between 2 – 10 per cent of data is hot, and this should reside in memory (RAM or DRAM) or on SSD, ideally on the bus so that latency is minimised. On average, 10 – 20 per cent of data is warm and should reside on SSD or fast spinning disk, while the remaining 70 – 88 per cent of data is cold and can reside on the cheapest disks or in the cloud.
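
As a back-of-envelope worked example of that split – the 100TB estate and the exact percentages are assumptions chosen only to show the arithmetic:

```python
# Illustrative split for a hypothetical 100TB data estate, using rough
# mid-points of the ranges quoted above.
total_tb = 100
hot_pct, warm_pct = 6, 15                # assumed mid-points of hot and warm
hot_tb = total_tb * hot_pct // 100       # -> 6TB in RAM/DRAM or SSD on the bus
warm_tb = total_tb * warm_pct // 100     # -> 15TB on SSD or fast spinning disk
cold_tb = total_tb - hot_tb - warm_tb    # -> 79TB on cheap disk or in the cloud

print(f"hot: {hot_tb}TB, warm: {warm_tb}TB, cold: {cold_tb}TB")
# hot: 6TB, warm: 15TB, cold: 79TB
```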

Remember, even though data is sitting on the cheapest disks, a company will have multiple live instances in multiple locations, providing incredible redundancy. For a critical application, there might be three live instances locally, two in a remote data centre, and perhaps one live instance in each of AWS, Azure and Google.

For all copies to be lost, storage in every location would need to fail at exactly the same time – and in that case, your data would probably be irrelevant, as a major catastrophe would have occurred. For non-critical applications, a good mix is two instances locally and one in a remote data centre or in AWS, Azure or Google Cloud.
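
A minimal sketch of how such a placement policy might be expressed – the policy names, location keys and counts below are hypothetical, mirroring the examples above rather than any real product's configuration:

```python
# Hypothetical placement objectives: how many live instances of each
# application's data to keep in each location.
REPLICATION_POLICIES = {
    "critical": {
        "local": 3,
        "remote_datacentre": 2,
        "aws": 1,
        "azure": 1,
        "google": 1,
    },
    "non_critical": {
        "local": 2,
        "remote_or_cloud": 1,   # one copy in a remote DC or a public cloud
    },
}

def total_live_instances(policy: str) -> int:
    """Total number of live instances a policy keeps in play."""
    return sum(REPLICATION_POLICIES[policy].values())

print(total_live_instances("critical"))      # 8
print(total_live_instances("non_critical"))  # 3
```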

Deploy and manage with inbuilt self-healing: The ideal storage platform would be deployed in minutes and require minimal technical expertise to manage. All operations would be fully automated, with little ongoing user involvement required. As data is no longer tied to the storage hardware, all data movement is fully automated using the latest artificial intelligence.

If a problem develops – for example, a unit of storage fails – the platform automatically self-heals by creating new live instances of the data that resided on the failed device at that point in time.

The new live instance is created on available storage that matches the objectives for that data at that point in time (whether the data is hot, warm or cold). All storage is managed simply by setting the objectives (performance, protection and capacity) for each application. Simple. Easy. Powerful. Intelligent.
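
A simplified sketch of that self-healing step, assuming a toy data model of devices, placements and objectives – all names and structures here are hypothetical, and a real platform would also copy the data from a surviving instance and verify it:

```python
def self_heal(failed_id, devices, placements, objectives):
    """Recreate live instances lost when a storage device fails.

    Hypothetical model: `devices` maps device id -> {"tier", "free_tb"},
    `placements` maps dataset -> list of device ids holding a live instance,
    and `objectives` maps dataset -> {"tier", "copies", "size_tb"}.
    """
    devices.pop(failed_id, None)                      # the failed device is gone
    for dataset, holders in placements.items():
        if failed_id not in holders:
            continue
        holders.remove(failed_id)                     # drop the lost instance
        goal = objectives[dataset]
        # Surviving devices on the right tier with room for a fresh copy.
        candidates = [d for d, info in devices.items()
                      if d not in holders
                      and info["tier"] == goal["tier"]
                      and info["free_tb"] >= goal["size_tb"]]
        while len(holders) < goal["copies"] and candidates:
            target = candidates.pop(0)
            devices[target]["free_tb"] -= goal["size_tb"]
            holders.append(target)                    # new live instance created here
```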

Accept all types, vendors, makes or models: In the new world, all existing and future storage should be usable by any data in the business, depending on the objectives set by the application. That means all RAM, DRAM, SSD, SAN, NAS, DAS, JBOD, hyper-converged, all-flash arrays and even cloud assets are seen as a single storage pool, fabric or platform, shared across the entire organisation.

No matter whether a business has Dell EMC, HPE, IBM, Lenovo, Tintri, Pure Storage, Nimble, QNAP, Seagate or Western Digital spinning disks, or Samsung, Intel, Diablo, NVMe or 3D XPoint SSDs – all storage devices are available to all data, depending on its characteristics and application requirements at any specific point in time.
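
A minimal sketch of what presenting such heterogeneous devices as one pool might look like – the field names and example devices are hypothetical:

```python
# Hypothetical single pool: devices from any vendor are registered with only
# the attributes the platform cares about, never their make or model.
storage_pool = []

def register_device(name: str, tier: str, capacity_tb: float, latency_ms: float):
    """Add any device - SAN, NAS, DAS, all-flash array or cloud bucket - to the pool."""
    storage_pool.append({"name": name, "tier": tier,
                         "capacity_tb": capacity_tb, "latency_ms": latency_ms})

register_device("dell-emc-array-01", "fast_ssd", 20, 0.5)
register_device("qnap-nas-02", "spinning_disk", 60, 8.0)
register_device("cloud-archive-bucket", "cloud", 500, 50.0)
```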

Mobility and flexibility: At different points in time, data requires different storage. Two excellent examples are a VDI boot storm and memory-intensive reporting. In the new world, at 07:30 all the previous day's images for the VDI boot storm are pre-loaded into memory, DRAM and the fastest SSDs – the result is faster logon times and increased user satisfaction.

As soon as the boot storm finishes – at 09:30, for example – all that high performance, high cost storage is freed up for other applications to use, such as reporting. This enables far more efficient utilisation of expensive, high performance storage assets and delivers significantly better results for the business. None of this should require any user involvement.
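
A sketch of how such a time-based hand-over might be described – the schedule structure, dataset names and times below are illustrative assumptions based on the example above:

```python
from datetime import datetime

# Hypothetical time-based objectives: pin VDI images to the fastest tier for
# the morning logon window, then hand that capacity to reporting workloads.
SCHEDULE = [
    {"window": ("07:30", "09:30"), "dataset": "vdi_gold_images"},
    {"window": ("09:30", "18:00"), "dataset": "reporting_working_set"},
]

def fast_tier_datasets(now: datetime) -> list:
    """Return the datasets that should occupy memory/fast SSD right now."""
    hhmm = now.strftime("%H:%M")
    return [rule["dataset"] for rule in SCHEDULE
            if rule["window"][0] <= hhmm < rule["window"][1]]

print(fast_tier_datasets(datetime(2016, 12, 7, 8, 0)))   # ['vdi_gold_images']
print(fast_tier_datasets(datetime(2016, 12, 7, 10, 0)))  # ['reporting_working_set']
```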

Cost: In the ideal new world, cost should be replaced by investment. Moving forward, there should be only two major investments in any future storage platform: the Objective-Defined Storage platform software licence, and the cost of the storage itself.

All new storage should be commodity: high-capacity, inexpensive disks or cloud for capacity, and RAM, DRAM or SSDs for performance. After all, data should move freely across all storage platforms to deliver the performance, capacity and protection essential for business. Storage hardware simply houses data as it moves through the network.

Intelligence, performance, reliability, scalability, mobility and management should all be controlled by the Objective-Defined Storage platform software, which drives data across the various storage hardware according to the objectives set at the application level.

As a rule of thumb, Objective-Defined Storage platform software should range in price from around $50 down to $20 per TB per month, with the per-TB investment scaling down as capacity increases.
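
As a worked example of that rule of thumb – the 100TB breakpoint and the simple two-band model are illustrative assumptions, not a quoted price list:

```python
# Rule-of-thumb licence pricing quoted above: roughly $50 per TB per month at
# small scale, tapering towards $20 per TB per month at large scale.
def monthly_licence_cost(capacity_tb: float) -> float:
    first_band = min(capacity_tb, 100) * 50     # first 100TB at $50/TB (assumed band)
    remainder = max(capacity_tb - 100, 0) * 20  # everything beyond at $20/TB
    return first_band + remainder

print(monthly_licence_cost(50.0))    # 2500.0  -> effective $50/TB
print(monthly_licence_cost(500.0))   # 13000.0 -> effective $26/TB
```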

Value: Think about how much value you would gain by deploying, on each host, 2TB high-performance Samsung 960 SSDs (approximately A$1,800), which deliver in the region of 3,500MB/s sequential reads and 2,100MB/s sequential writes, or 1TB Intel 600p SSDs (around A$503), which deliver up to 1,800MB/s sequential reads and 560MB/s sequential writes. How much would your performance increase?

Future-proof storage: How will storage infrastructure change once Samsung launches its 32TB SSD, Seagate releases its 60TB SSD, or Toshiba delivers its 100TB SSD – all due out in 2017? With Objective-Defined Storage, they are simply added to the fabric, absorbed, and shared and used across the entire infrastructure without changes. How much will 60TB or 100TB SSDs reduce electricity costs? Cooling costs? Rack space requirements?

With Objective-Defined Storage, users will be able to use all new storage from day one, sharing the capacity and performance across their entire infrastructure.

Finally let us consider what organisations would gain by moving to an Objective-Defined Storage platform:

  • No vendor lock-in – no longer having to buy storage from a single vendor at inflated prices
  • Remove storage silos that don't mix with storage from other vendors
  • Decrease complexity – no more storage that is complicated and difficult to deploy and manage
  • Eliminate expensive products – storage should simply be storage; high performance is more expensive than high capacity, so it should be used only where it is needed