Wednesday, December 21, 2016

On Disruption

A few months ago, there was an email thread at my employer asking whether All-Flash Storage was a “disruptive” technology. Disruptive, in the business sense, refers to Clayton Christensen’s definition of the term from his book, “The Innovator’s Dilemma”.

In this article from a year ago, Christensen reviews his concept:

What Is Disruptive Innovation?

However, I think this is a narrow, and perhaps obsolete, definition. He says Uber is not disruptive, because it did not originate in the low-end or new-market segments. But while Uber did not disrupt cars for hire as a service, it did disrupt the capital model of cars for hire, and it did disrupt the medallion licensing model. The article also describes how Netflix, in its original format (DVDs by mail), attacked an underserved periphery of the market, which was neither the low end nor a new segment.

If we use the pure Christensen definition, All-Flash Arrays (AFAs) are not disruptive, but HyperConverged Infrastructure meets the definition. But perhaps we should look more broadly at the definition.

“The Innovator’s Dilemma” is 20 years old. It was written during the Dot-Com boom. Business books are not canonical. If they were, there would never be revisions and follow-ons.

I think we need to take a wider view of disruptive technologies. Uber disrupted car for hire capital and licensing models. Driving an Uber is much less expensive than buying a taxi medallion, so the cost of entry was disrupted.

So how does that apply to AFAs? We know the cost per IOPS is much lower with AFAs. We also know the costs of sizing and performance management decrease dramatically. One can argue the TCO of AFAs is lower. While AFAs did not enter at the low end or in a new-market segment, they did enter at a periphery, in a market segment (high transactional performance storage) where they offered a lower cost. AFAs disrupted one segment of the overall frame storage market: not the mainframe attach segment, and not the extreme reliability segment, but the assured high-performance segment.

But here is another aspect of AFAs I am seeing—they mandate changes to a customer’s operational model. AFAs were made cost effective in part by using data reduction technologies (deduplication and compression). While there were some hard-drive based storage arrays which leveraged data reduction technologies (NetApp FAS, EMC Celerra, Sun/Oracle ZFS based arrays), these data reduction technologies were not available on high-end frame storage (EMC Symmetrix/DMX/VMAX, HDS USP/VSP, IBM ESS/DS8000). These data reduction technologies worked well for certain workloads: virtual machines benefited from deduplication, and OLTP databases benefited from compression.
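To make the deduplication point concrete, here is a minimal Python sketch of why virtual machine workloads reduce so well. It is an illustration under stated assumptions, not how any particular array works: the 4 KiB fixed block size, the SHA-256 fingerprints, and the made-up "ten VMs cloned from one base image" layout are all hypothetical choices for the example.

```python
import hashlib
import os

BLOCK_SIZE = 4096  # assumed fixed block size; real arrays differ


def dedup_stats(volumes):
    """Return (logical_blocks, unique_blocks) across all volumes.

    Each volume is split into fixed-size blocks, and a block is counted
    as unique the first time its SHA-256 fingerprint is seen.
    """
    seen = set()
    logical = 0
    for volume in volumes:
        for offset in range(0, len(volume), BLOCK_SIZE):
            block = volume[offset:offset + BLOCK_SIZE]
            seen.add(hashlib.sha256(block).digest())
            logical += 1
    return logical, len(seen)


# Hypothetical workload: 10 VMs cloned from the same base OS image
# (256 shared blocks), each with 64 blocks of its own data.
base_image = os.urandom(256 * BLOCK_SIZE)
vms = [base_image + os.urandom(64 * BLOCK_SIZE) for _ in range(10)]

logical_blocks, unique_blocks = dedup_stats(vms)
dedup_ratio = logical_blocks / unique_blocks
print(f"logical={logical_blocks} unique={unique_blocks} ratio={dedup_ratio:.2f}:1")
```

The shared base image is stored once no matter how many VMs reference it, so the ratio improves as more clones are added. That is the property that made VDI such a natural early fit.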

This meant AFAs with built-in data reduction, targeting small, peripheral workloads (VDI, high-transaction OLTP), were set up for easy success.

However, other trends were occurring at the same time. To make better use of expensive high-end frame storage, some DBAs were turning on compression within their database software. Yes, this increased the number of CPUs needed to run the database, which increased cost, but often DB licensing was a sunk cost. It was also possible to compress at the OS/filesystem level. In organizations where IT departments charged back storage capacity to users, it was not unusual for users to turn on compression in their servers to reduce their chargeback.

The second trend over the last five years has been the fear of a data breach, which has driven the need to encrypt data at rest. Storage arrays offer this capability through Self-Encrypting Drives, encryption boards, or software encryption running on the array’s controller, but often storage encryption could only be enabled after upgrading the storage array to a new model. As a result, turning on encryption at the application level (e.g., in the database), at the OS level (encrypting file systems), or at the VM level (using products like HyTrust) was a much faster path to security for many customers. Customers were also told that only host-level encryption ensured data was encrypted “over the wire” in addition to at rest.

The result of either of these technologies is that it eliminates the ability of the data reduction technology in the storage array to provide any benefit, and it returns the cost per gigabyte of flash storage to what it was with early-generation, non-efficient architectures, which ultimately lost out to the AFAs with built-in data reduction.
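The effect described above is easy to reproduce. The following Python sketch is a rough illustration, not a model of any real array: it uses zlib as a stand-in for the array's compression engine, random bytes as a stand-in for ciphertext (on the assumption that well-encrypted data is indistinguishable from random data), and invented order records as the sample workload. The `effective_cost_per_gb` helper and its normalized raw cost of 1.0 are hypothetical.

```python
import os
import random
import zlib


def reduction_ratio(data: bytes) -> float:
    """Original size divided by size after a zlib pass (the 'array')."""
    return len(data) / len(zlib.compress(data, 6))


def effective_cost_per_gb(raw_cost_per_gb: float, ratio: float) -> float:
    """Cost per usable gigabyte; a ratio below 1 is treated as no reduction."""
    return raw_cost_per_gb / max(ratio, 1.0)


# Hypothetical database-like rows with repeated field names and values.
rng = random.Random(0)
statuses = ["NEW", "PAID", "SHIPPED", "RETURNED"]
regions = ["US-EAST", "US-WEST", "EU", "APAC"]
rows = [
    "order_id=%08d;status=%s;region=%s;amount=%d\n"
    % (rng.randrange(10**8), rng.choice(statuses),
       rng.choice(regions), rng.randint(1, 99999))
    for _ in range(20000)
]
plain = "".join(rows).encode("ascii")

# What the array receives when the host compresses or encrypts first.
host_compressed = zlib.compress(plain, 6)
host_encrypted = os.urandom(len(plain))  # stand-in for ciphertext

for label, data in [("plain", plain),
                    ("host-compressed", host_compressed),
                    ("host-encrypted", host_encrypted)]:
    ratio = reduction_ratio(data)
    cost = effective_cost_per_gb(1.0, ratio)  # raw cost normalized to 1.0
    print(f"{label:16s} ratio={ratio:.2f}:1 relative cost/GB={cost:.2f}")
```

Only the plain data gives the array anything to work with. In the other two cases the ratio sits at roughly 1:1, so the relative cost per gigabyte falls back to the raw cost of the flash.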

The only way to benefit from an AFA’s data reduction features is to ensure applications and operating systems are not running host-level compression or encryption. It may mean ripping out products like HyTrust and Vormetric. It may mean internal battles with DBAs. It may mean new terms and conditions in internal SLAs and storage chargebacks. The All-Flash Data Center sounds innovative on paper, but implementing it means working across the traditional IT divides of applications, servers, security, and storage.

There are some data types which are natively compressed. For example, all the current Microsoft Office file formats are compressed. Additionally, most image files are compressed. Traditional file shares full of PowerPoint files are not going to benefit from AFA data reduction. Generally these workloads have never rated high-performance storage, and because of the lack of reducible data, it will take more time for the cost per gigabyte of All-Flash storage to come down to a point to provide the necessary payback to justify migrating these workloads to flash.

Why did I go down this path? It was to point out the potential limits of a disruptive technology. When AFAs were narrowly applied to certain workloads, there was a cost-benefit which accelerated their adoption. When they are applied more broadly, they hit organizational barriers to adoption. Perhaps these barriers mean AFAs do not fit the definition of a disruptive technology. However, in IT I see many “disruptive technologies” which ultimately force significant operational changes on IT organizations. That was true for UNIX, Storage Area Networks, Windows, Linux, and VMware. It will likely be true for All-Flash Storage, Software Defined Networking, and adoption of Cloud Computing.