Monday 14 April 2014

Data Storage Placement Host-side or SAN-side: Which side are you on?

Storage Magazine Article: Which side are you on?

DataCore's Augie Gonzalez considers both sides of the storage placement argument and concludes that maybe we don't have to take sides at all

There is a debate raging as to where data storage should be placed: inside the server or out on the storage area network (SAN). The split between the opposing views of the network grows wider each day. The controversy has raised concerns among the big storage manufacturers, and will certainly have huge ripple effects on how you provision capacity going forward.

Twenty years ago, SANs were a novelty. Disks primarily came bundled in application servers - what we call Direct Attached Storage (DAS) - reserved for each host. Organisations purchased the whole kit from their favourite server vendor. DAS configurations prospered but for two shortcomings: one with financial implications and the other affecting operations.

First, you'd find server farms with a large number of machines depleted of internal disk space, while the ones next to them had excess. We lacked a fair way to distribute available capacity where it was urgently required. Organisations ended up buying more disks for the exhausted systems, despite the surplus tied up in the adjacent racks.

The second problem with DAS surfaced with clustered machines, especially after server virtualisation made virtual machines (VMs) mobile. In clusters of VMs, multiple physical servers must access the same logical drives in order to rapidly take over for each other should one server fail or get bogged down.

SANs offer a very appealing alternative - one collection of disks, packaged in a convenient peripheral cabinet where multiple servers in a cluster can share common access. The SAN crusade stimulated huge growth across all the major independent storage hardware manufacturers, including EMC, NetApp and HDS, and it also spawned numerous others. Might shareholders be wondering how their fortunes will be impacted if the pendulum swings back to DAS, and SANs fall out of favour?

Such speculation is fanned by dissatisfaction with the performance of virtualised, mission-critical apps running off disks in the SAN, which has led directly to the rising popularity of flash cards (solid-state memory) installed directly on the hosts.

The host-side flash position seems pretty compelling, much like DAS did years ago before SANs took off. The concept is simple: keep the disks close to the applications and on the same server. Don't go out over the wire to access storage for fear that network latency will slow down I/O response.

The fans of SAN argue that private host storage wastes resources and it's better to centralise assets and make them readily shareable. Those defending host-resident storage contend that they can pool those resources just fine: introduce host software to manage the global namespace so they can get to all the storage regardless of which server it's attached to. Ever wondered how? You guessed it: over the network. Oh, but what about that wire latency? They'll counter that it only matters in the unusual case when the application and its data do not happen to be co-located.

Well, how about the copies being made to ensure that data isn't lost when a server goes down? You guessed right again: the replicas are made over the network.

What conclusion can we reach? The network is not the enemy; it is our friend. We just have to use it judiciously.

Now then, with data growth skyrocketing, should organisations buy larger servers capable of housing even more disks? Why not? Servers are inexpensive, and so are the drives. Should they then move their terabytes of SAN data back into the servers?

For many organisations, it makes perfect sense to have some storage inboard on the servers up close to the applications, augmented by some externally shared storage on premises, and the really bulky backups in the public cloud - especially those requiring long retention. The problem for those taking sides is that they refuse to accept the other alternatives. And so it's all about picking one location versus the other.

What if, instead of choosing sides, one designs software to leverage storage assets in all three places: the Server, the SAN and the Cloud, eliminating the prejudice over location? Organisations are then free to route the requests where most appropriate, and put the network to good use when it's beneficial.

Techniques like infrastructure-wide automated storage tiering put these principles into action. Really active blocks stay close to the programs, with requests serviced from local flash storage, whereas infrequently used data gets directed further away over the wire.
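
To make the tiering idea concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular product works: the tier names, the `assign_tiers` function and the slot counts are all hypothetical, and real tiering engines weigh far more than a simple access count.

```python
from collections import Counter

def assign_tiers(access_counts, flash_slots, san_slots):
    """Place the hottest blocks on server flash, the next hottest on the SAN,
    and everything else in the cloud.

    access_counts maps a block id to how often it was accessed.
    flash_slots and san_slots are hypothetical capacities, in blocks.
    """
    # Rank blocks from most to least active; ties are broken by block id
    # so the result is deterministic.
    ranked = sorted(access_counts, key=lambda block: (-access_counts[block], block))

    placement = {}
    for rank, block in enumerate(ranked):
        if rank < flash_slots:
            placement[block] = "server_flash"   # closest to the application
        elif rank < flash_slots + san_slots:
            placement[block] = "san"            # shared, one hop over the wire
        else:
            placement[block] = "cloud"          # furthest away, cheapest
    return placement


# Example: four blocks, room for one on flash and two on the SAN.
counts = Counter({"a": 90, "b": 40, "c": 3, "d": 1})
print(assign_tiers(counts, flash_slots=1, san_slots=2))
```

Re-running the assignment periodically with fresh counts is what would make the placement "automated": blocks that cool off drift outward, and blocks that heat up move back toward the server.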

In-memory caching plays a critical role in keeping everything going smoothly. It leverages the super-high-speed DRAM close to the app to mask potential downstream delays from slower hardware.

Software capable of exerting dynamic control over caches, storage placement, replicas, and thin provisioning - that's where all the intelligence comes into play, and the key to distributing data appropriately across all three locations (server, SAN and cloud).
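
The caching point can be sketched in a few lines of Python. This is a simplified read cache with least-recently-used eviction, offered as an assumption about how such a cache might behave rather than a description of any vendor's implementation; the `ReadCache` class and the `backend_read` callback are hypothetical names.

```python
from collections import OrderedDict

class ReadCache:
    """Keep recently read blocks in memory so repeat reads skip the slow path."""

    def __init__(self, backend_read, capacity):
        self.backend_read = backend_read  # slow read: local disk, SAN or cloud
        self.capacity = capacity          # number of blocks held in DRAM
        self.entries = OrderedDict()      # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.entries:
            # Cache hit: serve from memory and mark the block as recently used.
            self.entries.move_to_end(block_id)
            self.hits += 1
            return self.entries[block_id]

        # Cache miss: go downstream, then remember the result.
        self.misses += 1
        data = self.backend_read(block_id)
        self.entries[block_id] = data
        if len(self.entries) > self.capacity:
            # Evict the least recently used block to stay within capacity.
            self.entries.popitem(last=False)
        return data


# Example: a stand-in for slow storage that records every downstream read.
downstream_reads = []

def slow_read(block_id):
    downstream_reads.append(block_id)
    return f"data-{block_id}"

cache = ReadCache(slow_read, capacity=2)
for block in (1, 2, 1, 3, 2):
    cache.read(block)
print(cache.hits, cache.misses, downstream_reads)
```

The second read of block 1 never leaves memory, which is the masking effect described above; block 2 has to be fetched again only because it was evicted to make room.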

Don't push me to take sides. Different situations call for different approaches; that's why the industry has progressed this far. One thing is for certain: it's a debate that is set to run and run.
