Massive Clustered Storage. Spectra Logic & Caringo Swarm

Massive Clustered Storage

Right from the very start, we at ACC have always understood the importance of designing scalable, stable and resilient storage solutions. The last ten years have all been about Storage Area Networking: block based clustered storage, presented to servers either by iSCSI or Fibre Channel. The SAN has made a huge impact on today's datacentres. If you ever find yourself walking down those noisy corridors, you will see racks and racks of Dell Compellent, 3PAR, NetApp and Oracle Pillar.

We were one of the very first UK partners to pass certification for LeftHand Networks' iSCSI SAN solutions. Although LeftHand has since been acquired by HP, the virtual edition of its SAN/iQ software remains a great product, providing much of the feature set of the 'big' players mentioned above for an extremely competitive price. It is not just 'scale out' in architecture; with the VSA, it is very much a cost scalable solution too. We have designed and deployed many boot from SAN HP StoreVirtual solutions and have a significant amount of experience in maintaining and managing this SAN, including its integration with Oracle VM.

Many customers, especially the larger ones, are hitting the physical limitations of working with large LUNs, running into block re-allocation issues, or struggling to migrate snapshots on a regular schedule when data transport patterns are variable. On top of all of this, because a SAN is a block level solution, the file system is pretty much ignored, so there is nothing that can be done to help with the organisation and 'searchability' of files residing on a typical block level SAN.

And so we turn our attention to an exciting new chapter in building massive storage clusters: object based solutions, such as Caringo's SWARM.

This is not designed to be a replacement for a SAN, let's be clear about that. Many SANs were originally purchased on the promise of being 'Scale Out': more and more nodes can be added as storage requirements grow, and because compute power and storage are added at the same time, the performance of the entire stack increases with every node added to the cluster. The promise may have been true, but the reality is that it's an expensive promise to keep, especially if most of the data taking up your SAN capacity is infrequently accessed or legacy 'cold' data. A large intelligent object store gives you the ability to offload much of what can be identified as 'cold' data from your SAN, freeing up large amounts of space which can be put to much better use running VMs or perhaps databases with higher performance requirements.

Two Object Solutions, One Objective: Cut The Flab From Your 'Cab!

As you would expect from a heavily research-focused operation, we have spent an immense amount of time researching and testing a number of object storage solutions which we could then recommend to customers. Two distinctly different scenarios came out of this research, so we set about finding the ideal solution for each one.

Scenario 1: Spectra Logic & Tape - The Sequel

Move cold data to a high security archive, either disk ('darkive') or tape (or both), for backup, compliance and cost reasons. Our research told us that there is only one company out there that can deliver this, and it has radically altered people's perspective of tape, a storage medium many thought was dead and buried many years ago. Spectra Logic's Black Pearl S3 Gateway and Arctic Blue, their super dense disk darkive solution, enable large amounts of data to be moved easily and seamlessly, then stored and filed in an orderly, indexed fashion, with automatic data retention policies to comply with data protection laws. Couple a Black Pearl with Arctic Blue (for faster access) whilst at the same time sending data to one of their vast tape libraries, and there can be no more cost effective way to store massive amounts of data.

Scenario 2: Caringo Swarm & Disk - Say Goodbye To RAID

As disk sizes have increased over the years, it's no longer commercially sensible to kiss goodbye to such a large amount of disk space for parity purposes only. Couple that with the performance issues of parity RAID levels 5 & 6, and it's clear that, for massive storage, RAID means massive wastage. Enter Caringo with SWARM, an object store based on erasure encoding: the data itself is split up into parts, and it's at this level, not the disk subsystem, that redundancy is performed. So many more disks can be lost per cluster, and much more data can be stored per cluster, using this type of solution.

There are many levels of erasure encoding, which should be chosen depending on the level of data redundancy that's required - but a popular ratio is 5:2 - 7 parts in total, 5 data 'parts' to 2 parity 'parts' - and (interesting fact alert!) the 'Yotta' part of our logo reflects this with 5 solid and 2 white boxes.
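To put some rough numbers on that 5:2 ratio, here is a minimal sketch of the arithmetic in Python. It is an illustration only, not Caringo's implementation: the function names are hypothetical, and it assumes an ideal erasure scheme in which any parts up to the parity count can be lost without losing data. Three-way replication is modelled as 1 data part plus 2 copies purely for comparison.

```python
from fractions import Fraction


def erasure_profile(data_parts: int, parity_parts: int) -> dict:
    """Summarise an idealised data:parity erasure scheme (hypothetical helper)."""
    if data_parts < 1 or parity_parts < 0:
        raise ValueError("need at least 1 data part and 0 or more parity parts")
    total = data_parts + parity_parts
    return {
        "total_parts": total,
        # Assumption: any 'parity_parts' parts can be lost without data loss.
        "tolerated_losses": parity_parts,
        # Share of raw disk space that holds real data.
        "efficiency": Fraction(data_parts, total),
        # Extra raw space needed on top of the data itself.
        "overhead": Fraction(parity_parts, data_parts),
    }


def raw_capacity_needed(usable_tb: float, data_parts: int, parity_parts: int) -> float:
    """Raw TB required to hold 'usable_tb' of data under the given scheme."""
    return usable_tb * (data_parts + parity_parts) / data_parts


if __name__ == "__main__":
    for label, (k, p) in {"5:2 erasure": (5, 2), "3 full copies": (1, 2)}.items():
        profile = erasure_profile(k, p)
        print(
            f"{label}: {profile['total_parts']} parts, "
            f"survives {profile['tolerated_losses']} lost, "
            f"{float(profile['efficiency']):.1%} efficient, "
            f"{raw_capacity_needed(100, k, p):.0f} TB raw per 100 TB stored"
        )
```

Under these assumptions, both schemes survive the loss of two parts, but 5:2 needs 140 TB of raw disk to hold 100 TB of data, where keeping three full copies needs 300 TB.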

Caringo SWARM is an object store that can be completely provisioned on your own hardware, and can even be provisioned inside Azure. As it's priced per TB of storage used, it's extremely cost effective to get started. The solution replicates to multiple sites and also provides a certain amount of version control, so it makes an excellent distributed object store.

So we would say that if you are looking for more of an archive solution, are starting with a requirement to store a huge amount of legacy data, and retrieval times for most of this data are not mission critical, Spectra Logic's solution would be seriously worth a look.

If you do not need or want a tape component, have access to existing or competitively priced JBOD server storage, are starting with a smaller amount of data, and want to be able to replicate it around, tie in with your existing Windows file servers (Caringo FileFly) and perform a full POC before buying, then we would recommend a solution based on Caringo's product.

Massive Clustered Storage: Block, Byte Or Object - Our Name Is All Over It