How the heck is storage sustaining the data explosion?

I assembled my first PC, with a 32GB HDD, during my first year of engineering in 1999. What the heck has happened in the last 15 years, or rather the last 5 years? Cloud services are expanding, the security landscape is evolving, and BYOD adoption is accelerating, but most importantly, mobile devices and data are exploding. IDC studies show that 200 billion sensors will be on the market in the next 5 years.

IDC’s 2015 predictions say that the greater cloud market is forecast to reach $200B by 2018. Seriously, where the heck is this data coming from? Is a cloud in the sky dropping it? And where on earth are we storing it to get reliability, scalability, and cost-effectiveness?

Gartner projects 40% year-over-year growth in global data generated. Most of it is unstructured data (no tables or databases, no relational components). Object storage is changing the game (it has its own challenges, which I will describe in my next post), but let’s start with the fundamentals of traditional storage systems.

In its most basic form, think of block-level storage as consolidated hard drives (arrays) that are presented to a server as raw volumes and accessed over Fibre Channel or iSCSI. Block-level storage is usually deployed in a SAN (storage area network) environment, and each volume is controlled by the server’s operating system as if it were a local disk.
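To make the block-level idea concrete, here is a minimal Python sketch that treats an in-memory buffer as a tiny volume and reads and writes it in fixed-size blocks by block number. The names `BlockDevice`, `BLOCK_SIZE`, `read_block` and `write_block` are made up for illustration, and the 512-byte block size is an assumption; a real volume would be reached over Fibre Channel or iSCSI, not a Python buffer.

```python
import io

BLOCK_SIZE = 512  # assumed block size in bytes, chosen for illustration


class BlockDevice:
    """Hypothetical stand-in for a raw volume: just numbered fixed-size blocks."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        # A zero-filled buffer plays the role of the disk array.
        self._volume = io.BytesIO(bytes(num_blocks * BLOCK_SIZE))

    def _check(self, block_no: int) -> None:
        if not 0 <= block_no < self.num_blocks:
            raise IndexError(f"block {block_no} is outside the volume")

    def write_block(self, block_no: int, data: bytes) -> None:
        """Overwrite one whole block; short data is padded with zero bytes."""
        self._check(block_no)
        if len(data) > BLOCK_SIZE:
            raise ValueError("data does not fit in one block")
        self._volume.seek(block_no * BLOCK_SIZE)
        self._volume.write(data.ljust(BLOCK_SIZE, b"\x00"))

    def read_block(self, block_no: int) -> bytes:
        """Return the raw bytes of one block; the device knows nothing about files."""
        self._check(block_no)
        self._volume.seek(block_no * BLOCK_SIZE)
        return self._volume.read(BLOCK_SIZE)


device = BlockDevice(num_blocks=8)
device.write_block(3, b"hello block storage")
first_bytes = device.read_block(3)[:19]
```

Notice that the device has no notion of file names or directories. Turning numbered blocks into files is the job of whatever file system the operating system lays on top.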

File storage (a directory hierarchy) manages the layout and structure of files and directories on physical storage attached through an IP network. NAS (Network Attached Storage) is a file-level storage device that uses file-sharing protocols such as FTP (File Transfer Protocol) and Microsoft’s SMB/CIFS (Server Message Block) to support file sharing and other communication interfaces. These protocols run over TCP/IP to copy and transfer files to and from a file server. In its simplest form, imagine one central file server (or Network Attached Storage device) containing all the files in a hierarchy, with all connected devices accessing data through the network (say, a LAN). These network appliances may contain logical redundant storage containers, or arrays, called RAID. Of course, you can configure RAID with mirroring, or with striping and parity. Don’t confuse NAS with DAS (Direct Attached Storage), which is typically an extension to an existing server and is not necessarily networked.
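Here is a minimal sketch of striping with parity, in the spirit of RAID 5 but heavily simplified: the data is split into equal chunks, one XOR parity chunk is computed, and any single lost chunk can be rebuilt from the survivors. The function names (`xor_chunks`, `stripe_with_parity`, `rebuild_missing`) are hypothetical, and real RAID implementations also rotate parity across disks, which is left out here.

```python
from functools import reduce
from typing import List, Optional


def xor_chunks(chunks: List[bytes]) -> bytes:
    """XOR equal-length byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def stripe_with_parity(data: bytes, data_disks: int) -> List[bytes]:
    """Split data into `data_disks` equal chunks and append one parity chunk."""
    chunk_len = -(-len(data) // data_disks)  # ceiling division
    padded = data.ljust(chunk_len * data_disks, b"\x00")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(data_disks)]
    return chunks + [xor_chunks(chunks)]


def rebuild_missing(disks: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild the single missing chunk (marked None) from the surviving ones."""
    missing = [i for i, chunk in enumerate(disks) if chunk is None]
    if len(missing) != 1:
        raise ValueError("this sketch can rebuild exactly one lost disk")
    survivors = [chunk for chunk in disks if chunk is not None]
    repaired = list(disks)
    # XOR of all surviving chunks (data and parity) equals the lost chunk.
    repaired[missing[0]] = xor_chunks(survivors)
    return repaired


disks = stripe_with_parity(b"unstructured data!", data_disks=3)
damaged = [disks[0], None, disks[2], disks[3]]  # pretend disk 1 failed
restored = rebuild_missing(damaged)
recovered = b"".join(restored[:3])
```

Mirroring is the simpler alternative: write the same chunk to two disks and read from whichever survives, at the cost of using twice the capacity.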

Amazon S3 changed the game with object storage. Instead of organizing files in a directory hierarchy, object storage systems store files in a flat organization of containers and use unique IDs and metadata to retrieve them over an HTTP interface. The beauty of object storage is its practically endless capacity: it scales out by adding nodes (which can be geo-replicated and dispersed). It offers only a minimal set of operations: store, retrieve, copy, and delete (not as extensive as a file system’s). Object storage allows inexpensive, scalable, redundant, reliable (through object replication), and self-healing retention of massive amounts of unstructured data, which matters because 90% of current data is unstructured. So now you know the format in which your Facebook photos or your music are stored in public cloud storage systems.
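The flat, ID-based model described above can be sketched in a few lines of Python. This is an in-memory toy, not the API of Amazon S3 or any other product: the class `ObjectContainer` and its four methods are hypothetical names that mirror the store, retrieve, copy, and delete operations, and a real system would expose them over HTTP and replicate objects across nodes.

```python
import uuid
from typing import Dict, Optional, Tuple


class ObjectContainer:
    """Hypothetical flat container: no directories, only IDs mapped to objects."""

    def __init__(self) -> None:
        # object ID -> (payload bytes, metadata dict)
        self._objects: Dict[str, Tuple[bytes, Dict[str, str]]] = {}

    def store(self, data: bytes, metadata: Optional[Dict[str, str]] = None) -> str:
        """Save the payload and return a newly generated unique ID."""
        object_id = uuid.uuid4().hex
        self._objects[object_id] = (data, dict(metadata or {}))
        return object_id

    def retrieve(self, object_id: str) -> Tuple[bytes, Dict[str, str]]:
        """Look the object up by ID alone; raises KeyError if it does not exist."""
        data, metadata = self._objects[object_id]
        return data, dict(metadata)

    def copy(self, object_id: str) -> str:
        """Duplicate an object under a fresh ID, keeping its metadata."""
        data, metadata = self._objects[object_id]
        return self.store(data, metadata)

    def delete(self, object_id: str) -> None:
        """Remove the object; the ID is no longer valid afterwards."""
        del self._objects[object_id]


container = ObjectContainer()
photo_id = container.store(b"\x89PNG...", {"content-type": "image/png", "owner": "alice"})
photo_copy_id = container.copy(photo_id)
container.delete(photo_id)
payload, meta = container.retrieve(photo_copy_id)
```

There is no path like `/photos/2015/beach.png` anywhere in this sketch. Anything you would normally encode in a directory name goes into the metadata instead.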

[Image (both) Credit: Object Storage by SwiftStack at]

TechTarget’s Margaret Rouse describes object storage as valet parking. When a customer uses valet parking, he exchanges his car keys for a receipt. The customer does not know where his car will be parked or how many times an attendant might move the car while the customer is dining. In this analogy, a storage object’s unique identifier represents the customer’s receipt.
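The valet analogy can be turned into a small Python sketch: the client keeps only a receipt, while the store is free to move the data between nodes behind the scenes. Everything here (`ValetStore`, `park`, `move`, `fetch`, the node names) is a hypothetical illustration and not how any particular object store places data.

```python
import uuid
from typing import Dict, List


class ValetStore:
    """Hypothetical store that hides which node currently holds each object."""

    def __init__(self, node_names: List[str]) -> None:
        self._nodes: Dict[str, Dict[str, bytes]] = {name: {} for name in node_names}
        self._location: Dict[str, str] = {}  # receipt -> node name

    def park(self, data: bytes) -> str:
        """Store data on the emptiest node and hand back a receipt (the object ID)."""
        receipt = uuid.uuid4().hex
        node = min(self._nodes, key=lambda name: len(self._nodes[name]))
        self._nodes[node][receipt] = data
        self._location[receipt] = node
        return receipt

    def move(self, receipt: str, target_node: str) -> None:
        """The attendant moves the car: placement changes, the receipt does not."""
        source_node = self._location[receipt]
        self._nodes[target_node][receipt] = self._nodes[source_node].pop(receipt)
        self._location[receipt] = target_node

    def fetch(self, receipt: str) -> bytes:
        """Return the data for a receipt, wherever it currently lives."""
        return self._nodes[self._location[receipt]][receipt]


store = ValetStore(["node-a", "node-b", "node-c"])
receipt = store.park(b"my car")
store.move(receipt, "node-c")  # moved once while the customer is dining
store.move(receipt, "node-b")  # and again
car = store.fetch(receipt)
```

The caller never passes a node name to `fetch`, which is the whole point: the receipt stays valid no matter how often the object is moved.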

Let’s build on these fundamentals and move on to storage solutions in OpenStack, with Cinder and Swift, in my next post…

