I was asked by a vendor recently why customers would purchase another vendor’s storage solution when theirs could offer both the functionality and performance at a lower price. Whilst there can be many answers to such a question, and I did supply several, the one that got me thinking was ‘Because they know it, they like it and because it has never let them down’. In other words, the same reason that the chap asking the question was on his 5th BMW: he liked and trusted the brand.
However, in the wonderful world of data all you need to understand is that things ain’t what they used to be. More than perhaps ever before, success, maybe even survival, may depend on a company’s ability to cultivate loyal, maybe even devoted, repeat customers. The thing about loyalty is that it isn’t always as pure as we’d like it to be. Sometimes (ideally) it’s earned, but sometimes it can be all but forced upon us for technical or commercial reasons, or possibly the disruption is too excessive to consider change.
Storage vendors have become very good at offering competitive upgrade programs designed to retain the customer, and in general these work very well for both the vendor (who retains the footprint) and the customer (who gets a commercially compelling deal). Equally vendors can offer deals to replace another technology with their own, whilst enabling ease of migration between platforms. But therein lies the challenge: technology is always unique to a single storage vendor; whilst the underlying disk technology may be the same, the connections and access methods are invariably incompatible.
Now, it’s very possible to carry out data-in-place upgrades to controllers while the existing disk technology remains in place. This type of solution is available from the majority of vendors, and it does encourage a sense of loyalty to them. Obviously the decision to change technology in an environment is a hard, or brave, one to make. Therefore in many circumstances it simply becomes easier for customers to stick with what they have in place.
However, as I said earlier, times they are a changin’, and the world of data and storage is changing more rapidly than other areas, and these changes may make it harder for incumbent vendors to retain their existing customer base. As we enter the world where software is king and everything assumes the software defined banner, the need to stick with a previously preferred vendor suddenly disappears. Going further, moving the intelligence previously provided by the disk controllers into a software layer not only removes the need to have loyalty to a vendor, it also removes the need to be tied to dedicated block, file or object based arrays, or to specific features of an existing array.
New and emerging vendors are already using commodity components and are providing their USP and value through their unique software, but even this can become compromised as the software layer evolves.
It’s a challenge that all existing storage vendors will have to face; this rise of commodity infrastructure in all its guises is coming fast and not stopping, and whilst this applies to all infrastructures it’s at the data layer where the most radical changes may occur.
As always, it’s a really interesting time to be working in storage and data.
Data is all around us. But does it inform us, entertain us, connect us, influence us and divide us? Or is it in danger of consuming us?
We all know data is growing rapidly, and all kinds of statistics exist about creating more data in the last two years than in all of creation before that time. If all words spoken by every human who ever lived were committed as text it would consume just over 2.2 Exabytes of capacity; in 2014 we will create more than 20x this figure.
On its own much of this may be a problem or it may be nice to have.
Let’s consider some raw data… 12. On its own it means nothing to us, other than it’s a number, some raw data.
What if we tie it in with another piece of data, °C? Now that we have joined two separate pieces together we know we have a temperature, but that is still really not much use to us.
But consider if we add another two elements of potentially unrelated data, London & Tuesday; then suddenly we have
12°C London Tuesday
Now we are able to use the data to make an informed decision; we know what to wear, how to travel and what to carry. Several pieces of data used together now have relevance; they have become information.
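For anyone who thinks in code, the weather example can be sketched in a few lines of Python. This is purely an illustration: the `Reading` class, its field names and the rule inside `is_information` are my own inventions for this post, not part of any real system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reading:
    """A raw value plus the other pieces of data that make it usable."""
    value: float
    unit: Optional[str] = None    # e.g. "°C"
    place: Optional[str] = None   # e.g. "London"
    day: Optional[str] = None     # e.g. "Tuesday"

    def is_information(self) -> bool:
        # Hypothetical rule: the value only becomes information once
        # every accompanying piece of data is present.
        return None not in (self.unit, self.place, self.day)

    def describe(self) -> str:
        parts = [f"{self.value:g}{self.unit or ''}", self.place, self.day]
        return " ".join(p for p in parts if p)


raw = Reading(12)
forecast = Reading(12, "°C", "London", "Tuesday")

print(raw.describe(), raw.is_information())
print(forecast.describe(), forecast.is_information())
```

The number 12 never changes between the two objects; only the data joined to it does, and that is what turns it into something we can act on.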
Let’s take another piece of raw data, what do you think of if I give you the word ‘Bridge’ – a river?
What if I add playing cards? Suddenly you think of a card game.
What if I change cards to dentist? Something else again?
The second part of information is context; without context data is no use to us.
For data to be of use we need both relevance and context; when we have both we can use the information to make informed decisions. We can use it in four ways:
- Descriptive – to understand what’s happened
- Diagnostic – to understand why it’s happened
- Predictive – to forecast what might happen
- Prescriptive – to know what to do when it does happen
Sources of data without context and relevance are of no use; data with context and relevance can be used together in ways we’ve not yet dreamed of.
Remember the most valuable currency in the world is not money; it’s not data, it’s information.
Congratulations to anyone that spotted the above to be a quote from the third President of the USA, Thomas Jefferson, and although he may have said it in 1803 the relevance remains today.
I’m no longer sure which generation I belong to. I come from an age when disk drives could be measured in Megabytes; nowadays we don’t talk in Gigabytes and some of us don’t even talk in Terabytes any more. We know data is exploding and we know technology develops to cope with this; however that’s evolution not revolution.
I believe we are at the cusp of the next revolution in technology. To be the next big thing, it has to fundamentally change how we do things. It has to change how we look at and think about our world; it has to be revolutionary.
It used to be that we got excited by individual pieces of technology; maybe our first laptop, maybe our first 1TB drive, maybe our first smartphone which you just love to hold, and be seen with.
But whilst these may be considered revolutionary, they remain point solutions – they are single dimensional.
We’re moving into a multi-dimensional world of IT. We’re moving from single dimensional solutions to multi-dimensional solutions:
- Where everything has an impact on everything else
- Where every piece of technology has to interact with everything else
That’s just in business, what about the personal world, where your smart phone has to interact with your car, which has to interact with your microwave, which has to interact with your television, so when you get home everything is in its place. How do you choose? And more importantly how do you control it all?
The problem with multi-dimensional solutions is that there are so many choices to be made. We are seeing the start of this wave now in the ‘Software Defined’ world, where it gets harder to identify components of a solution, but really why should we care anyway?
So what do we do in this multi-dimensional, software-defined world of IT?
- Should you ignore everyone and continue as you are, after all it works doesn’t it?
- Maybe putting everything in the cloud and consuming as a service is the answer
- Why not adopt all the new methods, be seen to be progressive but continue to do everything the same old way?
- What if you adopt every new solution out there, change all your processes to get all that benefit you’ve been promised? How much disruption would that cause? & what if it doesn’t deliver?
It is a minefield out there, and as with all minefields it’s always good to have someone with experience to guide you through it. This is where Computacenter come in.
Every generation has an obligation to renew, reinvent, re-establish, re-create structures and redefine its realities for itself. Get ready for the next generation.
EMC World 2013 took place in the Venetian Hotel and Sands Conference centre on 6th-10th May 2013. Attended by over 12,000 staff, partners and customers, it saw several product announcements and a range of upgrades to existing technology. The main points of interest were as follows:
- ViPR (pronounced Viper); The major announcement. EMC’s entry into the world of Software Defined Storage. ViPR will be (initially at least) an appliance designed to abstract the control plane and data plane. The control plane will effectively be a storage hypervisor, managing the storage (data plane) underneath, which on day one will be EMC’s VNX, VMAX or Isilon and any NetApp arrays, other vendors to follow. The data plane can be commodity storage in the future. First products due to ship late 2013, so final verdict is reserved until then. Initial release sounds very much like a Gen 1 product, so expect push back from other vendors, but the roadmap sounds fairly compelling, and comes under the “product to watch” category. Rumour has it that EMC believe this to be their best Gen 1 product yet released, and that it is their future. ViPR will offer pooled storage resources with Block, File and Object based presentation and include simplified management and automation. Full review in a separate post.
- Pivotal – Announced before EMC World, but had a lot of focus. Pivotal is a partnership between EMC & VMware with GE investing heavily; it is designed for next generation Cloud and Big Data applications. Pivotal splits into three areas; Data Fabrics, Application Fabrics and Cloud Fabrics. Pivotal 1 launched late 2013, again one to watch.
- XtremIO – Available now in limited quantities but a big focus. EMC’s All-Flash Array (AFA). Provides a lot of the functionality expected of Enterprise class arrays, combined with very high performance. Want to see one? Contact me, I’ve got one!
- EMC Velocity Partner Program – the partner program changes to allow all partners to be “Business Partners” with specialities in relevant areas. Look out for Computacenter changing from one “Velocity Signature Solution Centre” logo to about 20 different Business Partner logos. Those PowerPoint slides suddenly got very busy.
- Isilon upgrades – Isilon is proving to be an excellent acquisition for EMC; look out for forthcoming enhancements including deduplication, auditing ability and integration with HDFS, combined with additional scalability. Also the required enhancements to the SyncIQ replication functionality are being delivered.
- SRM Enhancements – New suites of management products, sharing a common interface with ViPR. Let’s face it – these were needed.
- Continuous Availability enhancements – The ability to combine VSPEX + VPLEX is designed to eliminate complexity in this area for relevant customers
- VNX upgrades are on the way, but still under NDA (if you are internal ask me nicely)
- BRS (Backup & Recovery Services) – Enhancements to the Data Domain range, with further development in Avamar technology, mean this remains a focus area for both EMC and partners.
Summary; EMC World remains one of the Must-Attend events in the industry. Whilst some of the announcements are of future products which are work in progress, these do give an insight into the direction the company is going. Joe Tucci stated that EMC will remain true to its roots, but with an increasing investment in software based products. EMC World proved a worthwhile investment in time.
As 2011 was a year of us talking about “Cloud”, closely followed by the “Big Data” wave of 2012 then 2013 is shaping up nicely as the year of the “Software-Defined” entity, where multiple technologies are being covered by the “SDx” banner. Let’s have a brief look at what this means for the world of storage.
In the world of data we are used to constants; Controllers that manage the configuration of the environment and the placement of data, disks grouped together using RAID to protect data and the presentation of this data to servers using fixed algorithms. In effect when we wrote data we knew where it was going and could control its behaviour; we could replicate it, compress it, de-duplicate it and provide it with the performance level it needed, and when it needed less performance we just moved it somewhere else – all controlled within the storage array itself.
Software Defined Storage changes this model; it can be thought of as a software layer, put in place to control any disks attached to it. The storage services we are used to (snapshots, replication, de-dup, thin provisioning etc.) are then provided to the Operating System from this layer. This element of control software will be capable of sitting on commodity server hardware, in effect becoming an appliance, initially at least, and will be able to control commodity disk storage.
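To make the idea of a software layer a little more concrete, here is a toy sketch in Python. Everything in it is hypothetical and invented for illustration (the `Disk`, `CommodityDisk` and `ControlLayer` names, and the naive round-robin placement); it is not how any vendor's product works. It simply shows one storage service, replication, living in software above interchangeable disks rather than inside an array controller.

```python
from typing import Dict, List, Protocol


class Disk(Protocol):
    """Anything that can store and return blocks; the vendor is irrelevant."""
    def write(self, block_id: str, data: bytes) -> None: ...
    def read(self, block_id: str) -> bytes: ...


class CommodityDisk:
    """Stand-in for a cheap disk: just a dictionary held in memory."""
    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks: Dict[str, bytes] = {}

    def write(self, block_id: str, data: bytes) -> None:
        self.blocks[block_id] = data

    def read(self, block_id: str) -> bytes:
        return self.blocks[block_id]


class ControlLayer:
    """Hypothetical software layer offering replication over any disks."""
    def __init__(self, disks: List[Disk], copies: int = 2) -> None:
        self.disks = disks
        self.copies = min(copies, len(disks))
        self.placement: Dict[str, List[int]] = {}

    def write(self, block_id: str, data: bytes) -> None:
        targets = self.placement.get(block_id)
        if targets is None:
            # Naive round-robin placement across the available disks.
            start = len(self.placement) % len(self.disks)
            targets = [(start + i) % len(self.disks)
                       for i in range(self.copies)]
            self.placement[block_id] = targets
        for t in targets:
            self.disks[t].write(block_id, data)

    def read(self, block_id: str) -> bytes:
        # Any surviving replica will do.
        for t in self.placement[block_id]:
            try:
                return self.disks[t].read(block_id)
            except KeyError:
                continue
        raise KeyError(block_id)


disks = [CommodityDisk("a"), CommodityDisk("b"), CommodityDisk("c")]
layer = ControlLayer(disks)
layer.write("blk-1", b"hello")
disks[0].blocks.clear()          # simulate losing the contents of one disk
print(layer.read("blk-1"))       # still readable from the second copy
```

The disks know nothing about replication; swap `CommodityDisk` for anything else with the same two methods and the service on top carries on unchanged.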
This is not quite the same as storage virtualisation, where a control plane manages a number of storage resources, pooling them together into a single entity; rather it separates out the management functionality, removing the need for the storage controllers – the most expensive part of a data solution. Therefore one of the driving factors for the uptake of Software Defined Storage is an obvious reduction in cost, and the ability to provide data services regardless of the hardware you choose.
The challenge to this is that data should be regarded differently to other aspects of the environment; data is permanent, packets traversing a network are not, and even the virtual server environment does not require any real form of permanence. Data must still exist, and exist in the same place, whether power has been present or not. We are now starting to see a generation of storage devices (note I was careful not to use the word arrays) which are looking more capable of offering a Software Defined Storage service, through the abstraction of the data and controller layers.
So what does this all mean for storage in the datacentre?
My main observation is that physical storage arrays will be with us for a long time to come and are not going away. However the potential for disruption to this model is greater than ever before; the ability to use commodity type storage and create the environment you want is compelling. With the emerging ability of software to take commodity hardware, often from several vendors simultaneously, and abstract the data layer, the challenge to the traditional large storage vendors becomes a real and present danger.
I believe the rate of change towards the software defined storage environment will ultimately be more rapid, and see greater early adoption, than the proven concepts of server virtualisation did. It will cause disruption to many existing major vendors, but ultimately end-users will still require copious amounts of disk technology, so the major players will remain exactly that. Whilst some niche players may make it through, the big boys will still dominate the playground.
“Data is the new oil”
“The most valuable currency in the world is not money, it’s information”
– A couple of great quotes written by people much more eloquent than me. However I do have one of my own:
Data is the new rock’n’roll
Just as rock’n’roll transformed the music scene, the use, and future potential use, of information is dramatically changing the landscape of a data centre. Historically the storage array was effectively the drummer of the band, required but sitting fairly quietly in the background, and whilst a vital component it was not necessarily the first thing people thought of when putting the band together. Even now, if you look at a picture of any band, the drummer is the one hanging about aimlessly in the background; try naming the drummer in any large and well-known band, it’s much harder than you think. And so it was with storage and data; the storage array would sit somewhere towards the back of the datacentre whilst the shiny servers were the visible component, and the items that got the most attention.
As we hit 2013 that all changes; the storage array is the Kylie of the datacentre, it’s the sexiest piece of equipment in there. And so it should be given that upwards of 40% of a customer’s IT budget is spent simply on provisioning the capacity to house data.
At Computacenter, we’ve made a large investment in our Solution Centre. What sits in the front row now? Of course it’s the data arrays; with the latest technology from EMC, HP, HDS, IBM and NetApp all showcased. Why is it front row? Obviously as it’s the most important component of any solution nowadays. And of course, it looks sexy, or is that just me?
The storage array is now front and centre, it’s the first component to be designed when re-architecting an environment. Why? Simply because a customer’s data is their most valuable asset, it’s transforming the way people do business; it’s changing the way we interact with systems and even each other, your data is now the lead singer in the band.
Data is the one thing that is getting attention within the business; it’s the one thing you have making the front pages of “Heat” magazine – Where’s it going? What’s it doing? Is it putting on weight? Is it on a diet? What clothes is it in? Should it be in rehab? But as the manager of the data (or the band) there is one simple question that you want answered; how do I make money out of it?
And that, dear reader, is the $64,000 question. The good news is that it is becoming ever more possible to use your data as a revenue generation tool. We are only starting to see business value being generated from data; as 2013 progresses we will see some niche players mature (and possibly be acquired), we’ll see an increased push from the mainstream vendors and we’ll start to see ways of manipulating and using data that we just couldn’t contemplate when the storage was simply providing the rhythm section.
Even converged systems, the boy bands of the decade, which perform in harmony always have one better singer than the rest; well, he’s the data.
So: Compute, Networking, and Software, the gauntlet is down; Data is the new rock God, it’s the Mick Jagger to your Charlie Watts, you want the crown back? Come and get it, but for now it’s all mine.
All the data architects out there can join me as I sing (with apologies to Liam & Noel) “…Tonight, I’m a rock’n’roll star!”
As a follow up to my recent blog “Cut me – I bleed data”, where I looked at the potential for DNA storage, I thought I would look at how the human body can create data, and how it can be used for our benefit. We are all used to the concept of pedometers; where a small device carried on the person counts the number of steps we take in a day. I’m fairly sure all the devices I’ve tried are faulty as it must be more than 300 steps from home to car to office to desk to coffee shop, right? Walking 10,000 steps per day is good for your health apparently, so I may be a little bit short of my daily target.
However a few things caught my eye recently; the first two are very similar – the “Fitbit” and Nike FuelBand. Both work in similar fashion and take the pedometer concept to the next level. These devices have the same basic aim; to encourage us to lead a healthy active lifestyle and to monitor our progress and feed it back in a way that is of benefit to us. They can track our steps, distance travelled, calories consumed and can measure if we are climbing stairs. We can use the App provided on our smartphones, tablets or any other device to input the food we consume and track our goals graphically if we want.
Ever woken up tired in the morning, wanting just another 5 minutes? Well, the next interesting thing they can do is measure how we sleep and what our sleep patterns are; this can then be used to wake us gently in the correct sleep phase to ensure we are ready for the day. Without thinking about it you are slowly building a database about yourself; you create the data and use the instrument to record it. And you wondered where all the growth of data you keep hearing about is coming from? Some of it is your fault, I’m afraid.
That’s all data generation we can control; we choose to wear the device, download the data wirelessly, stand on the wireless scales and transfer information about ourselves. But what about things we would like to control but are really not sure how to? What if we wanted to measure heart rate, brain activity, body temperature and hydration levels and rather than having our own database we wanted to share it with our doctor or consultant? We’re not too far away from reaching that stage.
An American based company has piloted the concept of stretchable electronics products that can be put on things like shirts and shoes, worn as temporary tattoos or installed in the body. These will be capable of measuring all the criteria above. Another company will begin a pilot program in Britain for a “Digital Health Feedback System” that combines both wearable technologies and microchips the size of a sand grain that can ride on a pill right through you. Powered by your stomach fluids, it emits a signal picked up by an external sensor, capturing vital data. Another firm is looking at micro needle sensors on skin patches as a way of deriving continuous information about the bloodstream.
The data generated by this technology could be used for Business Intelligence purposes in the healthcare markets; it could be shared between yourself and your doctor, allowing proactive activity to occur to improve the care offered, improve efficiencies and ultimately reduce costs. No more waiting 7 days to see a doctor; your chosen device downloads data which can be shared with your practitioner, who in turn sends you an email recommending more exercise and more vegetables in your diet.
The ability to use anonymous data from a group of patients would allow health care providers to spot patterns over an entire population or specific geographies. For example, the need for continuous data on blood glucose levels, particularly for Type 1 diabetes patients, has become critical in the treatment of the disease, providing impetus for monitoring devices.
If this kind of information exists for a lot of people, it is arguably folly to not look for larger trends and patterns. And not just in things like your blood count, because overlays of age, educational level, geography and other demographic factors could yield valuable insights. The essence of the Big Data age is the diversity of data sets combined in novel ways.
These technologies could be used to get people with difficult to pin down conditions like chronic fatigue to share information about themselves; this could include the biological data from devices, but also things like how well they slept, what they ate and when they got pain or were tired. Collectively, this could lead to evidence about how behaviour and biology conjure these states, and ultimately could lead to a solution to such problems.
So it’s not just businesses that can benefit from the analysis of data; individuals and the population at large are potential beneficiaries of the emerging ability of technology to provide analysis of seemingly random collections of data. As I hit the weekend I may not need a wearable electronic device to tell me my brain activity is slowing down or my hydration levels increase, but it won’t slow down the amount of data I’m able to generate on myself, and the contribution this data makes to my future health. Maybe I’ll be able to store my personal database on my own DNA, who knows?
I decided to clean out my home office; I’d had enough of the 56K modems lying around, and needed the space. What I didn’t expect was to find a museum of data storage concentrated in such a small space. I suspected at the time I wouldn’t need the 5.25” 720k floppy disks to upgrade to VMS v5.1 again, but who knows maybe I should keep them – so I did, along with the 2000ish 1.44MB floppy disks and random associated hard disks. Now when I Google floppy disks the first thing that appears is an explanation of what a floppy disk is, or rather was.
Next I moved onto some more recent technology; surely I wouldn’t have to worry about throwing out USB memory sticks, would I? Having counted somewhere around 100 of the things lying around the house I decided that this was maybe the time that I didn’t really need 10x 64MB sticks cluttering up space, after all my new shiny 64GB version is now 1000x bigger.
This got me thinking about the state of the data storage market, and the changes going on. Whilst the capacity of floppy disks rose slowly and fairly consistently we have seen some spectacular changes in the storage marketplace. We got used to disk capacities doubling every 2 years, then this changed to 18 months, then suddenly the 2GB drives became 200GB then 400, then suddenly the 1TB drive had landed.
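As a rough back-of-an-envelope check on those doubling rates, here is a small sketch. The `years_to_grow` helper is my own invention and it assumes perfectly steady doubling, which real product roadmaps never quite manage; going from 2GB to 1TB (treated here as 1000GB) is a factor of 500, or roughly nine doublings.

```python
import math


def years_to_grow(start_gb: float, end_gb: float, doubling_years: float) -> float:
    """Years needed to grow from start to end at a steady doubling period."""
    doublings = math.log2(end_gb / start_gb)
    return doublings * doubling_years


for period in (2.0, 1.5):
    years = years_to_grow(2, 1000, period)
    print(f"doubling every {period} years: about {years:.1f} years from 2GB to 1TB")
```

Shortening the doubling period from two years to eighteen months takes a quarter off the journey, which goes some way to explaining why the big drives seemed to arrive so suddenly.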
It was at this time we started to expect development to slow down – after all as a wise Star Trek engineer once said “you cannae change the laws of physics, Captain”. Well, you know what Scotty, actually we can and did; 2TB drives appeared, now 3TB are not uncommon in datacentres and 4TB are available on Amazon.
Surely sometime disk drives have to stop evolving? Well, yes and no; they may stop evolving in their current form, but the requirement to store more and more data, and to hold it for longer and longer, goes on unabated. Hmmm, what do we do now?
Well, change the form of course. When it comes to storing information, hard drives don’t hold a candle to DNA. Our genetic code packs billions of gigabytes into a single gram. A mere milligram of the molecule could encode the complete text of every book in the British Library and have plenty of room to spare. All of this has been mostly theoretical—until now. In a new study, researchers stored an entire genetics textbook in less than a picogram of DNA—one trillionth of a gram—an advance that could revolutionise our ability to store data.
Initially there may seem to be some problems around using DNA to store data; first, cells die—not a good way to lose your valuable information. They also naturally replicate, introducing changes over time that can alter the data (and whilst we accepted this on a floppy disk it’s unthinkable now). To get around this challenge a research team at Harvard created a DNA information-archiving system that uses no cells at all. Instead, an inkjet printer embeds short fragments of chemically synthesised DNA onto the surface of a tiny glass chip. To encode a digital file, researchers divide it into tiny blocks of data and convert these data not into the 1s and 0s of typical digital storage media, but rather into DNA’s four-letter alphabet of As, Cs, Gs, and Ts. Each DNA fragment also contains a digital “barcode” that records its location in the original file. Reading the data requires a DNA sequencer and a computer to reassemble all of the fragments in order and convert them back into digital format. The computer also corrects for errors; each block of data is replicated thousands of times so that any chance glitch can be identified and fixed by comparing it to the other copies.
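The encode, barcode, replicate and reassemble steps described above can be sketched in Python. Please treat this as a toy: the two-bits-per-base mapping, the 8-base barcode, the 4-byte blocks and all the function names are my own assumptions for illustration, not the scheme the researchers actually used, and a damaged barcode is not handled at all.

```python
from collections import Counter
from typing import Dict, List

BASES = "ACGT"       # assumed mapping: 00->A, 01->C, 10->G, 11->T
ADDRESS_BASES = 8    # 8 bases = 16 bits of block address (the "barcode")
BLOCK_BYTES = 4      # tiny blocks, purely for illustration


def bytes_to_bases(data: bytes) -> str:
    """Turn each byte into four bases, two bits at a time."""
    return "".join(BASES[(byte >> shift) & 0b11]
                   for byte in data for shift in (6, 4, 2, 0))


def bases_to_bytes(bases: str) -> bytes:
    """Inverse of bytes_to_bases; expects a multiple of four bases."""
    out = bytearray()
    for i in range(0, len(bases), 4):
        byte = 0
        for base in bases[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def encode(data: bytes) -> List[str]:
    """Split data into blocks and prefix each with its address barcode."""
    fragments = []
    for n, start in enumerate(range(0, len(data), BLOCK_BYTES)):
        barcode = bytes_to_bases(n.to_bytes(2, "big"))
        payload = bytes_to_bases(data[start:start + BLOCK_BYTES])
        fragments.append(barcode + payload)
    return fragments


def decode(reads: List[str]) -> bytes:
    """Group reads by barcode, majority-vote each base, then reassemble."""
    by_address: Dict[int, List[str]] = {}
    for read in reads:
        address = int.from_bytes(bases_to_bytes(read[:ADDRESS_BASES]), "big")
        by_address.setdefault(address, []).append(read[ADDRESS_BASES:])
    blocks = []
    for address in sorted(by_address):
        copies = by_address[address]
        consensus = "".join(Counter(column).most_common(1)[0][0]
                            for column in zip(*copies))
        blocks.append(bases_to_bytes(consensus))
    return b"".join(blocks)


message = b"Hello, DNA!"
reads = encode(message) * 3                  # three copies of every fragment
first = reads[0]
wrong = "A" if first[ADDRESS_BASES] != "A" else "C"
reads[0] = first[:ADDRESS_BASES] + wrong + first[ADDRESS_BASES + 1:]  # one glitch
print(decode(reads) == message)
```

With three copies of each fragment a single bad base is simply outvoted by the other two, which is the same principle as the thousands of copies in the real system, just on a much smaller scale.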
By using these methods they managed to encode a complete book, just under 6Mb in size, onto a single strand of DNA. Now, obviously this comes at a price beyond the reach of customers for now, but at the rate the data storage market moves who knows how we will upgrade our storage capacity in the future; it is estimated that a double DNA strand could encode 10 Exabytes of data, or 11,529,215,046,100MB. That’s quite a lot of floppy disks.
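For anyone wanting to check the arithmetic, the sketch below reproduces the conversion. It assumes a binary exabyte (2^60 bytes) and a decimal megabyte (10^6 bytes), which is the combination that matches the figure quoted once rounded to the nearest hundred, and it takes a "1.44MB" floppy to hold 1,474,560 bytes (1,440 x 1,024).

```python
EXABYTE_BINARY = 2 ** 60      # bytes in one exabyte, binary convention
MEGABYTE_DECIMAL = 10 ** 6    # bytes in one megabyte, decimal convention
FLOPPY_BYTES = 1_474_560      # formatted capacity of a "1.44MB" floppy disk

capacity_bytes = 10 * EXABYTE_BINARY
megabytes = capacity_bytes / MEGABYTE_DECIMAL
floppies = capacity_bytes // FLOPPY_BYTES

print(f"{megabytes:,.0f} MB")
print(f"{floppies:,} floppy disks")
```

However you choose to count the bytes, the number of floppies runs into the trillions.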
So, now when you hear us data guys talking about “Big Data” and not being scared by the volume element, maybe you’ll understand why.
In a few years’ time when you need to add an Exabyte or two to your data capacity, don’t worry – I’ve an armful right here.