As individuals we create increasing amounts of personal data. This data can be hugely valuable to businesses, allowing them to turn your raw data into valuable business information. Businesses use the information you provide to target both you and people from similar backgrounds with whatever product they happen to be marketing.
The interesting question is who actually owns the data we provide. Who is responsible for the data we supply? In general, people naturally assume that businesses own this data and will protect it and use it responsibly. However, as we’ve seen recently, this is not always the case.
Recent data breaches, with Experian in the USA being one example, have shown that our personal information is not always as safe as we would like to hope. We get no visibility of how our data is used or protected, or of what is done with it after we willingly supply it. With the number of data breaches constantly increasing, our data becomes more vulnerable. Remember, data is more valuable than oil.
There are many examples of data misuse, ranging from nuisance phone calls and spam mails to unsolicited post. However, this may all be about to change. Under the forthcoming GDPR, businesses will become simply custodians of my data.
It’s important for organisations to realise that IT departments do not own the data; they simply provide the infrastructure to allow access to data through a series of applications. The business is responsible for the data held, and to continue to get value from it, it will have to treat it differently going forward.
Businesses will need to become more transparent in their dealings with external customers, showing what data is held and why it remains held, and showing that agreement to hold it was given, either verbally or through the dreaded tick box.
Inevitably this will lead to a change in business processes, which is why at Computacenter we have seen a rise in demand for data masking and anonymisation. These techniques allow organisations to translate the data they hold into valuable information without the risk of items being personally identifiable.
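As a rough illustration of what masking can look like in practice, here is a minimal Python sketch. The field names, the sample record layout and the `SECRET_KEY` value are all hypothetical, and a keyed hash is strictly pseudonymisation rather than full anonymisation, so treat this as a starting point and not a compliance recipe.

```python
import hashlib
import hmac

# Hypothetical key for illustration; a real one would live in a secrets store.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymise(value: str, key: bytes = SECRET_KEY) -> str:
    """Swap an identifier for a keyed hash so records can still be joined."""
    normalised = value.strip().lower().encode("utf-8")
    return hmac.new(key, normalised, hashlib.sha256).hexdigest()[:16]


def mask_email(email: str) -> str:
    """Keep the domain for analysis but hide the mailbox name."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}" if domain else "***"


def mask_record(record: dict) -> dict:
    """Return a copy of the record that is safer to hand to analysts."""
    return {
        "customer_ref": pseudonymise(record["email"]),
        "email": mask_email(record["email"]),
        "postcode_area": record["postcode"].split()[0],  # outward code only
        "spend": record["spend"],
    }


if __name__ == "__main__":
    sample = {"email": "jane.doe@example.com", "postcode": "AL10 9TW", "spend": 120}
    print(mask_record(sample))
```

The spend figure survives untouched, so the business still gets its information, while the fields that point at a person are reduced or replaced.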
Possibly the most important thing for businesses to do over the coming months is to start to understand what data they have, what is valuable to them and can be translated into information, what new or existing sources of data they have, and how they treat that data to ensure regulatory compliance.
My data belongs to me now. I may let organisations use it in return for a service I deem of value, but ultimately it is personal and belongs to me.
There’s a new sheriff in town…
The smart office has become more common in workplaces across the country. The digital workplace has evolved to make our workplaces more efficient and adaptable to the changing needs of users.
By incorporating smart devices such as motion sensors, thermostats, smart switches and cameras, organisations can reduce energy consumption, improve staff morale and improve productivity. Since commercial buildings account for around 40% of global energy consumption, embedding sensors in walls and ceilings can have a significant impact by using resources such as lighting, heating or cooling only when staff are present.
These sensors can be connected to the company network and, using visualisation techniques, can provide a view of working patterns. In turn, this can lead to energy savings of between 20% and 40%. Whilst the cost of creating the smart office is not insignificant, the potential benefits for businesses can be realised in relatively short periods.
The number of these IoT devices continues to grow exponentially, helping to create efficiencies in floor space usage and space planning. These devices can improve the experience for workers and allow the creation of personalised workspaces where individual lighting and cooling can be controlled either by an app or by your smart desk.
However, this does not come without its privacy challenges. If your smart desk recognises you through RFID tagging as you approach and creates your personalised settings, you are immediately engaged and hopefully more efficient.
The challenge comes with how much your desk then knows about you. Heat and motion sensors, RFID tags and proximity sensors mean that workers are potentially under constant surveillance. Sensors can track when people are at desks, moving around, present or not present. Whether individual workers are happy with this level of surveillance remains to be seen.
Concerns are starting to be raised around what data may be collected by sensors. We come back to the privacy paradox around what people are willing to sacrifice in terms of their privacy for convenience. Most data will be collected anonymously, but that does not preclude future use for other purposes. There is a fine line between efficiency and surveillance, as some organisations have found out to their cost.
We may be entering the age of the Smart Building, but it may find itself in competition with the smart human. Will the last person to leave switch off the lights? No need for that; the building will do that itself. It may not be Big Brother that is watching you; it may be Big Building.
I unlock my phone by looking at it in a meaningful way; it trusts me and unlocks. Despite the rumours I’ve yet to be able to unlock with a photo.
Both my face and yours have 83 data points that technology can recognise to ensure we are actually who we say we are. So if I can unlock my phone, what else can I do with my face? Over the past few years computers have become increasingly good at recognising faces by using these data points and by measuring the distance between them.
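To make the idea of measuring the distance between data points a little more concrete, here is a toy Python sketch. The four-point "faces", the `tolerance` value and the function names are all made up for illustration; it shows the principle only and is not how any production recognition system works.

```python
import math
from itertools import combinations


def signature(points):
    """Turn landmark coordinates into a scale-independent list of distances."""
    dists = [math.dist(a, b) for a, b in combinations(points, 2)]
    scale = max(dists)
    if scale == 0:
        raise ValueError("landmarks must not all be the same point")
    # Dividing by the largest distance means a face photographed closer
    # or further away still produces the same signature.
    return [d / scale for d in dists]


def difference(sig_a, sig_b):
    """Average absolute gap between two signatures (0 means identical)."""
    return sum(abs(a - b) for a, b in zip(sig_a, sig_b)) / len(sig_a)


def is_match(points_a, points_b, tolerance=0.05):
    """Hypothetical threshold check: small difference means same face."""
    return difference(signature(points_a), signature(points_b)) <= tolerance


if __name__ == "__main__":
    enrolled = [(0, 0), (4, 0), (2, 3), (2, 5)]
    same_face_closer = [(x * 2, y * 2) for x, y in enrolled]
    someone_else = [(0, 0), (5, 0), (2, 2), (3, 6)]
    print(is_match(enrolled, same_face_closer))
    print(is_match(enrolled, someone_else))
```

With 83 points instead of four there are far more distances to compare, which is why a bad hair day leaves plenty of measurements intact.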
We’re seeing solutions come to market to provide enhanced convenience to users, and also to provide surveillance capabilities to authorities. We’re already seeing developments in China around extensive use of facial recognition; walk up to the barrier at a train station and the gate opens for you, assuming your face resembles your national identity card. No worries if you’re feeling rough or having a bad hair day, there are sufficient data points to allow you through the barrier.
This negates the need for the widespread use of contactless cards that we currently see in the UK. It also has an impact on our banking regimes. As the technology advances we may see reduced demand for passwords and PINs, as we may simply look at the ATM and ask for cash; ‘Alexa, can I have £60 please?’
It’s already possible to transfer money using an app with your face as authorisation. Again in China, 120 million people have access to a mobile payment app using their face as credentials. It’s possible to both transfer money and get a loan simply by using your face as identification.
Ticket touts are the current scourge of getting into concerts (something close to my heart), but if your ticket is matched to your face then there is no unauthorised secondary ticket market. Getting access to sporting events could be made easier for the fan, whilst saving costs for the club.
In addition, whilst surveillance is still considered a delicate subject, tracking of movement through a venue allows for efficiencies in access areas and the targeting of relevant services to individuals. It could also allow tracking of movement through public transport systems for improved customer experiences.
We’ve heard a lot about body-worn police cameras recently. Ultimately these could be linked to central resources for the identification of known criminals, making our streets a safer place.
Cars could be enabled to recognise an authorised driver, meaning no stolen cars and no lost keys. The list goes on.
Obviously this relies on a few things. One of the reasons that China is a large market for this is its large national database for identification purposes. Whilst some in the Western world may not be comfortable with this, there is a decision to be made as to whether the benefits outweigh the use of your personal data. The Privacy Paradox applies.
It would also rely on suitably responsive infrastructure to support the use cases, but as the technology evolves you’ll soon be able to use public transport and buy goods using just your face. When you walk into Starbucks they will no longer need to ask your name; you’ll be recognised as you walk in, and this time the cup will have your correct name on it.
Now where is that false beard?
It feels slightly strange sitting in Scotland in December when it’s 12 degrees outside; we’re much more used to snow and a white Christmas. No doubt global warming has affected our weather systems here.
However, in the world of data it’s becoming a polar opposite (see what I did there?). Data continues to grow and get colder by the day. We’ve become a society of data hoarders; we continue to store everything, never accessing it but keeping it for that ‘just in case’ moment. This has led to the rise of ROT data: Redundant, Obsolete or Trivial content that is never accessed but continues to consume valuable resources.
A recent Veritas survey shows that only 14% of our data is accessed regularly, with a further 32% being classified as ROT data. The worrying statistic is the one I’ve not yet quoted; this means that 54% of our data is simply unknown and, like the majority of an iceberg, sits unseen, below our line of sight.
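For anyone who wants to apply those percentages to their own estate, the arithmetic is simple enough to sketch in a few lines of Python. The 500 TB figure is purely a hypothetical example; only the 14% and 32% come from the survey quoted above.

```python
def databerg_split(total_tb, accessed_pct=14, rot_pct=32):
    """Split a storage estate into accessed, ROT and dark (unknown) data."""
    dark_pct = 100 - accessed_pct - rot_pct  # whatever is left is unknown
    return {
        "dark_pct": dark_pct,
        "accessed_tb": total_tb * accessed_pct / 100,
        "rot_tb": total_tb * rot_pct / 100,
        "dark_tb": total_tb * dark_pct / 100,
    }


if __name__ == "__main__":
    # Hypothetical 500 TB estate
    for name, value in databerg_split(500).items():
        print(f"{name}: {value}")
```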
This dark data may have business value, or may be valueless, but the crucial point is that it remains unknown. More worryingly, this dark data may contain personal customer information, non-compliant data or other high-risk corporate data, with the potential for critical risks at the core of a business.
Recent legislation changes mean that Data Governance has to become more critical to business operations. The location of data, the content of repositories and the ability to search and discover relevant data on demand are placing new and unique challenges on IT operations, challenges that they have never previously faced.
Illuminating dark data is not easy. It requires elimination of ROT, it requires understanding of corporate data and what data may have business value, and it requires further understanding of legislation relative to the customer environment. Finally, the ability to find that needle requires the use of tools and the knowledge to understand what you are looking for.
Having the ability to search across all data sets and to apply filters to those searches is not an easy task, but it is one that you will face at some point. Identifying the process and the tools is a mission that needs addressing now; by the time you are asked for the data it may be too late to avoid significant costs and the potential for large fines if data cannot be produced in a timely manner.
The Data Iceberg is not melting, but at least we can understand the 54% not immediately visible to us. Our data hoarding exacerbates the problem; time to shine a light in the darkness.
Now, where are my sunglasses?
*Information has been sourced from the recent Veritas publication, The Databerg Report: See What Others Don’t
I’m allowed some Star Wars geekery occasionally!
With the imminent launch of the latest Star Wars movie I turned to thinking about the generation of images used in movies. We think less and less about the computer generated images we see in movies and simply accept them as part of the action, even though the Wow factor is still there.
We know that those buildings are not really destroyed; the Golden Gate Bridge has not really been devastated 20 times in movies recently, so we know it’s Computer Generated Imagery (CGI), but have we ever thought about the technology required to create these sequences?
Most important in this process is the role of the storage environment; it’s imperative to be able to process images quickly and to be able to render images in a timeframe to minimise cost and production time.
This is one of the places that Flash-based storage arrays really shine; the ability to deliver output in a rapid fashion means that my Star Wars user experience happens in 2015, and not in several more years’ time.
Remember, the original Disney cartoons took several years to make but now several can be produced every year; Flash storage solutions are one of the key factors behind this.
Now, performance isn’t always everything, but in the film industry it can be.
Whilst I genuinely have no preference for technology vendors, occasionally there are some things you just have to highlight. One of these has been our recent testing of the HP StoreServ 20850 storage array. Having recently achieved world record results in the SPC-2 tests, the 20850 became an obvious candidate for Computacenter to evaluate whether the claims could be substantiated in a real world scenario.
The performance of this array has been blindingly fast, and is one of the few which actually matches the vendor’s claims in terms of performance. Having tested several vendors’ solutions, the HP 20850 has stood with the best of them in terms of both price and performance. Combining this with improved manageability makes the HP 20850 a compelling solution for customers across a wide range of applications, and supports customers in their move to the silicon datacentre.
The HP StoreServ represents a return to form for one of the major players in the storage industry, and is available for Customer Demonstration with a variety of either simulated workloads, or customer-specific tests utilising actual data, in the Computacenter Solution Centre based in Hatfield.
To (almost) quote Darth Vader; ‘HP StoreServ 20850 - The Force is Strong with This One’
For those that don’t know, my background is in Mathematics & Physics which, as a wise man once pointed out to me, is why I have OCD tendencies around numbers.
I like precision, I don’t like estimates or guesstimates, and I’m not a big fan of vendor spreadsheets that show how their technology will reduce your Capex or Opex and provide virtually immediate ROI, because we all know there are so many variables that they cannot possibly be particularly accurate.
If I followed these models ultimately I could go in ever-decreasing circles where I have ultimate performance, at little cost, with no footprint and it pays for itself before I’ve bought it. Hooray for that!
Back in my precise world it’s important that we know what is realistically achievable, and more importantly what is achievable in specific environments with specific applications. One thing we have learned is that whilst all storage technology may look similar from the outside, it doesn’t always perform in a similar manner. One thing I’m asked repeatedly is how to decide between vendor technologies and what is the optimal solution for customers.
The answer is not simple. There are many variables that can affect the performance of any storage environment, which is why for specific workloads and criteria one solution will work better than others. When sizing storage solutions we need to look at a multitude of variables:
- Performance requirements in terms of IOPS, Latency & Bandwidth
- Read / Write ratios
- Application usage
- Block size in use
- Typical file sizes
- Whether compression is applicable, and how well data may compress
- Deduplication and how well data can be deduplicated
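To show how a few of these variables relate to one another, here is a minimal Python sketch. The function name, the parameter names and the example numbers are all hypothetical; the only relationships it relies on are that bandwidth is IOPS multiplied by block size, and that effective capacity is raw capacity multiplied by whatever data reduction ratio you can actually achieve.

```python
def workload_profile(iops, block_kib, read_pct, raw_tib, reduction_ratio=1.0):
    """Derive headline sizing figures from a measured workload.

    iops            -- total I/O operations per second
    block_kib       -- typical block size in KiB
    read_pct        -- percentage of operations that are reads
    raw_tib         -- usable capacity before data reduction, in TiB
    reduction_ratio -- assumed combined compression and deduplication ratio
    """
    read_iops = iops * read_pct / 100
    return {
        "bandwidth_mib_s": iops * block_kib / 1024,  # KiB/s converted to MiB/s
        "read_iops": read_iops,
        "write_iops": iops - read_iops,
        "effective_tib": raw_tib * reduction_ratio,
    }


if __name__ == "__main__":
    # Hypothetical workload: 50,000 IOPS at 8 KiB, 70% reads, 100 TiB, 2.5:1 reduction
    for name, value in workload_profile(50_000, 8, 70, 100, 2.5).items():
        print(f"{name}: {value}")
```

Change the block size from 8 KiB to 64 KiB and the same IOPS figure needs eight times the bandwidth, which is exactly why one array can look great on one workload and poor on another.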
Now here comes the challenge: 64% of IT organisations don’t know their application storage I/O profiles and performance requirements, so they guess. The application owner may know the performance and capacity requirements closely, but adds extra to accommodate growth and ‘just to be safe’. The IT department takes the requirements and adds some more for growth and ‘just to be safe’, because ultimately we cannot have a new storage subsystem which does not deliver the required performance.
This means performance planning can be guesswork, with substantial under-provisioning or, more likely, over-provisioning, and the unseen costs of troubleshooting and administration adding more overhead than should be necessary.
The ultimate result of this can be a solution which meets all the performance requirements but is inefficient in terms of cost and utilisation.
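The way those ‘just to be safe’ margins stack up is worth a quick sketch. The 30% margins and the 10,000 IOPS starting point below are hypothetical, chosen only to show that padding added at each step multiplies rather than adds.

```python
def padded_requirement(measured, margins):
    """Apply each safety margin in turn to a measured requirement."""
    total = measured
    for margin in margins:
        total *= 1 + margin  # each team pads the figure it was handed
    return total


def overprovision_pct(measured, margins):
    """How far above the measured figure the final request ends up, in percent."""
    return (padded_requirement(measured, margins) / measured - 1) * 100


if __name__ == "__main__":
    # Hypothetical: application owner adds 30%, then IT adds another 30%
    print(round(padded_requirement(10_000, [0.3, 0.3])))
    print(round(overprovision_pct(10_000, [0.3, 0.3])))
```

Two modest-looking margins leave you buying roughly 69% more than was measured, before anyone has checked whether the measurement was right in the first place.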
This is where Computacenter comes in. Working closely with our latest partner, LoadDynamix, we can:
- ACQUIRE customer specific workloads and understand exactly the requirements
- MODEL workloads to understand the scale of solution required and ramp up workloads to find the tolerance of existing infrastructure
- GENERATE workloads against proposed storage platforms to ascertain optimal solution, and how many workloads can be supported on a platform
- ANALYSE the performance of proposed solutions with factual data, not vendor marketing figures
This approach brings an exact science to sizing the storage solution, and coupling it with Computacenter’s real world experience ensures my OCD tendencies can be fully satisfied.
The Computacenter / LoadDynamix Partnership announcement can be found here;
I like accuracy; working together with LoadDynamix we can achieve that not just for me, but more importantly for our customers and their users.
Coming Soon – Look out for the #BillAwards2015 announcing in December; want to know who wins these prestigious awards? Follow me on twitter @billmcgloin for all the answers
We’ve accepted that data has gravity and, like any large body, as it grows it draws an increasing number of applications, uses and more data towards it.
We’ve also accepted that applications have mass, growing in complexity through their evolution unless the painful decision to start again is taken.
Combine these factors with an increasing number of requests and an increasing request size, and suddenly you have a significant impact on access time, and the bandwidth available to move data takes a hit.
If these factors apply in day to day operations, then consider the impact when you have to move large quantities of data from one place to another, whether as part of a re-platforming operation, as a move to an archive, or possibly as a move to the fabled data lake. Then data gravity and application mass can combine to have a seriously detrimental effect on the movement of data.
Whilst any admin can script the movement of data between platforms and numerous ‘free’ tools exist, the ability to move data rapidly and effectively between similar or dissimilar platforms, minimising any outages, working around locked files and ensuring file permissions and configurations remain complete, becomes crucial for customers. Neither internal nor external customers accept data outages; we have to be always on.
In my career I have migrated PBs of data between storage arrays, and honestly it can be a long, dull and ultimately boring process, certainly not the sexy storage world we’ve come to know and love. Moving data was never something to sit and watch; it was always a case of kick it off and go for several cups of coffee (that may explain your caffeine addiction. ED).
Now, however, things are finally changing. Computacenter has recently partnered with Data Dynamics to move file data more efficiently and effectively than previously possible. Through the use of the Data Dynamics StorageX toolset, Computacenter can offer data movement with reporting that details what moved, where it moved and even what didn’t move (and why). It does this whilst reducing disruption, improving migration speed by 600% and reducing network load.
Combining these features with the ability to validate the configuration of the target system makes for a very compelling case and ultimately becomes significantly less expensive to a business than the ‘free’ tools available.
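To illustrate why knowing what moved, where it moved and what didn’t move matters, here is a small Python sketch of the sort of home-grown script an admin might write. It is emphatically not StorageX and makes no claims about how that product works; the function name and report format are hypothetical, and it uses only the standard library `shutil` and `pathlib` modules.

```python
import shutil
from pathlib import Path


def copy_with_report(source: Path, target: Path):
    """Copy every file under source to target and report what happened.

    Returns two lists: (source, destination) pairs that copied cleanly,
    and (source, reason) pairs for anything that did not.
    """
    moved, skipped = [], []
    for item in sorted(source.rglob("*")):
        if not item.is_file():
            continue
        dest = target / item.relative_to(source)
        try:
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(item, dest)  # copy2 also tries to keep timestamps and mode
            if dest.stat().st_size != item.stat().st_size:
                raise OSError("size mismatch after copy")
            moved.append((str(item), str(dest)))
        except OSError as exc:  # locked files, permission errors and so on
            skipped.append((str(item), str(exc)))
    return moved, skipped


if __name__ == "__main__":
    # Hypothetical paths; replace with real source and target directories
    done, failed = copy_with_report(Path("old_share"), Path("new_share"))
    print(f"{len(done)} files copied, {len(failed)} skipped")
    for path, reason in failed:
        print(f"  {path}: {reason}")
```

Even this tiny script shows where the ‘free’ approach starts to cost money: it runs on a single thread, it does nothing clever about locked files beyond noting them, and it knows nothing about share permissions or the configuration of the target system.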
Moving data is a weighty matter, but that doesn’t mean it should be stressful.