It feels slightly strange to be sitting in Scotland in December when it’s 12 degrees outside; we’re much more used to snow and a white Christmas. No doubt global warming has affected our weather systems here.
However, the world of data is the polar opposite (see what I did there?). Data continues to grow and gets colder by the day. We’ve become a society of data hoarders: we store everything, never accessing it but keeping it for that ‘just in case’ moment. This has led to the rise of ROT data: Redundant, Obsolete or Trivial content that is never accessed but continues to consume valuable resources.
A recent Veritas survey shows that only 14% of our data is accessed regularly, with a further 32% classified as ROT data. The worrying statistic is the one I’ve not yet quoted: the remaining 54% of our data is simply unknown, dark data that, like the bulk of an iceberg, sits unseen below the surface.
This dark data may have business value or may be worthless, but the crucial point is that it remains unknown. More worryingly, it may contain personal customer information, non-compliant data or other high-risk corporate data, creating critical risks at the core of a business.
Recent changes in legislation mean that data governance has to become more critical to business operations. Knowing where data is located and what repositories contain, and being able to search for and discover relevant data on demand, places new and unique challenges on IT operations, challenges they have never previously faced.
Illuminating dark data is not easy. It requires eliminating ROT, understanding corporate data and which of it may have business value, and understanding the legislation relevant to the customer environment. Finally, finding that needle in the haystack requires the right tools and the knowledge of what you are looking for.
Searching across all data sets, and applying filters to those searches, is not an easy task, but it is one that you will face at some point. Identifying the process and the tools is a mission that needs addressing now; by the time you are asked for the data, it may be too late to avoid significant costs and potentially large fines if it cannot be produced in a timely manner.
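To make the idea of searching with filters a little more concrete, here is a minimal sketch in Python, not a substitute for a proper data governance tool. It walks a folder tree and lists files that have not been accessed for a given number of days, optionally filtered by file extension. The function name `find_stale_files`, the one-year default and the reliance on the file system's last-access timestamp are all my own assumptions for illustration; many systems are configured not to update access times, so treat the output as a list of candidates to review rather than a verdict.

```python
import os
import time
from pathlib import Path


def find_stale_files(root, max_idle_days=365, suffixes=None, now=None):
    """Return (path, size_bytes, idle_days) for files not accessed recently.

    Hypothetical helper: the idle threshold and the use of st_atime are
    illustrative assumptions, not a recommended retention policy.
    """
    now = time.time() if now is None else now
    cutoff_seconds = max_idle_days * 86400
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            # Optional filter: only consider the given extensions, e.g. {".log"}.
            if suffixes and path.suffix.lower() not in suffixes:
                continue
            try:
                info = path.stat()
            except OSError:
                continue  # unreadable or vanished file: skip it
            idle_seconds = now - info.st_atime
            if idle_seconds >= cutoff_seconds:
                idle_days = int(idle_seconds // 86400)
                stale.append((str(path), info.st_size, idle_days))
    # Coldest data first.
    return sorted(stale, key=lambda row: row[2], reverse=True)
```

Calling `find_stale_files("/path/to/share", max_idle_days=730, suffixes={".log", ".tmp"})` would, under those assumptions, return the log and temporary files untouched for two years, coldest first. Nothing is deleted; deciding what is genuinely ROT still needs someone who understands the data.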
The Data Iceberg is not melting, but at least we can come to understand the 54% that is not immediately visible to us. Our data hoarding exacerbates the problem; it’s time to shine a light in the darkness.
Now, where are my sunglasses?
*Information has been sourced from the recent Veritas publication, The Databerg Report: See What Others Don’t.