Here are two quick questions for which IT managers should have a ready answer: Where’s your critical data and how quickly can you access it? Do you have the right infrastructure to manage it?
For a variety of reasons, many IT teams cannot answer these questions easily because they do not have command of their data landscape.
The primary problem is uncontrolled duplication. "Data" in fact is no longer a singular entity – it's many similar, but often distinct, copies of the same information. Today, the average company maintains nine or more copies of a single piece of information, each intended to serve one purpose: business resiliency, disaster recovery, DevOps, incident response, legal workflows, archiving, ediscovery, and so on.
Most companies have all these copies of data scattered across numerous data warehouses and data lakes, so getting to the right data at the right time can be a Herculean feat. When something goes wrong and a particular piece of information is needed, "How do I get the right data to solve this challenge?" becomes the top question – and today's complexity means there's no easy answer. More businesses are confronting this: according to a 2017 Forrester survey, more than 75 percent of businesses had been unable to surface the right data over the preceding two years. It is a two-headed problem, caused both by the number of data copies and by the increasingly intricate data infrastructure IT must navigate.
Such complexity crept into our data management over time, born of siloed operational channels, shifting requirements, changing operational needs, and architectural evolution.
A typical story might go something like this: the legal staff needed data but didn't know the "backup people" (and probably wouldn't have talked to them anyway), and legal couldn't trust that the right data for discovery would be obtained in a legally permissible way. Thus, it became inevitable that the legal team demanded – and now retains – its own backup copy. And so it goes across the organization.
Data copies are siloed not just by department or use case, but also by platform. Backup solutions span technologies, including traditional on-premises backup for physical servers and databases, hybrid solutions that run on-premises but push data into designated archives (cloud or local), hosted backup (from a managed services provider or MSP), backup-as-a-service, and, in some cases, backups from edge computing such as IoT systems. The Forrester survey found that 79 percent of organizations have at least three backup solutions, and more than 26 percent have five or more solutions.
Today, this landscape is further complicated by increasing dependence on SaaS solutions such as Salesforce, G Suite and Office 365, resulting in additional silos and making it more difficult to reliably protect data.
Digital transformation hinges on tearing down data silos and creating a more complete data lake from which new intelligence, more effective decision-making, and informed automation emerge. In this new world, protecting data has become a data management requirement that demands a holistic and unified approach. Keeping multiple copies of data for different purposes ultimately creates more problems than it solves.
This fragmented and siloed approach to data management isn’t sane or manageable, particularly given today’s rate of data growth. In fact, more than half (56 percent) of IT professionals cite “exponential data growth” as a challenge.
Regulatory and compliance requirements such as GDPR also fuel increased data volumes, with large enterprises citing these requirements as the No. 1 driver of data growth. Compounding the challenge, regulations like GDPR require organizations to have a better handle on their data than ever.
This problem was not created overnight, and it won't be solved quickly, but the shape of the solution is becoming clear. We must use the cloud to unify our view of all of the information spread across data silos in this landscape. Beyond the well-known business benefits of the cloud (cost, simplicity, instant scalability and agility), the cloud makes it easier to augment the enterprise's architecture to support different use cases, making it feasible for a single dataset to serve each unique requirement. Because the cloud is inherently collaborative, it enables a unified approach in which different departments address distinct use cases using the same underlying data.
As we continue to see, the need for disaster recovery planning is only increasing in the wake of natural disasters, and cloud-based operations and backup copies offer an immediate solution for data recovery needs. They eliminate the need for offsite locations and enable business continuity without specialized, dedicated resources or large investments.
But data management is more than backups. DevOps teams can take advantage of continuous cloud backups to satisfy their hunger for the latest copy of data, supporting A/B testing and rapid development. Backup data can also be analyzed for machine learning and AI applications. For example, a medical device company can learn from its archive of surgical data to improve the precision of its devices.
Incremental progress is the way forward. This refocus on data – singular again, but modernized to reflect the need for a proper data lake – won't be instituted enterprise-wide in a single push. Use cases such as disaster recovery, forensics and DevOps are already accustomed to working directly with data management, and therefore provide natural starting points for leading the transition to data sanity.