The revolving door of data breaches of late – Rackspace, Uber, LastPass, the list goes on – makes it clear that any business without a proactive data protection strategy is leaving itself, and its customers, at risk. With every data breach that passes, news cycles are consumed by the shiniest new cybersecurity measures that can be taken to thwart future threats. However, a good security posture necessitates data security as well as cybersecurity.
What is data classification?
Among strategies for protecting vulnerable information, data classification is an unsung hero of data security. Organizations that tier, organize, and label their data not only simplify it but also strengthen their compliance posture and their protection against ransomware.
Data security vs. cybersecurity: the data protection umbrella
In understanding data classification as a key strategy for data protection, it’s crucial to first grasp the difference between data security and cybersecurity. While these two terms fall under the data protection umbrella, they serve two different purposes, both of which are essential to protecting vulnerable data.
Cybersecurity is prevention-focused. Its strategy is concerned with the strength of the network, its traffic patterns, and access.
Data security is recovery-focused, assuming from the outset that the network will eventually be breached. This approach focuses on doing everything possible to limit data theft, encryption, or deletion when a breach occurs, and on making sure the affected data can be recovered.
Data security has become a daunting task, considering the sheer amount of data that is now housed online. Data classification offers a solution to this inundation of data.
Types of data classification
New trends in data protection mark a key shift from protecting all data en masse toward data classification, using tiering and group-based classification systems. Classification for backup and data protection involves an abstraction layer that manages instances, buckets, and their files, folders, and prefixes across the cloud.
Rather than protecting everything, which is expensive and time-consuming and makes recovery difficult, this layer identifies which folders actually hold critical information. In doing so, it provides a mechanism to classify data across cloud accounts so that critical data is protected in line with business requirements and industry regulations.
In particular, teams can classify data based on change rates, identifying the point to which the data needs to be recoverable, whether that’s the last version, a week back, or a month back. Classification also helps manage where data lives, since data that is organized is easier to locate. This can aid businesses greatly as they try to adhere to privacy laws and retention guidelines.
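As a rough illustration of classifying by change rate, the sketch below maps a storage prefix to a recovery point. The `Prefix` type, the tier names, and the 10% and 1% thresholds are all assumptions made for this example; they are not taken from any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prefix:
    path: str                 # hypothetical bucket prefix or folder
    critical: bool            # holds business-critical or regulated data
    daily_change_rate: float  # fraction of objects changed per day (0.0 to 1.0)


def classify(prefix: Prefix) -> str:
    """Return an illustrative recovery-point tier for one prefix."""
    if not prefix.critical:
        return "excluded"      # not worth backing up at all
    if prefix.daily_change_rate >= 0.10:   # assumed threshold
        return "last-version"  # fast-changing: recover to the latest version
    if prefix.daily_change_rate >= 0.01:   # assumed threshold
        return "one-week"      # moderate churn: a week-old copy is acceptable
    return "one-month"         # near-static: a month-old copy is acceptable


if __name__ == "__main__":
    for p in (Prefix("orders/", True, 0.25), Prefix("exports/tmp/", False, 0.90)):
        print(p.path, "->", classify(p))
```

In practice the thresholds would come from each team's own recovery objectives rather than fixed constants.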
How data classification supports data protection
To fully grasp how data classification can enhance data security, it's helpful to see how it works in practice.
Data classification in the healthcare industry
Healthcare data lakes can provide key insight here. These data lakes are full of sensitive information and are therefore governed by strict rules dictated by HIPAA. For instance, while organizations may actively use data for only about 30 days, health information must be backed up for six years. To solve this, instead of backing up every object for six years, organizations can classify data by access patterns and use object tags, tiers, and other metadata to maintain compliance and protection against ransomware while reducing costs by over 90%.
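One way to picture tagging by access pattern is the sketch below, which derives a set of object tags from when an object was last read and whether it holds health information. The 30-day and six-year figures come from the paragraph above; the function name, tag keys, and tier labels are hypothetical.

```python
from datetime import date, timedelta

ACTIVE_WINDOW = timedelta(days=30)  # period of active use cited above
RETENTION_YEARS = 6                 # backup period cited above


def tags_for(last_accessed: date, contains_phi: bool, today: date) -> dict:
    """Build illustrative object tags from access pattern and sensitivity."""
    # Recently read objects stay on a fast tier; the rest can move to archive.
    tier = "hot" if today - last_accessed <= ACTIVE_WINDOW else "archive"
    tags = {"tier": tier}
    # Only objects holding health information carry the long retention tag.
    if contains_phi:
        tags["retention-years"] = str(RETENTION_YEARS)
    return tags
```

A backup policy could then select objects by tag instead of copying every object for the full six years.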
Data classification in EdTech
This approach also pays dividends for those in the EdTech industry, as EdTech must adhere to strict regulations to protect the data of students. In particular, COPPA, the Children’s Online Privacy Protection Act, imposes requirements on operators of websites or online services that collect personal information online from children under 13 years of age. These requirements compel operators of websites and online services for students to retain personal information only for as long as it’s needed to fulfill the purpose for which it was collected.
Once that purpose has been fulfilled, the data must be deleted securely, so that no one can access or use it. Data classification is essential to identifying how and where data is stored and backed up. With this practice in place, EdTech providers can easily find and delete sensitive data, staying compliant with regulations and maintaining a strong security posture.
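A minimal sketch of how classification makes that clean-up tractable: if every record carries a classification label and the date its collection purpose ended, finding what is due for deletion becomes a simple filter. The `Record` fields and the `child-pii` label are assumptions for illustration only.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Record:
    key: str                        # hypothetical object or row identifier
    classification: str             # label assigned during classification
    purpose_ended: Optional[date]   # None while the data is still needed


def due_for_deletion(records: List[Record], today: date) -> List[str]:
    """Return keys of children's personal data whose purpose has ended."""
    return [
        r.key
        for r in records
        if r.classification == "child-pii"
        and r.purpose_ended is not None
        and r.purpose_ended <= today
    ]
```

The same keys would also need to be purged from backups, which is why the classification has to cover backup copies as well as primary storage.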
Data classification in financial services
Financial services is another industry that deals with large amounts of sensitive data, so it’s unsurprising that it also faces increasingly strict compliance regulations. One of the best-known standards is the Payment Card Industry Data Security Standard (PCI DSS), which applies to companies that store, process, or transmit credit card payment information.
Together with rules enforced by regulators such as the SEC and FINRA in the US, these standards impose a range of requirements on financial services companies, including the need to implement strong access controls and establish robust data protection and retention policies.
To stay compliant, companies must ensure that data is backed up securely for periods of up to seven years. Protecting everything in a financial data lake for seven years is impractical, and any attempt to do so would be a heavy drain on budgets. Implementing a data classification system allows organizations in financial services to adhere to these backup standards without runaway costs.
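The cost argument can be made concrete with a back-of-the-envelope calculation of stored volume over time, measured in terabyte-years. The dataset names, sizes, and retention periods below are invented purely for illustration and do not describe any real environment.

```python
def tb_years(datasets, blanket_years=None):
    """Sum size (TB) times retention (years) across datasets.

    datasets maps a name to (size_tb, retention_years). If blanket_years
    is given, every dataset is retained for that long regardless of class.
    """
    total = 0.0
    for size_tb, retention_years in datasets.values():
        years = blanket_years if blanket_years is not None else retention_years
        total += size_tb * years
    return total


if __name__ == "__main__":
    # Hypothetical data lake: only card transactions need seven years.
    lake = {
        "card-transactions": (40.0, 7),
        "app-logs": (60.0, 1),
        "clickstream": (400.0, 0),
    }
    print("protect everything:", tb_years(lake, blanket_years=7))
    print("classified:", tb_years(lake))
```

The gap between the two totals depends entirely on how much of the lake actually falls under the long retention requirement.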
Considering the increasing velocity of data and the breadth of regulations, businesses must shift with the times and implement data classification to maintain regulatory compliance and ensure proper data security. Data classification allows businesses to preserve data in far greater amounts, with far less expenditure and effort, making it a boon to companies that hope to stay agile and proactive in their data protection.
Woon Jung is the Chief Technology Officer and co-founder at Clumio.