Follow these tips for securing corporate data when using public cloud services.
While cloud security remains a top concern in the enterprise, public clouds are likely to be more secure than your private computing setup. This might seem counter-intuitive, but cloud service providers have economies of scale that allow them to spend far more on security tools than any large enterprise can, while the cost of that security is spread across millions of users, diluting it to fractions of a cent each.
That doesn't mean enterprises can hand over all responsibility for data security to their cloud provider. There are still many basic security steps companies need to take, starting with authentication. While this applies to all users, it's particularly critical for sysadmins: a password compromised on an admin's phone could be the equivalent of handing over the corporate master keys. For admins, multi-factor authentication is critical to secure operations. Smartphone-based biometrics are the latest wave in second and third authentication factors, and there are plenty of creative strategies to choose from.
Beyond guarding access to cloud data, what about securing the data itself? We’ve heard of major data exposures occurring when a set of instances is deleted but the corresponding data isn’t. After a while, these files get loose and can make for some interesting reading. This is pure carelessness on the part of the data owner.
There are two answers to this issue. For larger cloud setups, I recommend a cloud data manager that tracks all data and spots orphan files. That should stop the wandering buckets, but what about the case when a hacker gets in, by whatever means, and can reach useful, current data? The answer, simply, is good encryption.
Encryption is a bit more involved than running PKZIP on a directory. AES-256 encryption or better is essential. Key management is crucial: having one admin hold the only key is a disaster waiting to happen, while writing keys down on sticky notes goes to the opposite extreme. One option offered by cloud providers is drive-based encryption, but this fails on two counts. First, drive-based encryption usually offers only a few keys to select from and, guess what, hackers can readily find the list on the internet. Second, the data has to be decrypted by the network storage device to which the drive is attached, then re-encrypted (or not) as it’s sent to the requesting server. There are lots of security holes in that process.
End-to-end encryption is far better: data is encrypted at the source with a key kept in the server, so downstream vulnerabilities stop being an issue and packet sniffing yields nothing useful.
Data sprawl is easy to create with clouds, and it opens up another security risk, especially if much of cloud management is decentralized to departmental computing or even to individual users. Cloud data management tools address this far better than written policies. It’s also worth considering adding global deduplication to the storage management mix; this reduces the exposure footprint considerably.
Finally, the whole question of how to back up data is in flux today. Traditional backup and disaster recovery have moved from in-house tape and disk methods to the cloud as the preferred storage medium. The question now is whether a formal backup process is the proper strategy, as opposed to snapshot or continuous backup systems. The snapshot approach is growing due to the value of small recovery windows and limited data-loss exposure, but there may be risks in not having separate backup copies, perhaps stored in different clouds.
On the next pages, I take a closer look at ways companies can protect their data when using the public cloud.
Companies should enforce strong passwords for users accessing cloud data, and many will want to go beyond passwords to multi-factor authentication. The second factor may be a fingerprint, voiceprint or retina scan; an alternative is a couple of “challenge questions” that only the user should know how to answer.
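As a concrete illustration of a second factor, here is a minimal sketch of the time-based one-time password scheme (TOTP, RFC 6238) that authenticator apps implement; the secret shown is the RFC's published test value, not a real credential.

```python
import base64
import hashlib
import hmac
import struct
import time

def totp(secret_b32, at=None, digits=6, step=30):
    """RFC 6238 time-based one-time password (HOTP over a time counter)."""
    key = base64.b32decode(secret_b32, casefold=True)
    counter = int((time.time() if at is None else at) // step)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation per RFC 4226
    code = (int.from_bytes(digest[offset:offset + 4], "big") & 0x7FFFFFFF) % 10 ** digits
    return str(code).zfill(digits)

# RFC 6238 test vector: ASCII secret "12345678901234567890", T = 59 s
print(totp("GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ", at=59, digits=8))  # → 94287082
```

The code changes every 30 seconds, so a stolen password alone is useless without the shared secret on the user's device.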
Even with strong authentication, however, there's still the insider risk, often a disgruntled admin or coder. Sophisticated shops can apply data analytics to spot strange access patterns, such as downloading key files or prying into areas unrelated to work assignments. Personally, I’d lock down the USB ports on servers, but that doesn’t stop a mobile or browser-based action. The only answer for securing admin and coder access is to zone access rigorously: keep it on a need-to-know basis.
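At its simplest, that kind of pattern-spotting means baselining each user's normal activity and flagging large deviations. A hedged sketch, with illustrative numbers and a made-up threshold (real tools model many more signals than download counts):

```python
import statistics

def flag_unusual(history, today, threshold=3.0):
    """Flag a user whose download count today sits far outside
    their own baseline. `history` is past daily download counts."""
    mean = statistics.fmean(history)
    stdev = statistics.pstdev(history) or 1.0  # avoid div-by-zero on flat baselines
    return (today - mean) / stdev > threshold

# A user who normally pulls ~20 files a day suddenly downloads 500:
baseline = [18, 22, 19, 21, 20, 17, 23]
print(flag_unusual(baseline, 500))  # suspicious
print(flag_unusual(baseline, 25))   # within normal variation
```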
VPNs in the cloud are “free” to set up and manage, so use them to help protect data. Limit users to their own areas and lock them off from visibility into the rest. This is especially important for protection against command-and-control botnets, since these allow attackers to wreak a lot of damage with little effort. Layer access as much as possible, so that clear paths to a given data set simply don’t exist. Traffic monitoring is useful here, since unusual downloads or uploads can flag suspicious activity.
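The layering principle boils down to default-deny: a role can reach only the zones explicitly granted to it, and everything else simply doesn't resolve. A minimal sketch (the zone names and bucket paths are hypothetical):

```python
# Hypothetical zone map: each role sees only its own areas; anything
# not explicitly listed is denied, so no clear path to other data exists.
ZONES = {
    "analytics": {"s3://corp-metrics", "s3://corp-logs"},
    "payroll":   {"s3://corp-hr"},
}

def can_access(role, bucket):
    """Default-deny: unknown roles and unlisted buckets are refused."""
    return bucket in ZONES.get(role, set())

print(can_access("analytics", "s3://corp-logs"))  # granted
print(can_access("analytics", "s3://corp-hr"))    # denied: wrong zone
print(can_access("intern", "s3://corp-hr"))       # denied: unknown role
```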
Data management tools
We regularly hear of buckets of data objects left exposed. These orphans are due to plain carelessness on the part of some admin, but nobody is perfect, and complexity is growing fast. Worse, they're often in plaintext, completely readable, and sometimes the contents end up spread all over the web. That gets IT people fired!
The solution is to deploy data management software tools that search out orphans in your clouds, position data properly and monitor usage. This is a hot IT area with a growing number of good solutions.
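At its core, an orphan scan is an inventory diff: anything in storage that no live instance references is a candidate for review. A minimal sketch, with hypothetical instance and object names:

```python
# Hypothetical inventory: live instances and the objects each one references.
instances = {
    "web-01": {"bucket/app.conf", "bucket/tls.key"},
    "db-01":  {"bucket/dump.sql"},
}

# Everything actually sitting in storage, including leftovers
# from an instance that was deleted long ago.
stored = {"bucket/app.conf", "bucket/tls.key",
          "bucket/dump.sql", "bucket/old-dump.sql"}

referenced = set().union(*instances.values())
orphans = stored - referenced  # objects nothing points at any more
print(orphans)  # flags bucket/old-dump.sql for review
```

Real tools add scheduling, ownership lookups and usage monitoring on top, but the set difference is the heart of it.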
Separately, within virtual machines, temporary local instance drives can also create a data exposure. Do you know what happens to your data if an instance crashes, for example? You can’t reach it, but the dead server may still hold a removable drive with your files on it (encryption keys, for example!). If the crash is temporary, how does the cloud service provider keep your data from being readable? SSDs are a particular issue here: their spare block pools are large and erased only in the background, and it takes savvy on the part of the CSP to prevent a new tenant from reading the spare space.
Data sprawl is a likely consequence of cheap rented cloud storage. Why delete when it only costs a few cents? Here is where deduplication of objects can help. This is not compression, which looks for data repetitions inside objects to find possible size reductions. Deduplication ends up with a properly protected single copy of any given object; all the other uses of or references to that object are just pointers in a metadata file.
Deduplication does two things: It saves space on the drives and simplifies management. The latter is in fact more important, especially as we add analytics and indexing tools to storage systems. Any admin who has spent an hour searching directories of files with the same names understands this value proposition.
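A sketch of the single-copy-plus-pointers model described above, using a content hash as the object's address (the class and names are illustrative, not any vendor's design):

```python
import hashlib

class DedupStore:
    """Content-addressed store: one physical copy per unique object,
    with name-to-hash pointers playing the role of the metadata file."""
    def __init__(self):
        self.blocks = {}   # sha256 hex digest -> bytes (the single protected copy)
        self.catalog = {}  # object name -> sha256 hex digest (just a pointer)

    def put(self, name, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(digest, data)  # store the bytes only once
        self.catalog[name] = digest

    def get(self, name):
        return self.blocks[self.catalog[name]]

store = DedupStore()
store.put("reports/q1.csv", b"revenue,cost\n100,80\n")
store.put("backup/q1-copy.csv", b"revenue,cost\n100,80\n")  # identical bytes
print(len(store.catalog), len(store.blocks))  # two names, one stored copy
```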
Copy management helps keep data adequately protected, too. It’s much easier to guard a single copy than dozens with the same IDs.
Do you encrypt your files? According to at least one study, data encryption remains spotty, and that’s the root of all those disastrous data leaks. Not encrypting data at rest in the public cloud is a dereliction of duty! A few do's and don’ts:
- Use AES-256 or better encryption
- Encrypt file or object names or at least have them in an encrypted metadata file
- Do not use one key for all objects
- Limit the list of admins who know the keys
- Encrypt at source, so transmission packet sniffing hacks will fail
- Do NOT use drive-based encryption, since the (short) key list is available on the web for most drives. Look carefully at cloud service provider encryption options.
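Python's standard library has no AES, so the bulk encryption itself belongs to a vetted crypto library; but the "no single key for all objects" rule can be sketched with HKDF-style key derivation (RFC 5869, single expand block shown). The salt label and object names here are assumptions for illustration:

```python
import hashlib
import hmac
import secrets

def derive_object_key(master_key, object_id):
    """HKDF-SHA256 (RFC 5869): derive a unique 32-byte key per object
    from one master key. No two objects share a key, and the master
    itself never touches the stored data directly."""
    prk = hmac.new(b"storage-v1", master_key, hashlib.sha256).digest()  # extract
    # Expand, first block: T(1) = HMAC(PRK, info || 0x01)
    return hmac.new(prk, object_id.encode() + b"\x01", hashlib.sha256).digest()

master = secrets.token_bytes(32)  # held by a short list of admins or a KMS
k1 = derive_object_key(master, "bucket/report.pdf")
k2 = derive_object_key(master, "bucket/logs.tar")
print(k1 != k2)  # distinct per-object keys from one guarded master
```

A leaked per-object key then exposes one object, not the whole store, and rotating the master invalidates everything downstream at once.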
Backup and DR
There's much debate in the industry about whether continuous backup or snapshots protect data better than traditional separate backup-copy software. Snapshots are attractive due to the recovery metrics they deliver, but there are complicated questions: which zones to use when merging DR with snapshots, whether backing up within the same cloud service provider that stores your primary data is wise, and the potential additional exposure of having no demountable copy.
Done well, a perpetual-snapshot strategy with multi-zone or multi-cloud replication looks very promising, both economically and from a data protection view. I advise doing some research on the subject, though, since not all solutions are equal.
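To make the perpetual-snapshot idea concrete, here is a sketch of a thinning policy: keep every snapshot from the past week, then one per ISO week further back. The retention numbers are illustrative, not a recommendation:

```python
from datetime import date, timedelta

def prune(snapshots, today, keep_daily=7, keep_weekly=4):
    """Thin a perpetual snapshot chain: keep all snapshots newer than
    `keep_daily` days, then one per ISO week for `keep_weekly` weeks."""
    keep = set()
    weekly_seen = set()
    for snap in sorted(snapshots, reverse=True):  # newest first
        age = (today - snap).days
        if age < keep_daily:
            keep.add(snap)  # recent: small recovery window, keep everything
        else:
            week = snap.isocalendar()[:2]  # (ISO year, ISO week)
            if week not in weekly_seen and len(weekly_seen) < keep_weekly:
                weekly_seen.add(week)
                keep.add(snap)  # oldest-kept representative of its week
    return keep

today = date(2024, 3, 1)
snaps = [today - timedelta(days=d) for d in range(30)]  # 30 daily snapshots
kept = prune(snaps, today)
print(len(kept))  # 7 daily + 4 weekly survivors
```

The surviving weekly snapshots are the ones worth replicating to a second zone or a second cloud.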