Network Computing is part of the Informa Tech Division of Informa PLC

The Importance of Managing the Entire Lifecycle of Metadata

Metadata is data about data, and its influence is broad, ranging from system and application behavior to user experience. For example, metadata determines whether, and by what methods, a user account or security group can interact with a file.

“File permissions or access control lists (ACLs) are metadata that define the actions each user or group are permitted to perform on the file. These permissions are enforced by the operating system” – Howard Marks (@DeepStorageNet).

Many different types of systems and devices, from applications to firewalls, use ACLs to determine access permissions. Keeping ACLs up to date and correct is an important task: stale entries can leave systems vulnerable, and missing entries can prevent systems from functioning.
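
As a minimal sketch of what that maintenance might check, the following Python compares an ACL against the principals that currently exist and the principals that need access. The data shapes and the names `audit_acl`, `known_principals`, and `required_principals` are hypothetical, chosen for illustration; they are not the API of any particular operating system or firewall.

```python
def audit_acl(acl, known_principals, required_principals):
    """Return (stale, missing) principals for a simple ACL.

    `acl` maps a principal name to a set of permitted actions. This shape
    is a hypothetical one used only for illustration.
    """
    # Stale: the ACL still grants access to a principal that no longer exists.
    stale = sorted(p for p in acl if p not in known_principals)
    # Missing: a principal that needs access has no entry at all.
    missing = sorted(p for p in required_principals if p not in acl)
    return stale, missing


acl = {"alice": {"read", "write"}, "former_contractor": {"read"}}
stale, missing = audit_acl(
    acl,
    known_principals={"alice", "backup_svc"},
    required_principals={"alice", "backup_svc"},
)
print("stale:", stale)
print("missing:", missing)
```

Here the stale entry is the kind that can leave a system vulnerable, and the missing entry is the kind that can stop a dependent service from functioning.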

A key component of automation is quickly and correctly identifying one or many objects based on the attributes attached to them. The influence that metadata has on our ability to search cannot be overstated. However, for a search to return correct results, the data within the attributes must be up to date.

Attributes and properties are a type of metadata that typically forms part of the object in question. "Object" is a generic term for a logical item of any type. It could be a file, a user account, or a physical server, for example. First name, last name, UID, and locked-out status are all common attributes of a user account object.
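
To make attribute-based searching concrete, here is a small Python sketch that filters user account objects by their attributes. The sample records and the helper name `find_objects` are assumptions made for illustration and do not come from any specific directory service.

```python
# Hypothetical user account objects with the attributes mentioned above.
users = [
    {"uid": 1001, "first_name": "Ada", "last_name": "Lovelace", "locked_out": False},
    {"uid": 1002, "first_name": "Alan", "last_name": "Turing", "locked_out": True},
]


def find_objects(objects, **criteria):
    """Return the objects whose attributes match every given criterion."""
    return [
        obj
        for obj in objects
        if all(key in obj and obj[key] == value for key, value in criteria.items())
    ]


locked = find_objects(users, locked_out=True)
print([user["uid"] for user in locked])
```

The search is only as good as the attribute data: if `locked_out` is not updated when an account is unlocked, the same query returns the wrong accounts.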

Tags are a type of metadata that is attached to an object but does not form part of the object itself. The presence (or absence) of a specific tag can influence how different systems interact with each other. For example, a server may not get backed up if the correct tag is missing. Configuration managers frequently use tags to determine which configuration workflows to apply on a server.
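
The backup example can be sketched in a few lines of Python. The server records, the `backup` tag key, and the function `select_for_backup` are hypothetical; a real backup product would define its own tag names and selection rules.

```python
# Hypothetical inventory: db01 has lost (or never received) its backup tag.
servers = [
    {"name": "web01", "tags": {"backup": "daily", "role": "web"}},
    {"name": "db01", "tags": {"role": "database"}},
]


def select_for_backup(servers, tag_key="backup"):
    """Split servers into those a tag-driven job would pick up and those it would skip."""
    selected = [s["name"] for s in servers if tag_key in s["tags"]]
    skipped = [s["name"] for s in servers if tag_key not in s["tags"]]
    return selected, skipped


selected, skipped = select_for_backup(servers)
print("backed up:", selected)
print("silently skipped:", skipped)
```

Nothing fails in this sketch when the tag is absent; the server is simply left out, which is why a missing tag can go unnoticed.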

Tags are most commonly applied as an array of strings or as a key-value store. The system that manages the object determines the tagging format. The key-value format is typically represented in either YAML or JSON and may or may not allow arrays to set multiple values for a given key.
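
The two formats can be illustrated as follows. The tag names and the helper `has_tag` are invented for this sketch; whether a given system accepts an array as a value is something to confirm in that system's documentation.

```python
import json

# Format 1: tags as an array of strings.
string_tags = ["production", "web", "backup-daily"]

# Format 2: tags as key-value pairs. The "roles" key holds an array of
# values, which some systems allow and others do not.
key_value_tags = {
    "environment": "production",
    "backup": "daily",
    "roles": ["web", "cache"],
}


def has_tag(tags, key, value=None):
    """Check for a tag in either format; `value` applies only to key-value tags."""
    if isinstance(tags, list):
        return key in tags
    if key not in tags:
        return False
    if value is None:
        return True
    stored = tags[key]
    return value in stored if isinstance(stored, list) else stored == value


# The key-value form maps directly onto JSON.
print(json.dumps(key_value_tags, indent=2))
```

A workflow that has to talk to several systems usually needs a helper like this, because the same logical tag may be a bare string in one system and a key-value pair in another.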

Avoid complications

Manual metadata management is a futile effort, even in small environments. The application, management, and removal of metadata should always be a programmatic function, because mismanagement at the object level can have significant upstream impacts.

It is essential to understand why metadata is added to an object and which scenarios would require that metadata to be updated or removed. Understanding these scenarios helps you identify the conditions to evaluate and any existing workflows that may need updating to cater to the new requirements.

Workflows that manage the lifecycle of tags frequently contain functions to validate that an object has the correct tags assigned. These functions are often used to determine the success of a task to add, update, or remove a tag, but they are also useful for searching.
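
A validation function of this kind might look like the sketch below, which is then reused to search an inventory for non-compliant objects. The names `validate_tags`, `expected`, and `inventory`, and the sample tags, are assumptions made for illustration.

```python
def validate_tags(actual, expected):
    """Compare an object's tags with the expected tags.

    Returns a dict of problems keyed by tag name; an empty dict means
    every expected tag is present with the expected value.
    """
    problems = {}
    for key, want in expected.items():
        if key not in actual:
            problems[key] = "missing"
        elif actual[key] != want:
            problems[key] = f"expected {want!r}, found {actual[key]!r}"
    return problems


expected = {"backup": "daily", "owner": "platform"}
inventory = {
    "web01": {"backup": "daily", "owner": "platform"},
    "db01": {"backup": "weekly", "owner": "platform"},
    "app01": {"owner": "platform"},
}

# Reuse the validation function as a search: keep only objects with problems.
non_compliant = {}
for name, tags in inventory.items():
    problems = validate_tags(tags, expected)
    if problems:
        non_compliant[name] = problems

print(non_compliant)
```

The same function can confirm that an add, update, or remove task succeeded by running it against the object once the task completes.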

It’s a good idea to understand how the system that you’re interacting with handles consistency when working with validation functions. Assumptions about consistency can lead to unexpected results or unhandled errors. For example, if you create an object and then immediately query that object, you might find that the query returns no results. This can be because the system that responds has not yet finished processing the results of the creation task. Similar behavior can be observed when managing metadata, where the system executing the CRUD operation is not the same as the system that responds to queries.
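
One common allowance, sketched below, is to retry the query with a growing delay before treating an empty result as a failure. This is only one possible approach and not a general fix; the helper name `wait_until_visible`, its parameters, and the simulated backend are all hypothetical.

```python
import time


def wait_until_visible(query, attempts=5, delay=0.5, backoff=2.0):
    """Call query() until it returns a non-empty result or the attempts run out."""
    for attempt in range(attempts):
        result = query()
        if result:
            return result
        if attempt < attempts - 1:
            time.sleep(delay)
            delay *= backoff  # wait longer before each subsequent attempt
    raise TimeoutError(f"no result after {attempts} attempts")


# Simulated eventually consistent backend: the first two queries return
# nothing because the creation task has not been fully processed yet.
responses = iter([None, None, {"name": "web01", "tags": {"backup": "daily"}}])
result = wait_until_visible(lambda: next(responses), delay=0.01)
print(result["name"])
```

Raising an error once the attempts are exhausted keeps the inconsistency visible to the calling workflow rather than letting it proceed on an empty result.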

There are no simple or one-size-fits-all solutions for handling data inconsistencies, but understanding the behavior of the system that you’re interacting with helps you make allowances.

Metadata is data to those who work with it and should be treated with the same level of importance as any other data. Poor management of metadata can negatively impact upstream systems, which can include systems for regulatory compliance. When writing workflows to create metadata for an object, include workflows to update and remove that metadata as well.

Special thanks to Karen Lopez (@datachick) for her assistance with this article.