It's no secret that organizations today are dealing with data growth up to and beyond the petabyte level. This growth magnifies data management challenges: the overhead of acquiring and operating storage, and heightened data protection, governance, and security concerns driven by regulation and data mobility.
Modern businesses must find solutions that provide agility to both enterprise network users and a geographically dispersed workforce. The ability to apply big data analytics to their vast data stores is also critical for organizations seeking a competitive edge.
There are a number of choices when building storage that will scale out to petabyte levels while meeting the needs of local and mobile users. Three of them are object storage (OBS), software-defined storage (SDS), and a newer approach that is quickly gaining traction within the industry -- data-defined storage (DDS).
Unlike traditional approaches, object storage does not use a file system hierarchy to store data. Data is stored as objects and every object is assigned its own unique identifier. Users retrieve the stored data object using a unique "claim check" code; its actual location on physical media is abstracted within the pool of storage. This architecture allows for virtually unlimited scalability of the virtual storage pool.
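To make the "claim check" idea concrete, here is a minimal in-memory sketch in Python. It is an illustration only, not any vendor's implementation: the `ObjectStore` class and its `put` and `get` methods are hypothetical names, and a real object store would spread objects across many nodes rather than hold them in one dictionary.

```python
import uuid
from typing import Optional


class ObjectStore:
    """Toy object store: a flat namespace keyed by unique ID (hypothetical)."""

    def __init__(self) -> None:
        # Flat mapping of object ID -> (data, metadata); no directory hierarchy.
        self._objects = {}

    def put(self, data: bytes, metadata: Optional[dict] = None) -> str:
        """Store data and return its unique identifier, the "claim check"."""
        object_id = uuid.uuid4().hex
        self._objects[object_id] = (data, dict(metadata or {}))
        return object_id

    def get(self, object_id: str) -> bytes:
        """Retrieve data by ID alone; the caller never sees a physical location."""
        data, _metadata = self._objects[object_id]
        return data


store = ObjectStore()
claim_check = store.put(b"quarterly report", {"owner": "finance"})
retrieved = store.get(claim_check)
```

Because callers hold only the identifier, the store is free to place or move the underlying bytes wherever it likes, which is what lets the pool grow without disturbing users.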
Storing data as objects means it is not accessed through standard network file-sharing protocols. Instead, users generally access object storage through applications that use a REST API. This makes object storage well suited to online and cloud environments.
However, to ingest existing file data and support data access for file-based workflows, a third-party file gateway server must be added, with file protocols on one side and object API commands on the other. These gateways often eliminate many of the advantages of using object storage, including scalability, parallel object access, architectural flexibility, and improved data availability.
A rising number of vendors are promoting products as software-defined storage; however, interpretations of the term currently vary. In general, software-defined storage is the abstraction of storage services from the physical storage hardware. The software abstracts hardware resources, pools them into aggregated capacity, and automates their distribution, as needed, to applications.
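The pooling and allocation described above can be sketched roughly as follows. This is a simplified model under stated assumptions: `Device`, `StoragePool`, and `allocate` are hypothetical names, and the "most free space first" placement rule is just one possible policy, not something any particular product is known to use.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """One piece of physical storage hardware (hypothetical model)."""
    name: str
    capacity_gb: int
    used_gb: int = 0

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - self.used_gb


@dataclass
class StoragePool:
    """Aggregates devices into one pool and hands capacity to applications."""
    devices: list = field(default_factory=list)
    allocations: dict = field(default_factory=dict)

    @property
    def free_gb(self) -> int:
        return sum(device.free_gb for device in self.devices)

    def allocate(self, app: str, size_gb: int) -> dict:
        """Carve size_gb out of the pool, spanning devices when necessary."""
        if size_gb > self.free_gb:
            raise ValueError("pool has insufficient free capacity")
        placement = {}
        remaining = size_gb
        # Assumed policy: draw from the devices with the most free space first.
        for device in sorted(self.devices, key=lambda d: d.free_gb, reverse=True):
            take = min(device.free_gb, remaining)
            if take > 0:
                device.used_gb += take
                placement[device.name] = take
                remaining -= take
        self.allocations[app] = placement
        return placement


pool = StoragePool(devices=[Device("array-a", 500), Device("commodity-b", 300)])
placement = pool.allocate("analytics", 600)
```

The application asks only for 600 GB; which hardware supplies it is decided by the software layer, which is the abstraction the term refers to.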
Software-defined storage environments enable virtualized storage pools to manage siloed data across geographic sites. They also provide policy-based data management for storage optimization, with options such as deduplication, replication, thin provisioning, snapshots, and backup, which reduces the cost of storage and storage administration.
To a great extent, storage has always been defined by software; it's just that the software has normally been embedded in proprietary hardware platforms to create a storage appliance. With software-defined storage, that software is decoupled from the proprietary platform and runs on commodity hardware through storage virtualization software, which reduces TCO and enhances infrastructure flexibility.
At Tarmin, we have developed data-defined storage, which uses a data-centric approach. It builds on the benefits of both object and software-defined storage technologies. In a recent product profile, the Taneja Group describes it "as a set of three interrelated areas: global storage that is independent of media types and storage vendors, data security and identity management across the entire infrastructure, and a distributed metadata repository that enables global searches and analytics."
Object and software-defined storage can only be mapped to the first of data-defined storage's three main attributes: media-independent data storage. This attribute enables a media-agnostic infrastructure that can use any type of storage, including low-cost commodity storage, to scale out to petabyte-level capacities. Data-defined storage unifies all data repositories and exposes globally distributed data stores through a global namespace, eliminating data silos and improving storage utilization.
In addition to media-independent data storage, data-defined storage includes two other attributes. Data security and identity management delivers end-to-end information governance, protection, retention management, security, and mobility. The distributed metadata repository captures the value of data across distributed data stores by collecting all basic and custom metadata and by conducting full-text indexing and filtering of all standard and industry-specific files.
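As a rough illustration of what a metadata repository with full-text indexing and filtering involves, consider the Python sketch below. It is a toy model, not Tarmin's design: `MetadataRepository`, `register`, and `search` are hypothetical names, the site labels are invented, and whitespace tokenization stands in for real full-text indexing.

```python
from collections import defaultdict


class MetadataRepository:
    """Toy metadata catalog spanning several sites (hypothetical)."""

    def __init__(self) -> None:
        self._records = {}              # (site, object_id) -> metadata dict
        self._index = defaultdict(set)  # token -> set of (site, object_id)

    def register(self, site: str, object_id: str, metadata: dict, text: str = "") -> None:
        """Record an object's metadata and index the words in its text."""
        key = (site, object_id)
        self._records[key] = dict(metadata)
        for token in text.lower().split():
            self._index[token].add(key)

    def search(self, term: str, **filters) -> list:
        """Find objects containing term, optionally filtered by metadata fields."""
        hits = self._index.get(term.lower(), set())
        return sorted(
            key for key in hits
            if all(self._records[key].get(name) == value
                   for name, value in filters.items())
        )


repo = MetadataRepository()
repo.register("london", "obj-1", {"owner": "finance"}, "Quarterly revenue report")
repo.register("new-york", "obj-2", {"owner": "legal"}, "Revenue sharing contract")
repo.register("tokyo", "obj-3", {"owner": "finance"}, "Headcount forecast")

all_hits = repo.search("revenue")
finance_hits = repo.search("revenue", owner="finance")
```

A single query covers every registered site, so a search does not need to know where an object physically lives, only what is known about it.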
Object, software-defined, and data-defined storage are all valid options to consider when scaling to meet the storage demands within today's business environments. The key is to remember that, while they may appear similar on the surface, they are in fact quite different upon closer examination. It is important to keep these differences in mind when determining the best fit for your data storage, governance, accessibility, and analytics needs.