If primary storage supports mission-critical applications, secondary storage is for everything else: file shares, backups, development and testing data, and the like. No one is really managing it, few organizations know exactly what it contains, and it is growing at an astounding rate.
According to a 2016 survey by Osterman Research, midsize and large organizations store approximately 20TB of unstructured data on average, accounting for about 85 percent of the data they manage. Unstructured data volumes are increasing by about 20 percent each year. These trends hit secondary storage on two fronts: larger file shares and larger backups.
While primary data storage is architected using high-performance devices that are carefully configured and rigorously protected, secondary storage is typically a hodgepodge of backup devices, NAS appliances and other gear that has been deployed in a piecemeal fashion.
One reason is that secondary storage comprises so many different workloads and data types with vastly different requirements. Organizations tend to address each use case separately by implementing a point solution. As data volumes increase, more devices are added.
Organizations are starting to recognize that this simply isn’t sustainable.
IT teams are grappling with a sprawling storage environment that lacks an overarching management framework. On top of that, numerous studies have shown that much of the data in secondary storage is obsolete, redundant or otherwise functionally useless. That means there’s a tremendous opportunity to consolidate this environment and eliminate waste.
Hyper-convergence principles can help organizations get a better handle on secondary data storage.
Hyper-converged infrastructure solutions typically include built-in data management capabilities such as deduplication and compression.
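To illustrate the idea rather than any particular vendor's implementation, the sketch below shows block-level deduplication in its simplest form: data is split into fixed-size chunks, each chunk is hashed, and only chunks the store has never seen before are compressed and kept. The chunk size, store structure and function names are hypothetical.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # fixed-size chunks; real systems often use variable-size chunking


def store_deduplicated(data: bytes, chunk_store: dict) -> list:
    """Split data into chunks, keep one compressed copy of each unique chunk,
    and return the list of chunk hashes needed to reassemble the original data."""
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in chunk_store:          # only never-before-seen chunks consume capacity
            chunk_store[digest] = zlib.compress(chunk)
        recipe.append(digest)
    return recipe


def restore(recipe: list, chunk_store: dict) -> bytes:
    """Rebuild the original data from its chunk recipe."""
    return b"".join(zlib.decompress(chunk_store[d]) for d in recipe)
```

Run a second backup of a mostly unchanged file share through the same store and nearly every chunk is already present, which is why deduplication can shrink secondary storage so dramatically.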
But from a storage perspective, the software-defined nature of hyper-convergence is key. In fact, the development of software-defined storage — which separates storage services from the hardware platforms they run on — is directly linked to the emergence of hyper-converged solutions. The typical hyper-converged system uses a standard x86 server with direct-attached storage that’s presented as a virtual SAN.
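As a rough illustration of how node-local, direct-attached storage can behave as one logical pool, the toy example below uses consistent hashing to decide which node holds each object. The class and method names are hypothetical; real hyper-converged platforms use their own placement and replication schemes.

```python
import hashlib
from bisect import bisect


class StoragePool:
    """Toy consistent-hash placement: map each object to a node in the pool
    so that adding a node grows capacity without reshuffling most data."""

    def __init__(self, nodes, vnodes=100):
        self.ring = []
        for node in nodes:
            self.add_node(node, vnodes)

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node: str, vnodes: int = 100) -> None:
        # Each node gets many positions on the hash ring for smoother balancing.
        for i in range(vnodes):
            self.ring.append((self._hash(f"{node}-{i}"), node))
        self.ring.sort()

    def node_for(self, object_id: str) -> str:
        # The object lands on the first ring position at or after its hash.
        h = self._hash(object_id)
        idx = bisect(self.ring, (h,)) % len(self.ring)
        return self.ring[idx][1]
```

Calling StoragePool(["node1", "node2", "node3"]).node_for("vm-disk-42") returns the node responsible for that object, and adding a fourth node later remaps only a fraction of the objects, which is what makes scale-out growth practical.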
Hyper-converged platforms can be scaled horizontally by adding nodes, making it possible to create a logical pool of storage capacity. Automation features streamline operations and enable the policy-based movement of data.
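Policy-based data movement is easiest to picture as a rule such as "anything not touched in 90 days moves to a cheaper tier." The sketch below expresses that rule as a simple script; the paths and threshold are hypothetical, and production platforms apply such policies inside the storage layer rather than by walking a file system.

```python
import os
import shutil
import time

# Hypothetical tier locations and age threshold for illustration only.
PRIMARY_TIER = "/mnt/primary"
SECONDARY_TIER = "/mnt/secondary"
MAX_AGE_DAYS = 90


def apply_tiering_policy() -> None:
    """Move files not accessed within MAX_AGE_DAYS from the primary tier
    to the secondary tier, preserving their relative paths."""
    cutoff = time.time() - MAX_AGE_DAYS * 86400
    for root, _dirs, files in os.walk(PRIMARY_TIER):
        for name in files:
            src = os.path.join(root, name)
            if os.path.getatime(src) < cutoff:
                dest = os.path.join(SECONDARY_TIER, os.path.relpath(src, PRIMARY_TIER))
                os.makedirs(os.path.dirname(dest), exist_ok=True)
                shutil.move(src, dest)
```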
Some hyper-converged systems are best applied to primary data storage, but others are well-suited to the secondary storage environment.
With one scalable, software-defined infrastructure that can support backup, archival, disaster recovery and other secondary workloads, organizations can consolidate the secondary storage environment. IT teams spend less time administering storage point solutions and can manage the data stored there more effectively.
Data management is likely to drive investments in hyper-converged storage solutions. More and more, organizations are looking to use analytics to harness the business value of their data stores. Hyper-converged storage helps minimize the number of data silos that stand in the way of those efforts.
Secondary data storage typically isn’t managed with the same rigor as primary storage, largely due to the fragmented nature of the environment.
Hyper-converged storage solutions can help organizations consolidate disparate storage devices, relieving administrative headaches and better supporting data analytics initiatives.