As the rush towards cloud services accelerates, it is clear we are fast moving into the multi-cloud era. While Amazon, Google, Microsoft and other providers each offer a wide variety of services, enterprises are spreading their workloads across several of them.
This mitigates the risk of a single provider failure and lets enterprises use best-of-breed technology for each application.
While there are many good reasons for adopting a multi-cloud strategy – Flexera’s research suggests that a typical organisation deploys three or four different private cloud services – it does create a challenge: how does an organisation get a consolidated view of its data when that data is spread across multiple platforms?
The temptation for organisations is to treat their cloud-based systems the same way as on-premises ones and apply the same techniques and technologies. That means adopting the legacy approach of extracting data, transforming it to conform to a new data structure and then loading it into a central data warehouse.
The ETL (extract, transform, load) approach was the bedrock of data warehousing solutions, but any business that has taken it understands its challenges and limitations.
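To ground the comparison, here is a minimal sketch of the legacy pattern in Python. The source files, column names and transformation are hypothetical stand-ins; a real pipeline would add scheduling, incremental loads and error handling.

```python
import csv
import sqlite3

# Hypothetical sources: CSV exports pulled from two separate systems.
SOURCES = ["crm_export.csv", "billing_export.csv"]

def extract(path):
    """Extract: read raw rows from a source file."""
    with open(path, newline="") as f:
        yield from csv.DictReader(f)

def transform(row):
    """Transform: force each row into the warehouse's fixed schema."""
    return (row["customer_id"], row["region"].upper(), float(row["amount"]))

def load(rows, conn):
    """Load: copy the conformed rows into the central warehouse."""
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

conn = sqlite3.connect("warehouse.db")  # stand-in for a real warehouse
conn.execute("CREATE TABLE IF NOT EXISTS sales (customer_id TEXT, region TEXT, amount REAL)")
for source in SOURCES:
    load((transform(r) for r in extract(source)), conn)
conn.commit()
```

Every source must be forced into the warehouse’s fixed schema and physically copied before anyone can query it.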
Adding new data sources is complex, extraction and load processes are time-consuming, and the complexity of such systems hampers their reliability.
Because ETL is both time- and cost-intensive, its ability to adapt to new questions the business might ask is limited. So, while modern cloud technologies offer flexibility, scalability and agility, analytics approaches remain mired in the past.
The beauty of cloud services is that they hint at how data analytics can be vastly improved over the traditional approach.
Cloud services communicate through APIs and other data exchange methods. A decentralised approach to analytics uses those same interfaces to bring data from multiple sources, hosted on different cloud services, into a single view.
This is the critical difference: the data is not centralised into a single database. Instead, a data fabric provides a single pane of glass through which you can access data across your multi-cloud environment.
This data fabric uses metadata to know where the data is, how it is structured and what governance rules apply to its access. That makes it possible to present a view that looks like a single database, even though the data has never been copied or moved.
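As an illustration of the idea, here is a minimal sketch in Python. The catalog entries, endpoints and stubbed fetch functions are invented for this example; a real data fabric would sit behind a SQL engine rather than hand-written fetchers.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class CatalogEntry:
    """Metadata the fabric keeps: where the data lives and how it is shaped."""
    location: str                 # e.g. an API endpoint or storage URI
    schema: tuple                 # column names exposed by the source
    fetch: Callable[[], list]     # pulls rows from the source on demand

# Hypothetical catalog spanning two clouds; it holds metadata, never the rows.
CATALOG = {
    "orders": CatalogEntry(
        location="https://aws.example.com/orders",    # hypothetical endpoint
        schema=("order_id", "customer_id", "total"),
        fetch=lambda: [("o1", "c1", 120.0)],          # stub for a live API call
    ),
    "customers": CatalogEntry(
        location="https://gcp.example.com/customers", # hypothetical endpoint
        schema=("customer_id", "region"),
        fetch=lambda: [("c1", "EMEA")],
    ),
}

def virtual_table(name):
    """Resolve a logical table through the catalog and read it from its source."""
    entry = CATALOG[name]
    return [dict(zip(entry.schema, row)) for row in entry.fetch()]

# A join across clouds with no central copy: rows are fetched, combined, discarded.
regions = {c["customer_id"]: c["region"] for c in virtual_table("customers")}
for order in virtual_table("orders"):
    print(order["order_id"], order["total"], regions.get(order["customer_id"]))
```

The catalog is the only thing that persists; the rows stay where they were created and are read straight from the source at query time.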
With data warehousing projects so often failing, organisations should instead adopt a networked data platform approach. This new way of approaching analytics projects, which eschews ETL, offers many benefits.
Project costs fall sharply because there is no need to build new databases or acquire storage to hold copies of the data. And, with no ETL tooling to develop, projects start far faster. That simplicity also makes it easier to add new sources, so business users can reach new and different data faster and gain accurate real-time insights.
Security is also enhanced. Because the data is not copied to another database, it retains all the governance and security rules already in place at the source, and the data fabric enforces those original access rules. It also means corporate data is not centralised into a single store that would make an attractive honeypot for threat actors.
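One way to picture this, as a hedged sketch: instead of checking permissions against a copy, the fabric defers to each source’s own policy before returning rows. The tables, roles and policy sets below are invented for illustration.

```python
# Hypothetical per-source access policies, exactly as they exist today at the source.
SOURCE_POLICIES = {
    "payroll": {"hr"},                  # only HR may read payroll at its source
    "orders":  {"analyst", "finance"},
}

SOURCE_DATA = {
    "payroll": [("e1", 54_000)],        # stand-ins for rows held by each source
    "orders":  [("o1", 120.0)],
}

def fabric_read(table, role):
    """The fabric never copies rows, so the source policy is the only policy."""
    if role not in SOURCE_POLICIES[table]:
        raise PermissionError(f"source denies {role!r} access to {table!r}")
    return SOURCE_DATA[table]

print(fabric_read("orders", "analyst"))   # allowed by the source's own rule
try:
    fabric_read("payroll", "analyst")     # denied: no central copy to leak from
except PermissionError as err:
    print(err)
```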
The advent of multi-cloud architectures has the potential to create new data silos. A virtual data warehouse built on data fabric technology lets enterprises reap the benefits of different cloud platforms without compromising access to data or adding complexity to analytics projects. This saves time, decreases complexity and gives businesses a competitive advantage: the ability to unlock data insights in real time.