Databricks to acquire Tabular to boost data interoperability
Databricks, the Data and Artificial Intelligence (AI) company, has announced an agreement to acquire Tabular. Tabular, founded by the original creators of Apache Iceberg, specialises in optimising data stored in the cloud. This strategic acquisition is poised to enhance the interoperability between Delta Lake and Iceberg, two leading open-source table formats that have hitherto faced compatibility challenges.
Databricks has championed the lakehouse architecture since introducing it in 2020. The architecture integrates traditional data warehousing workloads with AI workloads on a single, governed copy of data, enabling diverse workloads, applications, and engines to access the same data. This approach contrasts with proprietary data warehouses, where vendor lock-in imposes significant restrictions on data access and utilisation.
Currently, 74% of enterprises have adopted the lakehouse architecture, underscoring the growing demand for this model. Central to the lakehouse's functionality are open-source data formats that facilitate ACID transactions on data stored in object storage. Databricks' collaboration with the Linux Foundation led to the creation of the Delta Lake project, which has attracted more than 500 code contributors globally and is used by more than 10,000 companies to process over four exabytes of data each day on average.
Concurrent with the development of Delta Lake, the Iceberg project was initiated by Ryan Blue and Daniel Weeks at Netflix and subsequently donated to the Apache Software Foundation. Despite their common foundation in Apache Parquet and shared goals, Delta Lake and Iceberg evolved independently, leading to incompatibilities. These incompatibilities have often left enterprise data fragmented and siloed, undermining the value of the lakehouse architecture.
Recognising the necessity for data interoperability, Databricks launched Delta Lake UniForm last year. UniForm tables facilitate interoperability across Delta Lake, Iceberg, and Hudi, and support the Iceberg REST catalog interface. This enables companies to use familiar analytics engines and tools across all their data. With the acquisition of Tabular, Databricks aims to expand the capabilities of Delta Lake UniForm significantly.
"We created Apache Iceberg to solve critical data challenges around correctness, performance, and scalability," said Ryan Blue, Co-Founder and CEO of Tabular. "With Tabular joining Databricks, we intend to build the best data management platform based on open lakehouse formats so that companies don't have to worry about picking the right format or getting locked into proprietary data formats."
Ali Ghodsi, Co-Founder and CEO of Databricks, highlighted the broader implications of this acquisition. "Databricks pioneered the lakehouse, and over the past four years, the world has embraced this architecture, combining the best of data warehouses and data lakes to help customers decrease TCO, embrace openness, and deliver AI projects faster," Ghodsi stated. He emphasised the role of Delta Lake UniForm in bridging the gap between the Delta Lake and Iceberg formats, thereby reducing silos and increasing interoperability for customers.
Both Databricks and Tabular have a history of advocating for open-source formats. Databricks, which is recognised as one of the most successful independent open-source companies by revenue, will continue to uphold its commitment to open formats and open-source data in the cloud through this acquisition. Together, the companies aim to mitigate the challenges posed by proprietary vendor-owned formats and ensure enterprises maintain control over their data.
The acquisition remains subject to customary closing conditions and is anticipated to be finalised within Databricks' second fiscal quarter. As the companies embark on this collaborative journey, the data management landscape is set to witness significant advancements, particularly in enhancing data interoperability and realising the full potential of the lakehouse architecture.