
New AI capabilities enabled with LiveData Migrator now automating Apache Hive metadata migration into Databricks

By Ryan Morris-Reade, 25 May 2021

WANdisco, a LiveData company, has announced its LiveData Migrator platform can now automate the migration of Apache Hive metadata directly into Databricks, letting users quickly adopt new AI and machine learning capabilities.

The LiveData Migrator platform, which automates the migration and replication of Hadoop data from on-premises to the cloud, can help users save time and reduce costs. WANdisco says that, for the first time, enterprises wanting to migrate on-premises Hadoop and Spark content from Hive to Databricks can do so efficiently and at scale, while managing the risks associated with large-scale cloud migrations.

It says datasets don’t need to be fully migrated before they are converted into the Delta format. LiveData Migrator automates an incremental transformation to Delta Lake.

Ongoing changes to source metadata are instantly reflected in the Lakehouse platform, and on-premises data formats used in Hadoop and Hive are automatically made available in Delta Lake on Databricks. 

By combining data and metadata and making on-premises content immediately usable in Databricks, users can eliminate migration tasks that previously required constructing data pipelines to transform, filter, and adjust data, along with up-front planning and staging. This means the work otherwise needed to set up auto-load pipelines that identify newly landed data and then convert it into its final form is no longer required.

“This new feature brings together the power of Databricks and WANdisco LiveData Migrator,” says WANdisco CTO, Paul Scott-Murphy. 

“Data and metadata are migrated automatically without any disruption or change to existing systems. Teams can implement their cloud modernisation strategies without risk, immediately employing workloads and data that were locked up on-premises and are now in the cloud, by using the Lakehouse platform offered by Databricks.”

Databricks VP of product partnerships, Pankaj Dugar, says enterprises often want to break silos and bring all their data into a lakehouse for analytics and AI, but they've been constrained by their on-premises infrastructure. He says that with the new Hive metadata capabilities in WANdisco's LiveData Migrator, it will now be easier to take advantage of the Databricks Lakehouse platform.

LiveData Migrator automates cloud data migration at any scale by enabling companies to migrate data from on-premises Hadoop-oriented data lakes to any cloud within minutes, even while the source data sets are under active change. 

Businesses can migrate their data without needing engineers or other consultants to enable their digital transformation. WANdisco says LiveData Migrator works without any production downtime or business disruption while ensuring the migration is complete and continuous, and any ongoing data changes are copied to the target cloud environment.

Users can choose to convert content to the Delta Lake format when they create the Databricks metadata target. The data to migrate is then set by defining a migration rule and selecting the Hive databases and tables that require migration.
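
To make the idea of a migration rule concrete, the short Python sketch below models how a rule might select Hive databases and tables by name pattern. It is purely illustrative: the `MetadataRule` class, the `select_tables` helper, the `convert_to_delta` flag, and the sample catalog are all hypothetical names invented for this example, and none of them are part of LiveData Migrator's actual interface.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class MetadataRule:
    """Hypothetical stand-in for a metadata migration rule (not the product's API)."""

    database_pattern: str          # shell-style pattern for Hive database names
    table_pattern: str             # shell-style pattern for Hive table names
    convert_to_delta: bool = True  # assumed flag mirroring the Delta Lake option

    def matches(self, database: str, table: str) -> bool:
        # A table is in scope only if both its database and its name match.
        return fnmatchcase(database, self.database_pattern) and fnmatchcase(
            table, self.table_pattern
        )


def select_tables(rule: MetadataRule, catalog: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return the sorted (database, table) pairs that the rule covers."""
    return sorted(
        (database, table)
        for database, tables in catalog.items()
        for table in tables
        if rule.matches(database, table)
    )


# Made-up Hive catalog used only for this illustration.
catalog = {
    "sales": ["orders", "orders_archive", "customers"],
    "hr": ["employees"],
}

rule = MetadataRule(database_pattern="sales", table_pattern="orders*")
selected = select_tables(rule, catalog)

for database, table in selected:
    print(f"{database}.{table} (convert to Delta: {rule.convert_to_delta})")
```

In this sketch, the rule covers `sales.orders` and `sales.orders_archive` and leaves the other tables alone, which mirrors the article's description of scoping a migration to chosen databases and tables.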
