Cloudera and NVIDIA partner on cloud-based data analytics

14 Apr 2021

Cloudera has announced a collaboration with NVIDIA, which will enable the Cloudera Data Platform (CDP) to integrate the RAPIDS Accelerator for Apache Spark 3.0, deployed on NVIDIA’s computing platforms.

The RAPIDS Accelerator for Apache Spark 3.0 will enable enterprises to create data pipelines around artificial intelligence (AI) and machine learning (ML), without the need to change code, whilst delivering analytics and insights.
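To illustrate what "without the need to change code" can mean in practice: the RAPIDS Accelerator is switched on through Spark configuration rather than through edits to the job itself. The sketch below is a minimal illustration, not Cloudera's or NVIDIA's deployment procedure. The configuration keys are the ones the RAPIDS Accelerator documents; the GPU amounts, the jar path, the job file name and the helper `build_submit_command` are assumptions made for this example.

```python
import shlex

# Configuration keys used by the RAPIDS Accelerator for Apache Spark.
# The GPU amounts are illustrative values, not recommendations.
RAPIDS_CONF = {
    "spark.plugins": "com.nvidia.spark.SQLPlugin",
    "spark.rapids.sql.enabled": "true",
    "spark.executor.resource.gpu.amount": "1",
    "spark.task.resource.gpu.amount": "0.25",
}


def build_submit_command(app_path, jar_path, conf=RAPIDS_CONF):
    """Hypothetical helper: build a spark-submit argument list that
    enables the plugin for an existing, unmodified Spark job."""
    args = ["spark-submit", "--jars", jar_path]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app_path)  # the application itself is passed through untouched
    return args


if __name__ == "__main__":
    # Both paths below are placeholders for illustration only.
    cmd = build_submit_command("etl_job.py", "/opt/rapids/rapids-4-spark.jar")
    print(" ".join(shlex.quote(a) for a in cmd))
```

The point of the sketch is that only the launch settings differ; the job file is handed to Spark exactly as it was written for CPU execution.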

According to Cloudera, datasets are growing larger and are being used for more purposes, such as responding to fraud, developing products, and transforming the supply chain. However, that volume of data creates bottlenecks for data scientists, who cannot work quickly enough to train and operate models across their organisations.

Organisations can use an open source, GPU-accelerated platform, such as Apache Spark 3, to train models using AI and ML. As Cloudera Data Platform now supports this functionality, there are more opportunities for high-performance compute, data science, and AI support from research right through to production.

Cloudera chief product officer Arun Murthy says, “At a time when speed is everything, businesses are relying on the power of data more than they ever have. Our collaboration with NVIDIA will give customers the rocket fuel they need to better understand their data and realise the true transformational potential of AI.”

Murthy adds that the deeper integration with NVIDIA is the natural next step for the company.

NVIDIA data science product group senior director Scott McClellan adds, “Apache Spark is a cornerstone of the machine learning and data analytics pipelines enterprises rely on to remain competitive.”

“The processing power of NVIDIA-accelerated computing and Spark analytics running on Cloudera Data Platform provides the flexibility to meet deadlines when time is of the essence, and save on costs when the bottom line is most important.”

RAPIDS Accelerator for Apache Spark will be available in CDP Private Cloud this year. Cloudera and NVIDIA will also roll out other offerings over time, including Accelerated Deep Learning and Machine Learning in CDP Public Cloud in May.

Earlier this month, Cloudera announced that CDP will be available through Google Cloud Platform. CDP on Google Cloud helps organisations create data lakes within their own Google Cloud environment. This enables them to run analytics and machine learning, migrate existing pipelines to Google Cloud, and create new ones.

“Cloudera Data Platform on Google Cloud is an ideal first step for our joint customers to support hybrid cloud,” says Murthy.

“Companies can run advanced analytics from our data lifecycle platform and easily extend or replicate the use cases on premises to Google Cloud.”