Cloudera launches new software for the shared data experience
Cloudera has a new platform on the market that's all about giving companies a 'shared data experience.'
Cloudera SDX is a modular software platform that applies a centralised, consistent framework for schema, security, governance, and data ingest to enable different customer applications to run against shared or overlapping sets of data.
SDX aims to make multi-function data applications easier to develop, less expensive to deploy and more consistently secure.
"Companies often cite security, governance, and complexity among their primary reasons for not moving their operational workloads to cloud," comments Tony Baer, principal analyst at Ovum.
"Cloudera has planted its stake in the ground by building in the security and data governance features to make companies confident in standing up their big data workloads in production. Cloudera SDX builds on the company's IP with a shared data experience across cloud and on-premises environments."
According to Cloudera, SDX addresses the following business challenges:
Siloed data
Self-service clusters in the cloud do not naturally share data and metadata, so individual clusters become de facto silos.
By sharing persistent data and metadata across on-demand applications and transient clusters, Cloudera customers can stay agile and ensure that isolated clusters neither require individual control and management nor incur the additional cost of data replication and storage.
Security breaches
Without centralised security controls, administrators are forced to continuously reapply security and access policies against multiple copies of siloed data, creating extra work and greater risk of exposing sensitive information.
With Cloudera SDX, security is applied consistently at the data level. Policies are pervasive and do not need to change or be reapplied when the data is moved or used within a new analytics application.
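As a rough illustration of what data-level policy means in practice, the minimal Python sketch below defines a grant once in a shared catalog and lets two independent clusters defer to it. This is not Cloudera's or Sentry's API: every name here (SharedCatalog, Cluster, grant, can_read, the role and table names) is hypothetical and exists only to show why a policy attached to the data does not need to be reapplied per cluster.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class SharedCatalog:
    """Hypothetical shared catalog holding table-level grants in one place."""
    grants: Dict[str, Set[str]] = field(default_factory=dict)  # table -> roles

    def grant(self, role: str, table: str) -> None:
        # Record that `role` may read `table`; defined once, used everywhere.
        self.grants.setdefault(table, set()).add(role)

    def can_read(self, role: str, table: str) -> bool:
        return role in self.grants.get(table, set())


@dataclass
class Cluster:
    """Hypothetical transient cluster that defers access checks to the catalog."""
    name: str
    catalog: SharedCatalog

    def read(self, role: str, table: str) -> str:
        if not self.catalog.can_read(role, table):
            raise PermissionError(f"{role} may not read {table} on {self.name}")
        return f"{self.name}: {role} read {table}"


catalog = SharedCatalog()
catalog.grant("analyst", "sales")

# Two clusters created independently share the same policy store,
# so the grant above is never copied or reapplied.
etl = Cluster("etl-cluster", catalog)
ml = Cluster("ml-cluster", catalog)

print(etl.read("analyst", "sales"))
print(ml.read("analyst", "sales"))
```

In this toy model, spinning up a third cluster requires no policy work at all, because it would simply point at the same catalog. A role without a grant is rejected on every cluster alike.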
Governance challenges/noncompliance
A shared data and metadata catalog is imperative for dealing with HIPAA and PCI-DSS compliance today and looming requirements like GDPR.
A shared data catalog makes it easy to quickly find and understand the context of data, enabling self-service applications and providing inviolable audit and lineage functionality.
The following features are available next month in Cloudera 5.13 and deliver enhanced SDX capabilities for cloud environments:
- Multi-cluster catalog, a Hive metastore based on shared Amazon RDS or shared MySQL for Azure users, to store and manage context about data
- Multi-cluster Cloudera Navigator capabilities that make it easier for users to discover data and govern access, meet audit requirements, and understand lineage
- Multi-cluster Sentry security permissions and policies to provide granular, role-based access controls to shared data
- The same Cloudera Manager interface for clusters anywhere, for simpler operations and enhanced data authentication
- Backup and disaster recovery from on-premises clusters to Amazon S3
"SDX is the 'secret sauce' within Cloudera Enterprise that accelerates data science, machine learning, and analytics," adds Mike Olson, founder and chief strategy officer at Cloudera.
"Data is the world's most valuable resource. It is the fuel that drives insights, powers machines, and solves impossible problems. From day one Cloudera's focus has been on helping companies extract value from their data.
"Cloudera SDX simplifies this mission for IT and business users alike."