
Data lakes transforming the enterprise data warehouse

14 Jan 16

The introduction of data lakes is one of the most significant changes to enterprise data warehouse technology, according to CenturyLink’s head of business development Martin Hooper.

Hooper says classic enterprise data warehouse architecture is evolving under the influence of new technologies, new requirements, and changing economics.

He says data lakes, which are large storage repositories paired with processing engines, are transforming the way enterprises handle data.

“Data lakes let enterprise data warehouses store massive amounts of data, offer enormous processing power, and let organisations handle a virtually limitless number of tasks at the same time,” Hooper explains.

In the classic enterprise data warehouse architecture, he says, source systems feed a staging area, and the resulting data is consumed by analytic applications.

“In this model, the access layer of the data warehouse, known as the data mart, is often part of the data warehouse fabric, and applications are responsible for knowing which databases to query.”
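As a rough sketch of that classic flow, the snippet below loads raw source rows into a staging table, transforms them into a warehouse fact table, and exposes a mart view for applications to query. SQLite stands in for the warehouse engine here, and all table and column names are illustrative assumptions rather than anything from the article.

```python
# Minimal sketch of the classic flow: source rows land in a staging table,
# are transformed into a warehouse fact table, and applications query a mart
# view. SQLite is a stand-in; names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")

# 1. Staging area: raw source rows, held only until they are transformed.
conn.execute("CREATE TABLE staging_sales (raw_date TEXT, raw_amount TEXT)")
conn.executemany(
    "INSERT INTO staging_sales VALUES (?, ?)",
    [("2016-01-14", "120.50"), ("2016-01-14", "80.00"), ("2016-01-15", "95.25")],
)

# 2. Warehouse fact table: cleaned, typed data derived from staging.
conn.execute("CREATE TABLE fact_sales (sale_date TEXT, amount REAL)")
conn.execute(
    "INSERT INTO fact_sales "
    "SELECT raw_date, CAST(raw_amount AS REAL) FROM staging_sales"
)
# In the classic model the staging copy is temporary, so it is discarded.
conn.execute("DELETE FROM staging_sales")

# 3. Data mart (access layer): the view applications know to query.
conn.execute(
    "CREATE VIEW mart_daily_sales AS "
    "SELECT sale_date, SUM(amount) AS total FROM fact_sales GROUP BY sale_date"
)
print(conn.execute("SELECT * FROM mart_daily_sales").fetchall())
```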

According to Hooper, in modern enterprise data warehouses, data lake facilities based on the Apache Hadoop open source software framework replace the staging area that sits at the centre of traditional data warehouse models. While data lakes provide all of the capabilities offered by the staging area, they also have several other important benefits, he says.

“A data lake can hold raw data forever, rather than being restricted to storing it temporarily, as the classic staging area is,” Hooper explains.

“Data lakes also have compute power and other tools, so they can be used to analyse raw data to identify trends and anomalies.

“Furthermore, data lakes can store semi-structured and unstructured data, along with big data.”
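A minimal sketch of that pattern, assuming a Hadoop cluster with PySpark available: raw, semi-structured JSON is read straight from a hypothetical lake path and aggregated in place to flag anomalous days. The path, field names, and threshold are all illustrative assumptions, not details from the article.

```python
# Hedged sketch of the data-lake pattern: raw, semi-structured events are
# kept indefinitely in the lake and the lake's own compute is used to look
# for trends. Requires PySpark; paths and fields are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("lake-trend-scan").getOrCreate()

# Raw JSON stays in the lake as-is; no schema had to be imposed on ingest.
events = spark.read.json("hdfs:///lake/raw/events/")  # hypothetical path

# Use the lake's compute to aggregate the raw data in place.
daily = (
    events.groupBy(F.to_date("timestamp").alias("day"))
    .agg(F.count("*").alias("event_count"))
)

# Flag days whose volume deviates sharply from the overall mean: a crude
# anomaly check of the kind described above.
mean_count = daily.agg(F.avg("event_count")).first()[0]
daily.filter(F.col("event_count") > 2 * mean_count).show()
```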

Using Hadoop as an enterprise data warehouse staging area is not a new concept, says Hooper.

“A data lake based on Hadoop not only provides far more flexible storage and compute power, but it is also an economically different model that can save businesses money,” he says.

In addition, a data lake provides a cost-effective, extensible platform for building more sandboxes, which are testing environments designed to isolate and execute untested code, Hooper explains.
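One way such a sandbox might be carved out of the lake, again assuming PySpark and using hypothetical paths and a hypothetical sample rate: copy a small sample of raw data to an isolated, per-experiment location, so untested code runs against the copy rather than the shared production data.

```python
# Hedged sketch of the sandbox idea: a disposable sample of lake data is
# written to an isolated path for experimentation. Requires PySpark; the
# paths and sample fraction are illustrative assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sandbox-setup").getOrCreate()

# Take a small random sample of the raw lake data...
sample = spark.read.json("hdfs:///lake/raw/events/").sample(fraction=0.01, seed=42)

# ...and write it to a per-experiment path, so untested code can read,
# overwrite, or corrupt this copy without touching production data.
sample.write.mode("overwrite").parquet("hdfs:///lake/sandbox/experiment_001/")
```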

“A Hadoop staging approach begins to solve a number of the problems with traditional enterprise data warehouse architecture, while full-blown data lakes have created an entirely new data warehouse model that is more agile, more cost-effective, and provides companies with a greater ability to leverage successful experiments across the enterprise, resulting in a greater return on data investment,” he says.
