Data warehouse vs data mart
Data warehouse is a general term to describe a system used in an organisation to collect data from one or more data sources (including the transactional system) into a central data location and later report that data, generally in an aggregated way, to the business user.
The question which is common to every organisation is ‘Do we need a single, organisation-wide data warehouse, or are the information-intensive domain specific data marts sufficient?’. Data marts are often an attractive alternative to the mammoth job of implementing an enterprise-wide data warehouse.
While a data warehouse is an organisational level repository, which is a combination of various data marts, data mart's data is targeted to a smaller audience of business users. Typically an organisation's business intelligence agenda starts with few data marts, before maturing to a full-blown data warehouse.
Data warehouse:
• Holds multiple domain areas
• Holds very detailed information
• Integrates all data sources
Data mart:
• Holds only one domain area – for example sales, customer, product
• Normally holds summarised data (some may be very detailed)
• Usually feeds from one or multiple sources related to the domain
Advantages of a data warehouse
The concept of data warehousing is pretty easy to understand but a data warehouse also costs money. The problem is when big money is involved it’s harder to justify the expense, especially when you can’t really quantify the benefits upfront. With a data warehouse, it’s not easy to know what the benefits are until it’s up and running.
But there are some advantages to a data warehouse:
• Data warehouse generates high ROI
• Data warehouse promotes and enhances data consistency and quality
• Data warehouse is a prerequisite for business intelligence
• Common data dictionaries and information schemas
However there are some disadvantages of data warehouse:
• Data is static and dated
• Data ownership is complicated as data owners lose control over their data
• Longer implementation time
• Data could change meaning during conversion
• Maintaining security and privacy is complicated.
What is not a data warehouse?
A data warehouse typically does not provide real time information, as mostly the data extraction, data manipulation and consolidation happens at the end of day or at regular intervals. A data warehouse is also not business intelligence. It is a repository of sanitised and consolidated data, which can be used for any purpose including business intelligence. The data can be used for trend reports, transaction reports, data mining, data analysis, forecasting etc. It is not very uncommon for organisation to have a good data warehouse without good business intelligence.
Once your data marts and data warehouses are created, business intelligence tools will allow you to create dashboards, reporting systems and mobile apps to deliver the right information to the right people at the right time.
Any organisation which expects better decision making capabilities, competitive advantage and connecting with their customer, simply by starting a data warehouse project is in for a surprise. The real benefits are achieved by understanding their current business practices and then to change their business practices to best use the data warehouse. And that can be harder than implementing a data warehouse.
Jayesh Jain | CBAP, CSPO, agilist and enterprise business analyst
Jayesh Jain works in information technology and services and was recently appointed as vice president, membership, for IIBA’s (International Institute of Business Analysts) New Zealand chapter.