The inherent flaws in big data analysis
Business intelligence and big data programs are being hailed as a fast way to provide the insights that underpin business decisions, as well as planning and performance.
Many organisations will be aware that enterprise data warehouses or big data platforms such as Hadoop and DFS hold all the data. Intelligence tools then draw on that data to create visualisations, dashboards, reports, and drilldowns.
Data integration technologies like ETL also move data from organisations' systems of record into a data warehouse or big data platform.
But there is a major flaw inherent in this approach, according to analytics software provider TIBCO: the rationale behind managing enterprise dimensions, attributes and hierarchies is often neglected, and ignoring any of these can result in erroneous analysis.
Take enterprise dimension management, for example. One of the most common challenges is how to deal with dimensions that don't conform, generally because underlying systems of record define or use dimensions in different ways; or the common dimensions are missing necessary classifications and attributes.
Say that sales and marketing want to understand how effective an organisation's industry-focused marketing programs have been at generating new opportunities. Analysts need to be able to connect the industry-coded events to prospects and closed deals.
All those dimensions may have an industry attribute, but marketing is using a public standard such as Standard Industrial Classification (SIC) codes, while sales is using an internally created industry classification scheme.
Data governance enthusiasts may advocate for standardisation, but it's not always so clear-cut.
TIBCO explains that the marketing team may be using SIC because that's how all their vendors segment audiences, while the sales team may have decided to define their own classification because SIC was too complex.
Those kinds of issues create uncertainties about systems of record and whether they provide all of the attributes an organisation requires.
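One common way to reconcile two schemes like these without forcing either team to change is a crosswalk that maps each scheme to a shared reporting category. The following is a minimal Python sketch of that idea, not TIBCO's implementation. The field names (`sic`, `segment`), the internal sales labels, the shared categories and the choice of example SIC codes are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical crosswalk: marketing's SIC codes -> shared reporting category.
# Codes and groupings are example values only.
SIC_TO_COMMON = {
    "6021": "Finance",
    "5411": "Retail",
    "2834": "Healthcare",
}

# Hypothetical crosswalk: sales' internal labels -> shared reporting category.
SALES_TO_COMMON = {
    "FinServ": "Finance",
    "Retail & CPG": "Retail",
    "Pharma": "Healthcare",
}


def conform(records, key, crosswalk):
    """Return copies of the records with a conformed 'industry' attribute.

    Values missing from the crosswalk are labelled 'Unclassified' so that
    gaps stay visible instead of silently dropping out of the analysis.
    """
    return [
        {**record, "industry": crosswalk.get(record[key], "Unclassified")}
        for record in records
    ]


def events_and_deals_by_industry(events, deals):
    """Count marketing events and closed deals per conformed industry."""
    event_counts = Counter(e["industry"] for e in events)
    deal_counts = Counter(d["industry"] for d in deals)
    industries = sorted(set(event_counts) | set(deal_counts))
    return {ind: (event_counts[ind], deal_counts[ind]) for ind in industries}


# Made-up sample data for illustration.
events = conform(
    [{"id": "E1", "sic": "6021"}, {"id": "E2", "sic": "5411"}, {"id": "E3", "sic": "9999"}],
    key="sic",
    crosswalk=SIC_TO_COMMON,
)
deals = conform(
    [{"id": "D1", "segment": "FinServ"}, {"id": "D2", "segment": "FinServ"}],
    key="segment",
    crosswalk=SALES_TO_COMMON,
)
summary = events_and_deals_by_industry(events, deals)
```

Both teams keep their own codes, and the analysis joins on the conformed `industry` attribute. The 'Unclassified' bucket shows where the crosswalk still needs attention.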
"It may seem like the simplest solution is to extend and manage attributes in the primary system of record, but that won't work for your users," explains TIBCO. "Depending on the context, they may need different attribute values altogether.
"Attributes such as markets, segments, and sub-segments reflect a product's current segmentation, at least in a performance context. But if brand management is involved in planning and analysis, those attributes may hold different values.
"This illustrates the need for a separate system that business users can utilise to enrich dimensions before they're loaded into the EDW or big data store. One key requirement will be a mechanism to support the planning and analysis contexts with not just new attributes, but multiple alternate attribute value sets and temporary extensions."
To handle dimension and hierarchy management, TIBCO recommends that organisations look for the following features in a solution:
- Tools to define dimensions, attributes and hierarchies
  - Support for all types: derived, explicit, balanced, and unbalanced
  - Services to balance and link hierarchies
  - Versioning to manage past, present, and future dimensions and hierarchies
  - Inheritance to manage alternate hierarchies
- User interfaces to manage dimensions, attributes, and hierarchies
  - No-code, browser-based UIs
  - Built-in full text and fuzzy searching and filtering
  - Perspectives and custom layout to configure role-specific UIs
- Services for integration and distribution to downstream systems, ad-hoc analysis tools, data warehouses, and big data platforms
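As an illustration of the balanced and unbalanced hierarchy types in the list above: a hierarchy is balanced when every leaf sits at the same depth, and one common way to balance a ragged hierarchy is to insert placeholder levels above the shallow leaves. The Python sketch below assumes a simple tree stored as a parent-to-children mapping with unique node names. The function names, the "(pad n)" placeholder naming and the geography data are made up, and this is not necessarily how any vendor's balancing service works.

```python
def leaf_depths(children, root):
    """Map each leaf to its depth below the root (the root is depth 0)."""
    depths = {}
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        kids = children.get(node, [])
        if not kids:
            depths[node] = depth
        for kid in kids:
            stack.append((kid, depth + 1))
    return depths


def is_balanced(children, root):
    """A hierarchy is balanced when all leaves share the same depth."""
    return len(set(leaf_depths(children, root).values())) <= 1


def balance(children, root):
    """Return a new mapping with placeholder levels above shallow leaves."""
    depths = leaf_depths(children, root)
    target = max(depths.values())
    parent_of = {kid: parent for parent, kids in children.items() for kid in kids}
    balanced = {parent: list(kids) for parent, kids in children.items()}
    for leaf, depth in depths.items():
        if depth == target:
            continue
        # Build a chain of placeholders long enough to reach the target depth.
        chain = [f"{leaf} (pad {i})" for i in range(1, target - depth + 1)]
        siblings = balanced[parent_of[leaf]]
        siblings[siblings.index(leaf)] = chain[0]
        for upper, lower in zip(chain, chain[1:]):
            balanced[upper] = [lower]
        balanced[chain[-1]] = [leaf]
    return balanced


# Made-up ragged geography: leaves sit at depths 1, 2 and 3.
geography = {
    "World": ["EMEA", "APAC"],
    "EMEA": ["UK", "Germany"],
    "UK": ["London"],
}
balanced_geography = balance(geography, "World")
```

Padding keeps every original member in place and leaves the input mapping unchanged, which suits reporting tools that expect a fixed number of levels. Whether placeholders are acceptable to business users is a separate design decision.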
Software such as TIBCO's EBX is able to help organisations manage dimensions and hierarchies, without the headaches of erroneous analytics.
TIBCO EBX also offers more capabilities, including collaborative workflow to support governance and change management; a data quality engine; fine-grained security and permissions; and business rules to enforce controls and validation.
To learn more about enterprise dimension and hierarchy challenges, TIBCO offers a free whitepaper that explains common challenges such as those outlined above, as well as possible ways to solve them.