Article by Steve Singer, Talend A/NZ Country Manager
Google has been using machine learning to improve its business analytics for years, and judging from all the recent excitement about the technology, you’d assume enterprises everywhere would be following suit. Yet, according to some studies, only 22 percent of companies are already using machine learning’s analytical power by implementing algorithms in their data management platforms.
So why aren’t more organisations taking advantage of the science that can make them “Google smart”? Probably because they believe it’s too complicated. They might think their data is too voluminous or unreliable, consider the required data preparation too time-intensive, or assume only data scientists have the skill set needed to leverage machine learning. While those concerns may have held true a few years ago, today all of them can be overcome.
The Renaissance of Machine Learning
In the past, using machine learning algorithms was complex, and the outcomes could be perplexing and unpredictable – it was difficult to understand how the technology classified data, so you were never sure what type of results you might get. However, the rapid adoption of the cloud and new technology tools have combined to help simplify machine learning and make it more accessible to a broader base of IT professionals.
Machine learning is data-driven, which means you need a lot of data to make it work. Furthermore, machine learning requires a lot of computational power, particularly to learn models. Fortunately, the cloud is ideally suited to deliver on these requirements and can also play an integral role in simplifying the use of machine learning and making the technology more affordable and manageable.
There is also now a variety of commercially available tools that lower the barrier to entry and reduce the complexity of machine learning, while still working natively with the languages and frameworks the technology uses in the cloud.
For example, recently introduced drag-and-drop components give line-of-business (LOB) developers and power users the tools to complete many of the tasks needed to leverage machine learning’s power without complex coding. These tools help by:
Providing pre-packaged math and statistics ‘routines’, that is, boilerplate code which previously could only have been created by statisticians
Assisting in preparing or “featurising” data (for instance, many statistical models only work with numeric values, so these tools convert strings into numbers with numbering schemes or assigned ranges, etc.), and
Helping to train and validate models (a process that previously required deep knowledge of methodology and math).
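To make the data preparation and validation steps above concrete, here is a minimal Python sketch of what such tools might automate behind the scenes. It is an illustration under stated assumptions, not any vendor's implementation: the function names (`encode_labels`, `bin_value`, `train_validation_split`), the sample values and the range boundaries are all hypothetical.

```python
import random

def encode_labels(values):
    """Numbering scheme: map each distinct string to an integer code,
    assigned in order of first appearance."""
    codes = {}
    for value in values:
        if value not in codes:
            codes[value] = len(codes)
    return [codes[value] for value in values], codes

def bin_value(x, upper_bounds):
    """Assigned ranges: return the index of the range x falls into.
    upper_bounds must be sorted; values at or above the last bound
    go into a final catch-all range."""
    for index, bound in enumerate(upper_bounds):
        if x < bound:
            return index
    return len(upper_bounds)

def train_validation_split(rows, validation_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold back a fraction for validation."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_validation = int(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]

# Hypothetical sample data, made up for illustration only.
regions = ["north", "south", "north", "east"]
ages = [17, 34, 52, 70]

encoded_regions, mapping = encode_labels(regions)      # strings -> integers
age_bands = [bin_value(a, [18, 40, 65]) for a in ages]  # numbers -> ranges

rows = list(zip(encoded_regions, age_bands))
train_rows, validation_rows = train_validation_split(rows)
```

A model would then be fitted on `train_rows` and checked against `validation_rows`, so that its accuracy is measured on data it has not seen during training.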
Machine Learning’s Many Advantages
Of course, while machine learning has a considerable amount of raw power that can now be used more easily for data insights, we are not yet at a point where we can simply plug in data and get instant results. Raw data still needs cleansing to weed out flaws and exceptions, because until trained, machine learning can’t recognise them.
While machine learning technology is still being trained, the most effective way to employ it is to combine it with the human expertise that can recognise data issues and exceptions that machines can’t. This approach ensures that organisations take full advantage of machine learning’s unique prowess: the ability to become smarter over time.
In fact, to gain a true competitive edge with machine learning, it’s essential to put some control into the hands of business users – the people who really know the data and can bring their human insights to bear.
Combining business users’ insights with machine learning will allow companies to quickly leap forward in their analytics. Once they begin the process of making their machine learning smarter, they’ll start to experience exponential gains. Conversely, companies that aren’t employing machine learning will find themselves miles behind those that have invested the human equity.
Once the complexity associated with using machine learning has been stripped away, and companies have deployed resources to train the technology, a much broader base of business and IT professionals can leverage it to analyse larger data volumes and more data types than ever before. Businesses can now more easily employ machine learning for operational insights, to uncover patterns humans can’t find and to look at countless data variables.
Massive volumes of data can now be more easily deployed to explore operational analytics such as:
Which products are likely to be bought together? (Collaborative filtering)
Will an event happen in the future? (Classification)
How much, or what will be the number of…? (Regression)
Who are my gold customers? (Clustering)
What will be the price of this stock in a month? (Gradient boosted tree)
Is fraud occurring? (Decision tree)
Is that image a known intruder? (SVM, supervised learning)
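As one illustration of the questions above, the “gold customers” question can be sketched with a very small clustering routine. The article names clustering only in general terms, so the choice of k-means here is an assumption, as are the function name `kmeans_1d` and the annual spend figures, which are made up for the example.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Tiny k-means for one-dimensional data (illustrative only, needs k >= 2).
    Returns the cluster centres and a cluster label for each value."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centres across the sorted data.
    step = (len(ordered) - 1) / (k - 1)
    centres = [float(ordered[round(i * step)]) for i in range(k)]

    def nearest(value):
        # Index of the centre closest to this value.
        return min(range(k), key=lambda c: abs(value - centres[c]))

    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for value in values:
            groups[nearest(value)].append(value)
        # Move each centre to the mean of its group; keep it if the group is empty.
        new_centres = [
            sum(group) / len(group) if group else centres[i]
            for i, group in enumerate(groups)
        ]
        if new_centres == centres:
            break  # assignments have stopped changing
        centres = new_centres

    labels = [nearest(value) for value in values]
    return centres, labels

# Hypothetical annual spend per customer, made up for illustration only.
annual_spend = [120, 150, 90, 2100, 1900, 130, 2300]

centres, labels = kmeans_1d(annual_spend, k=2)
gold_cluster = centres.index(max(centres))  # the cluster with the highest average spend
gold_customers = [i for i, label in enumerate(labels) if label == gold_cluster]
```

Here `gold_customers` holds the positions of the customers that fall into the higher-spend group. A real deployment would cluster on many more variables than spend alone, which is where the tools described earlier come in.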
Of course, it goes without saying that machine learning is only as good as the data you feed it, so those using it need to create a solid data governance plan. Machine learning can do a great job of automating good decisions, but it can also dramatically increase the negative impact of bad decisions made using inaccurate data. In fact, data can even be used to intentionally infer alternative facts. Establishing guardrails against these potentially negative impacts is a must, particularly in light of growing data privacy concerns around the world.
It’s not as far away as you think
Thanks to advances in technology, infrastructure frameworks, and growing cloud adoption, companies no longer need to rely exclusively on data scientists to reap the benefits of machine learning. They just need the right tools – and then they too can become “Google smart”.