Data Mining and Predictive Analytics
Consulting Offerings

Created: June 14, 2009; Last Updated: June 14, 2009

Overview

Data Mining and Predictive Analytics are the next steps towards intelligent systems. Until now (June 2009), Business Intelligence systems based on the Microsoft BI stack have focused on OLAP and Performance Management. While browsing OLAP cubes with OLAP browsers such as ProClarity and Excel PivotTables is very useful, it does not go much beyond ad-hoc reporting. The KPIs monitored through Performance Management dashboards and scorecards point to the pain, but say little about what to do about it and how. Data Mining and Predictive Analytics generate statistics-based rules from a mountain of data, from which we can better figure out what to do about the pain.

"Predictive Analytics" is the application of data mining technologies towards the goal of predicting or inferring attributes of things or events related to a business. Such predictions are incorporated into business decisions. The SQL Server Analysis Services (2005 and 2008) data mining features are severely underutilized in the SQL Server customer base. Currently, data mining is employed in a BI implementation in a supporting, fringe, or niche role such as providing forecasts in PPS Planning. However, Data Mining encapsulates the whole point of Business Intelligence (BI) and should play a role at least on par with other BI concepts such as ETL, Data Warehousing, and OLAP.

Successful implementation of Predictive Analytics will go a long way towards fulfilling, if not at times completely fulfilling, the hype around Business Intelligence. Without Predictive Analytics, a BI implementation often devolves into "glorified reporting". The great value of BI is that it is supposed to reveal things about a business in a dynamic environment that aren't already known, or at least to help confirm hard-to-prove suspicions. Achieving that functionality justifies Business Intelligence as an ongoing process, not just a one-time project.

Please see my blog post, "Why isn't Predictive Analytics a Big Thing?", for a deeper discussion of this topic.

Problem Addressed by Predictive Analytics

The incorporation of Predictive Analytics into an existing BI solution will maximize the value of the solution, as it is a key component for elevating the solution to "Strategic Asset" status in BPIO terms. For the sake of argument, consider that if the ultimate purpose of BI is to make better decisions in a timely fashion, the implementation of a data warehouse gets you 75% of the way there: data is integrated, and there is a sufficient history of facts for detecting reliable patterns of various sorts. An OLAP cube can be thought of as a "Data Warehouse Accelerator", getting you about 90% of the way there with snappy, almost real-time responses to queries. Then that last 10%, which really ends up being 90% of the effort, is to discover patterns, distributions, associations, etc. That can be done manually by a highly skilled analyst or semi-automatically with Data Mining.

The incorporation of Predictive Analytics automates much of what a human analyst does manually with an OLAP cube browser such as ProClarity Desktop or Excel PivotTables: identifying patterns, such as correlations between events, and identifying meaningful subsets of a data set. However, in most cases at "enterprise" scale, the amount of data to be examined, even after all that "distillation" down to OLAP cubes, leaves much room for error (false positives) or oversight (false negatives) by human analysts. Examples of basic Predictive Analytics scenarios include, but are certainly not limited to:

Insights gleaned from Predictive Analytics put the BI solution on the business's offensive team, which means it is truly a strategic asset.

However, many will say that currently well-embraced BI technologies such as OLAP and Performance Management are still not fully assimilated into daily business operations and thus the market isn’t ready for the full utilization of Predictive Analytics. This would be especially true in Microsoft’s broad SMB market, where highly distilled knowledge can still be contained within the brains of information workers. There, Data Mining is viewed as advanced and niche.

Counter-intuitively, I say the full utilization of Predictive Analytics (and other applications of Data Mining) is not really an advanced BI concept, but is really the point of Business Intelligence, where ETL, data warehousing, and OLAP are just the setup. Embracing data mining as a normal, not advanced or niche, part of a BI implementation will go a long way towards increasing the accessibility of BI-produced information. It does this by filling in the chasm between the OLAP cube and the combination of OLAP browsers and the human intelligence of the analyst.

Predictive Analytics will drastically reduce the burden on the human intelligence of the analyst, allowing her to focus her high-end analytical training on only the most intricate issues of complex decisions. Or, equally compelling, Predictive Analytics can bring real BI to the masses, providing enough assistance for a larger number of analysts with not-so-high-end analytical training to make high-quality, juicy decisions. This notion is what will bring the term Decision Support System (DSS) back to the forefront of the BI market.

Offering: Data Mining Ramp Up

This workshop could also be offered in combination with other BI-related workshops, such as those for PPS, SSAS, and SSRS.

A sample syllabus would be as follows:

Prerequisites:

Pre-session training:

1 day on an Introduction

2 days on scenario-based, instructor-led labs on data mining

1 day on advanced challenges:

1 "Making it Real" day – a day-long instructor-led lab around a sample project

Offering: Data Mining Proof of Concept

Data Mining and Predictive Analytics are still surprisingly fringe. It's a good idea to dip your toes into the DM/PA world with a real-world proof of concept. Most POCs are tests to ensure a technology is likely to deliver what it claims to deliver. Unlike most POCs, this POC will result in a real application of high value. The proof of concept would focus on well-known and fairly simple techniques such as market basket analysis or targeted direct marketing.
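
To give a feel for how simple the core of market basket analysis can be, here is a minimal sketch in plain Python. It is not the SSAS Association Rules algorithm and not what the POC would deliver; it only illustrates the standard support, confidence, and lift measures for item pairs. The function name pair_rules, the min_support threshold, and the sample baskets are all hypothetical and invented for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.3):
    """Return (lhs, rhs, support, confidence, lift) for item pairs.

    support    = share of baskets containing both items
    confidence = share of baskets with lhs that also contain rhs
    lift       = confidence divided by the overall share of baskets with rhs
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # ignore duplicate items within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # too rare to be worth reporting
        # Each qualifying pair yields two directed rules: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            lift = confidence / (item_counts[rhs] / n)
            rules.append((lhs, rhs, support, confidence, lift))
    # Strongest associations (highest lift) first.
    return sorted(rules, key=lambda r: r[4], reverse=True)


# Hypothetical baskets, invented purely for illustration.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["butter", "milk"],
    ["bread", "butter", "jam"],
]

for lhs, rhs, support, confidence, lift in pair_rules(baskets):
    print(f"{lhs} -> {rhs}: support={support:.2f} "
          f"confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 suggests the two items appear together more often than chance would predict; a lift below 1 suggests the opposite. A real engagement would work on far more data and consider larger item sets, which is where a dedicated data mining engine earns its keep.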

 

Please email Eugene Asahara at eugene@softcodedlogic.com to discuss this workshop further.

 
