Data Mining in the PerformancePoint "MAP" Framework
For the last couple of weeks I've had some fun working on a data mining
engagement. Such engagements are rare for me since the team I'm on focuses on
PerformancePoint Server 2007 (PPS), which at this time does incorporate data
mining in the PPS Planning application, but it certainly isn't a big part of
the overall "MAP" (Monitor, Analyze, Plan) framework. The data mining
functionality of Analysis Services 2005 is still viewed as a fringe (or very
advanced) component of the Microsoft BI stack by many practitioners. That's a
shame since planning involves more than the business-centric forecasting and
trending in the current PPS Planning application.
For those with a strong business and accounting background but not a strong IT
background (such as accountants and especially the MBA types) who hope to
consult on PPS implementations, data mining offers an angle by which they can
provide high-end value. By "high-end value" I mean the sort of skills in the
PPS world that take years to truly master, such as enterprise-class
implementations of SQL Server 2005 Analysis Services (SSAS) and Microsoft Office
SharePoint Server (MOSS) that involve issues beyond what rote "best practices"
white papers cover. (And I facetiously say, "For which you make the big
bucks.") This is as opposed to the relatively easily obtained skills of
building dashboards with the PPS Dashboard Designer or designing Excel BI
reports.
Planning is about developing a strategy to resolve a problem. It is about
attempting to see into the future, which is tougher than analyzing the past.
The Monitoring and Analytics (M&A) pieces of the PPS MAP framework have
existed for a few years in their former incarnations, roughly as the "Business
Scorecard Manager" and ProClarity. The Planning piece is brand new. Planning
(the "P" in MAP) is the toughest piece to tackle from a software developer's
point of view since planning (in the complete sense) is the most complex,
variable activity. Therefore, the capabilities of the current PPS Planning
application are limited in scope to the semi-defined confines of financial
budgeting and financial forecasting.
Keeping an eye out for something that may be wrong (Monitoring) and then
trying to figure out what is wrong (Analyzing) are simpler tasks than trying to
figure out what to do about it (Planning). Figuring out what to do about a
problem goes well beyond the finance-based capabilities of the PPS Planning
application. For example, forecasting the sales of a newly developed product
involves many steps and much insightful thought. Forecasting the sales of an
existing product is feasible with enough history. But I would think a new
product is unique in some way, and what makes it new is based on a unique
theory. One would start by determining the target customer segment using
clustering, then find products that could induce similar behavior using the
association algorithm, and then use that information to infer a sales forecast.
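To make those three steps a little more concrete, here is a minimal sketch in plain Python rather than in Analysis Services itself. Everything in it is an assumption made up for illustration: the customers, the products, the "phone" anchor product, the market size, and especially the final step, which naively treats the attach rate of associated products as a proxy for the new product. It is not how the SSAS algorithms work internally, just the shape of the reasoning.

```python
from collections import Counter

# Hypothetical customers: (age, annual_spend) features and purchased products.
customers = {
    "c1": ((25, 200), {"phone", "case"}),
    "c2": ((27, 220), {"phone", "case", "earbuds"}),
    "c3": ((24, 180), {"phone", "earbuds"}),
    "c4": ((55, 900), {"tv", "soundbar"}),
    "c5": ((58, 950), {"tv", "soundbar", "phone"}),
    "c6": ((60, 880), {"tv"}),
}

def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centers[i])))

def kmeans(points, centers, rounds=10):
    """Plain Lloyd's k-means, starting from caller-supplied centers."""
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
                   for g, c in zip(groups, centers)]
    return centers

def confidences(baskets, anchor):
    """Confidence of each rule anchor -> other, i.e. P(other | anchor)."""
    with_anchor = [b for b in baskets if anchor in b]
    counts = Counter(item for b in with_anchor for item in b if item != anchor)
    return {item: n / len(with_anchor) for item, n in counts.items()}

# Step 1: cluster customers (features are unscaled, so spend dominates here)
# and take the cluster that a hypothetical typical buyer falls into.
points = [features for features, _ in customers.values()]
centers = kmeans(points, centers=[points[0], points[-1]])
target = nearest((26, 210), centers)
segment = sorted(cid for cid, (features, _) in customers.items()
                 if nearest(features, centers) == target)

# Step 2: within that segment, find products associated with the anchor
# product the new product is assumed to accompany.
baskets = [customers[cid][1] for cid in segment]
rules = confidences(baskets, anchor="phone")

# Step 3 (naive assumption): the new product attaches to the anchor at the
# average rate of the associated products, scaled to an assumed market size.
anchor_rate = sum("phone" in b for b in baskets) / len(baskets)
attach_rate = sum(rules.values()) / len(rules)
market_size = 3000  # assumed number of customers resembling the segment
forecast = market_size * anchor_rate * attach_rate

print(segment, rules, round(forecast))
```

The interesting part is not the arithmetic but how many judgment calls hide in it: which features define the segment, which existing product is a fair analog, and whether past attach rates say anything about a product built on a new theory.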
Planning can be for issues within closed or open systems. I bring up this
topic because the engagement I was on involved a forecast in a rather open
system, so I wish to offer this differentiation. Closed systems are much easier
to work with as variables are tightly controlled. Closed systems are finite,
therefore mapping relationships of aspects of the closed system is feasible.
Examples of closed systems include manufacturing plants and most machinery such
as cars. However, closed systems still don't live in a vacuum, so they are
generally encased within systems protecting the consistency of the variables.
Think about all the systems that protect a car's systems from the vagaries of
the outside world giving the closed system the illusion that it is truly closed.
For example, the radiator keeps the engine at a consistent temperature.
Open systems include the weather, environmental impact, and worst of all,
customer behavior (which is the open system at the heart of the engagement I
mentioned). These are systems out in the wild with few or no systems protecting
their mechanisms. Rules revealed through data mining will probably be different
under even slightly different circumstances. Relationships would be virtually
impossible to map and manage as they would be too numerous and complex. (I often
tell this story to illustrate the impossible complexity of the world:
How Quickly Things Become Impossibly Complex)
Now think about how complex the forecasting for a new product would be, since
results ultimately depend on customers actually buying the product. In other
words, it's based on customer behavior. How many factors can you think of that
affect customer behavior? How many steps are required from the release of an
advertisement to the actual purchase?
The big difference a planner will face between open and closed systems is
that planning for open systems involves a high degree of "art". By "art" I mean
that there is a high degree of versatile human intelligence (as opposed to the
deterministic intelligence of most current software) required to decipher a
complex system. It's unfair to use the phrase, "more art than science", in the
contemptuous manner in which I hear it uttered. One must take care to
differentiate art from chaos or superstition.
"Art" is actually the epitome of human intelligence. Art should be thought
of as something non-deterministic that cannot be automated by computer or robot.
Putting Jackson Pollock aside, all works of art (in the usual sense of "art")
require a great deal of technical skill.
Generally there is less art involved with closed systems. Planning within
closed systems can generally follow well-known techniques ... "best practices".
There may be things that are hard to see that data mining can reveal. However,
once the rules are revealed, they change only when the closed system changes,
which is usually in a disciplined, controlled manner.
Planning in open systems, where there is unimaginable complexity and a lack of
control, is where versatile intelligence is required. There are at least four
major categories of skill one must develop in order to plan in open systems:
Systems thinking will probably be the toughest since the concepts are sort of
fuzzy and not deterministic as we've come to expect from the "scientific", best
practices way of thinking. My personal blog site lists my favorite
"non-traditional" performance management books that speak to this topic of
"data mining is planning" and systems thinking: Non-Traditional PM Books.
If I had to pick one, I would pick The Fifth Discipline, by Peter M. Senge.
Ultimately, when the installation and troubleshooting applications for SSAS and
MOSS reach a higher level of maturity (when the tasks can be handled in a rote,
automated manner as a commoditized, outsourceable skill), data-mining-based
planning will be what's left in the MSFT Performance Management world that
depends primarily on the versatility of human intelligence, and it will thus
reward the practitioner as a high-end skill. A good read on this topic is
Super Crunchers.