Predictive Analytics Advice

Thoughts for the Day

Created by Eugene Asahara
Created: 01/01/2011
Last Update: 01/09/2011

Overview  

This is an evolving document of mini-blogs that lists quick pieces of advice relating to Predictive Analytics. At least for my contributions, these items reflect things I've realized over the years that helped me wade through the task of asking the toughest thing you can ask of a computer: Help me make better decisions with educated guesses. I still keep these in mind as I trudge through the chaos associated with making sense of open systems. My intent is to have these mini-blogs easily consumable from one place, like those fun books of very short topics you read before falling asleep (ex: The Straight Dope, Cecil Adams). Or maybe just a thought for the day before diving into the day.

Many of these posts will perhaps someday become a full-blown article or blog. Since full-blown blogs take many hours, this forum provides a way to share ideas quickly.

If you have data mining or predictive analytics tips that have helped you through the chaos, please feel free to submit them to me at eugene@softcodedlogic.com, along with a subject line, text of around 250 characters, and optionally your name, a web site and/or other contact information - if you'd like me to add a link to you and properly credit you for the contribution.

The Two Major Modes of Predictive Analytics

Added: 01/02/2011  Last Update: 01/09/2011  Contributed by: Eugene Asahara

I look at PA in the modes of the hunter in us and of the mass-production of the Industrial Age. When we hunt, we must consider our strategy, the tactics we will employ, and our actual execution.

For hunting, we know that the goal is to kill an animal for food and not be killed ourselves. We plan out a strategy that can be concisely phrased as "focus on the young and the old". That means we realize that the young and old will be slower and less able to defend themselves. That is a prediction we make through years of observation. We've recorded in our brains that the young and old gazelles don't seem as fast, even though there are exceptions. Tactically, we need weapons. There are predictions there as well. How do we select our weapons? We may need to predict where the gazelles may be in order to know how many supplies to carry. Operationally, we need to predict whether there are other predators lurking in the bushes who would find us just as tasty and be happy to nab us as well.

Humans execute on these patterns in pretty much everything we do. Everything we do is directed toward a goal and requires planning, preparation, and then execution. There are factors of imperfect information at every step.

The mode of Predictive Analytics for the Industrial Revolution is mass production at a "good enough" level. For example, targeting a direct mail campaign is the mass production of predictions about who is most likely to respond to the ad. The predictions may or may not be better than what humans could do given enough time. But at least the Predictive Analytics application can make thousands or millions of good enough predictions in a very short amount of time. See my slide on low-hanging fruit predictive analytics.
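
As a rough illustration of the mass-production mode, here is a minimal Python sketch that scores a list of customers and keeps the ones likely enough to respond. Everything in it is an assumption made up for illustration: the field names, the weights, the bias and the 0.5 threshold are hypothetical, and a real campaign would take its weights from a trained model rather than hand-set numbers.

```python
import math

# Hypothetical, hand-set weights for illustration only.
# In practice these would come from a model trained on past campaigns.
WEIGHTS = {"past_purchases": 0.8, "months_since_last_order": -0.15}
BIAS = -1.0

def response_probability(customer):
    """Logistic score: a rough estimate that this customer responds."""
    z = BIAS + sum(WEIGHTS[name] * customer[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

def select_targets(customers, threshold=0.5):
    """Return the ids of customers whose score meets the threshold."""
    return [c["id"] for c in customers if response_probability(c) >= threshold]

# Made-up customers; the same loop would run over millions of rows.
customers = [
    {"id": "A", "past_purchases": 5, "months_since_last_order": 2},
    {"id": "B", "past_purchases": 0, "months_since_last_order": 24},
    {"id": "C", "past_purchases": 2, "months_since_last_order": 1},
]
targets = select_targets(customers)
```

No single score here needs to beat a careful human judgment. The value is that the same "good enough" rule can be applied to every customer in seconds.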

PA is Not Just About What We Didn't Already Know

Added: 01/02/2011  Last Update: 01/02/2011  Contributed by: Eugene Asahara

People are often disappointed when Predictive Analytics models only tell us what we already knew. Discovering things we didn't already know is certainly the most compelling aspect of Predictive Analytics. However, a model also measures how much each factor weighs in a prediction, validates what we may only suspect, and tests whether our assumptions still hold true.
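
One simple way to put a number on how much a factor weighs is lift: the response rate among records that have the factor, divided by the overall response rate. The Python sketch below is only an illustration under stated assumptions. The records, the factor names (repeat_customer, got_coupon) and the responded flag are all made up, and lift is just one of many possible measures, not the method of any particular data mining tool.

```python
def lift(records, factor, outcome="responded"):
    """Response rate among records with the factor, divided by the overall rate."""
    with_factor = [r for r in records if r[factor]]
    base_rate = sum(r[outcome] for r in records) / len(records)
    if not with_factor or base_rate == 0:
        raise ValueError("lift is undefined for this data")
    factor_rate = sum(r[outcome] for r in with_factor) / len(with_factor)
    return factor_rate / base_rate

# Made-up records for illustration only.
records = [
    {"repeat_customer": True,  "got_coupon": True,  "responded": True},
    {"repeat_customer": True,  "got_coupon": False, "responded": True},
    {"repeat_customer": True,  "got_coupon": True,  "responded": True},
    {"repeat_customer": True,  "got_coupon": False, "responded": True},
    {"repeat_customer": False, "got_coupon": True,  "responded": False},
    {"repeat_customer": False, "got_coupon": False, "responded": False},
    {"repeat_customer": False, "got_coupon": True,  "responded": False},
    {"repeat_customer": False, "got_coupon": False, "responded": False},
]

repeat_lift = lift(records, "repeat_customer")  # 2.0: repeat customers respond twice as often
coupon_lift = lift(records, "got_coupon")       # 1.0: the coupon makes no difference here
```

In this toy data we may have "already known" that repeat customers respond more often, but the lift tells us by how much. It also shows that an assumption we might have held about coupons does not hold up.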

How Quickly Things Become Impossibly Complex

Added: 01/02/2011  Last Update: 01/02/2011  Contributed by: Eugene Asahara

Many summers ago (1995) as Laurie and I drove a U-Haul with an auto transport trailer from New York to Silicon Valley, we met a truck driver named Klaus at a McDonald's in Lincoln, NE. Laurie and I were standing in line for our Egg McMuffins when I noticed Klaus and we kind of started sizing each other up. I don’t recall how our conversation started, but we somehow recognized each other as fellow judoka. I suppose it’s the stocky build we share.

After a great conversation about judo, he pointed out his rig in the parking lot. It had three trailers like the one in this photo. Having just had the toughest time backing up our U-Haul with its one little trailer, I asked how one backs up such a thing. He said that you don’t. If you’re stuck and need to back up, you have to unhitch the trailers.

It took Laurie and me about three tries totaling half an hour to learn how to back up our "rig". Klaus said it takes anywhere from a couple of hours to a couple of days to learn to back up a real single-trailer rig. With two trailers, a trucker with years of experience and practice can possibly back into an alley. With three trailers, it is just about impossible.

I use this story in practically every talk I give involving complexity. It’s very effective at showing how quickly real life becomes complicated beyond our control. I don’t know how true it really is, but I imagine it’s pretty close to the reality. If you’re still out there, Klaus, there are a couple hundred people that know who you are. For me, this story is a constant reminder of the value of predictive analytics and why the human brain evolved to handle impossible complexity as opposed to perfect execution of set rules.

We're All Natural Data Miners

Added: 01/01/2011  Last Update: 01/01/2011  Contributed by: Eugene Asahara

Predicting is what we all do every instant of the day. Consciously or subconsciously, we're always predicting what comes next. Our brain is constantly recognizing countless things around us and, in massively parallel fashion, executing countless "if this is there and that is there and that other thing is there" rules. What may make Predictive Analytics seem so threatening to most is the notion of dealing with the highly technical domains of databases and statistics.

Don't be intimidated by the highly technical aspects. All Information Workers, any worker who relies on information to make daily decisions, are Natural Data Miners. Unless you're an IT person, there isn't really a need for a deep understanding of databases and statistics.

Here are three examples of everyday implementations of data mining algorithms:

References:

Analysis Services Data Mining Mental Blocks

Be Cognizant of the Goals of the System

Added: 01/01/2011  Last Update: 01/01/2011  Contributed by: Eugene Asahara

Predictive Analytics isn't something we just implement. It assists in solving a problem. It provides "educated" best guesses for pieces of information that help us make a decision with the best chance of being correct.

Be cognizant of the goals throughout the predictive analytics development cycle, but also balance theory and discovery.

Ultimately, predictive analytics should be implemented within a Performance Management solution as I describe in my blog, Bridging Predictive Analytics and Performance Management.

Learn Systems Thinking for Wide-ranging Business Acumen

Added: 01/01/2011  Last Update: 01/01/2011  Contributed by: Eugene Asahara

Business Acumen is one of the three major components of a full-service predictive analytics consultant. This is because the predictive analytics consultant bridges business strategy with the all-encompassing view provided by a foundational BI system by using data mining techniques.

Since predictive analytics is tied to strategy (something that should be intimate to the business), a deep understanding of the market sector is crucial. But how can we be so intimately knowledgeable of so many sectors? The best way is to understand systems thinking. Every single business process, from making coffee to manufacturing the products, is a system. If we understand systems and where they may break (bottlenecks and leaks), we've built a solid foundation from which to understand virtually any business.

Many aspiring predictive analytics consultants will come from the ranks of database folks. For those people, think about the process of troubleshooting query performance. Where are the queues, the bottlenecks? What wasted effort is there (ex: recompiling stored procedures unnecessarily)? We all know at least one system whose workflow and cause and effect are very familiar to us.

References: Systems Thinking and Imperfect Information, The Fifth Discipline.

Predictive Analytics is an Iterative Learning Process

Added: 01/01/2011  Last Update: 01/01/2011  Contributed by: Eugene Asahara

We turn to predictive analytics because we're faced with fuzzy situations where we may not be doing better than taking shots in the dark. With Predictive Analytics we hope to learn about our systems when the level of imperfect information rises above our heads.

This is important to keep in mind during the early stages of a project (envisioning and planning). Because software developers are so used to creating systems that automate well-defined processes, the tendency is towards the need to completely control what happens. In such cases, the project gets stuck in design mode until we've painstakingly figured out what the Predictive Analytics solution was supposed to help us figure out.

The key to building a long-lasting Predictive Analytics solution is to lay a malleable foundation from which we can apply what we learn from iteration to iteration.