Pouring the Predictive Analytics Foundation

Famed Russian novelist Vladimir Nabokov once said, “There is no science without fancy and no art without fact.” This quote sums up the experience of developing predictive analytics models, which involves equal parts art and science and just a tiny bit of guesswork. Steve Kerho guides brands through the first stage of building predictive analytics models.

Many brands have trouble getting to the first stage of building predictive analytics models. I repeat the old adage to them: “You don’t know where you are going until you know where you have been.” To predict future success in marketing, you need a window into the past combined with a large set of current behavioral data. That window, my friends, comes from accurate historical data on your marketing and media campaigns and their results.

Unfortunately, many brands have difficulty gathering historical data and media metrics, largely because the multiple marketing agencies they work with don’t take the time to organize the data in a useful format. As a result, many brands don’t know how they have spent their marketing budgets over the years or how they have performed against granular objectives. Brand marketing teams need to gather at least three years’ worth of media metrics and performance data in order to build effective predictive models. It is also important to break down the barriers that often exist between the various agencies and the client so that you obtain a holistic data set from all parties.

To be truthful, the data the team initially receives is not going to be perfect. It’s completely okay to make assumptions or to take artistic liberties based on your current data when developing a predictive analytics system. Acknowledge the flaws in your data and work to improve data collection for the future. Don’t let spotty data stop you from getting started.
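One way to acknowledge the flaws in a historical data set is to list its gaps explicitly before any modeling begins. The sketch below is a minimal illustration in Python, assuming monthly reporting and a three-year window. The function names and the sample dates are hypothetical and are not drawn from any real agency feed.

```python
from datetime import date

def month_index(d: date) -> int:
    """Map a date to a running month counter so months can be compared."""
    return d.year * 12 + (d.month - 1)

def missing_months(observed, start, end):
    """Return (year, month) pairs between start and end (inclusive) with no data."""
    have = {month_index(d) for d in observed}
    gaps = []
    for idx in range(month_index(start), month_index(end) + 1):
        if idx not in have:
            gaps.append((idx // 12, idx % 12 + 1))
    return gaps

# Hypothetical example: three years of monthly reports with two months lost.
start, end = date(2008, 1, 1), date(2010, 12, 1)
lost = {(2009, 3), (2009, 8)}
observed = [
    date(y, m, 1)
    for y in (2008, 2009, 2010)
    for m in range(1, 13)
    if (y, m) not in lost
]

gaps = missing_months(observed, start, end)
total_months = month_index(end) - month_index(start) + 1
print(f"{total_months - len(gaps)} of {total_months} months present; missing: {gaps}")
```

A list like this gives the team something concrete to chase with each agency, and it documents which periods rest on assumptions rather than on reported numbers.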

This seems to be a good time to emphasize how predictive analytics differs from measurement. Since both measurement models and predictive models rely on historical data, many people assume they are essentially the same thing. While they are definitely related, they are more siblings than clones. They may be composed of the same DNA, but their lives have different goals and different drives. It is important to understand these differences before setting up predictive analytics.

First, predictive analytics relies on timeliness. Measurement models, such as media mix models, often present old data from marketing campaigns that were completed over a year ago. Predictive models should be launched before the campaign starts, or no later than one month into an annual campaign, so marketers can use real-time digital behavioral data to interpret results and react before the campaign is over.

Second, predictive models should be straightforward. In measurement models, complexity is key. When we examine the past, we want to account for all of the nuances that occurred in order to get the cleanest read of what took place. We want to remove the impact of the bad press, the industry award, or any other event that may have influenced people’s reception of the brand. That way we get closer to the true impact of the marketing.

In predictive models, we want to focus on things we can plan and forecast. Every assumption you make in order to forecast can alter the accuracy of the prediction. For instance, when you look at social media or anything that has the potential for a huge viral spike, you can learn a lot by looking backwards to see what contributed to success. But it is much more difficult to predict what will catch fire until it does. If you introduce variables into your model, like the tone of social media conversations, then you need to be able to forecast those variables. Overly complex forecasting models can require you to forecast scores of variables before you even forecast the variable of interest.

Consider, for instance, a predictive sales forecast that assumed certain levels of Myspace visits for your brand in 2009. We now know that Myspace visits decreased throughout 2009. For your model to provide reliable forecasts, you would have needed to forecast Myspace’s decline. While that wasn’t an insurmountable task, it adds one more point of potential error to your forecast. In a measurement model that is not a problem, because we know what happened. In predictive modeling we need to stop and ask ourselves: does the value of adding this term outweigh the potential for error?
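The arithmetic behind that question can be sketched in a few lines of Python. This is an illustration only: it assumes a simple linear model with a single input, and the baseline, coefficient, and visit counts are all invented for the example rather than taken from any real brand or from Myspace traffic data.

```python
def forecast_sales(base: float, coef: float, visits: float) -> float:
    """Hypothetical linear model: sales = base + coef * site_visits."""
    return base + coef * visits

def input_driven_error(coef: float, assumed_visits: float, actual_visits: float) -> float:
    """Forecast error caused purely by mis-forecasting the input variable."""
    return coef * (assumed_visits - actual_visits)

# All numbers below are made up for illustration.
base = 1_000_000.0          # sales expected with zero site visits
coef = 0.5                  # incremental sales per visit
assumed_visits = 400_000.0  # what the plan assumed
actual_visits = 280_000.0   # what happened: 30 percent lower

planned = forecast_sales(base, coef, assumed_visits)
realized = forecast_sales(base, coef, actual_visits)
overshoot = input_driven_error(coef, assumed_visits, actual_visits)

print(f"planned {planned:,.0f}, realized {realized:,.0f}, overshoot {overshoot:,.0f}")
```

Even with a perfectly estimated coefficient, the forecast overshoots by the coefficient times the miss on the input. Each extra term in a predictive model carries this kind of exposure, which is the cost to weigh against whatever explanatory value the term adds.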


The best way to prevent this type of inaccuracy is to look at the type of data used within the model. Marketers would be well advised not to rely solely on survey data. Instead, they should look to digital behavioral data, because it is constantly available and shows how customers actually act online, with fewer of the assumptions and biases that surveys introduce. These online behavioral data sets are very often a measure of the total demand that the marketing enterprise is generating, and they add greatly to the accuracy of predictive forecasting models. We will discuss some of the ways to incorporate this data in an upcoming post.

The last point I want you to consider today is that, since change is the only constant, predictive analytics models should be treated as living, breathing entities that need constant care and feeding. Without this care and attention they will simply be outpaced by the current marketplace and will lose their value. Since many brands are steeped in the measurement mindset, they don’t want their numbers to change. If Q3 of 2009 brought in ten million in sales, then that is THE number. The measurement doesn’t change. However, if we predict that we will sell eleven million units in Q2 2011, but unemployment continues to rise in early 2010, contrary to our expectations, then the forecast changes. Eleven million is no longer the number; now nine million is the number. In predictive modeling we no longer have THE number. And that’s OK. Actually it’s better than OK, because our prediction is better than it was before.
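A revision like this is easier to accept when the assumptions behind the forecast are written down and the forecast is simply re-run when they change. The Python sketch below shows the idea under loud assumptions: the linear relationship, the baseline, the sensitivity, and the unemployment rates are all hypothetical values chosen so that the output moves from eleven million to nine million units, mirroring the example above. They are not estimates from real data.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Assumptions:
    """Inputs the forecast depends on (hypothetical, for illustration)."""
    unemployment_pct: float

# Made-up model parameters, not fitted to any real data.
BASELINE_UNITS = 15_000_000.0
UNITS_LOST_PER_POINT = 500_000.0

def forecast_units(a: Assumptions) -> float:
    """Hypothetical linear forecast: units fall as unemployment rises."""
    return BASELINE_UNITS - UNITS_LOST_PER_POINT * a.unemployment_pct

original = forecast_units(Assumptions(unemployment_pct=8.0))
revised = forecast_units(Assumptions(unemployment_pct=12.0))

print(f"original forecast: {original:,.0f} units")
print(f"revised forecast:  {revised:,.0f} units")
```

Keeping the assumptions in one explicit object makes it clear that the model did not break when the number moved. One input changed, and the forecast followed.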

Forecasts constantly change, and they become more accurate as marketers refine their assumptions and grow more comfortable with how predictive models work. Measurement without optimization is pointless, so marketers need to stay on their toes and make sure their data practices do not become stale. Forecasts for 2011 and 2012 will change based on data that is brought in throughout the year, but that does not make them any less valuable. It’s good that your forecasts change. It means you are learning.

Brands need to change their mindsets around shifting forecasts because marketing does not happen in a vacuum. Major economic changes can occur for any number of reasons, including natural disasters, war and fluctuations in the trading markets. The fact that numbers are not constant does not make them any less accurate. Just remember this: continually optimize your brand’s models, refine the assumptions used to forecast outcomes, and trust in your data to succeed in today’s ever-changing marketing world.

Now that we’ve discussed the high-level differences between measurement and prediction, we will get into the nuts and bolts of a predictive model over the next few posts. Some people will tell you that predictive models are nothing more than regression models. While that’s true in one sense, it is also true that War and Peace is just a book. Rather than brushing over the details of predictive modeling, we will tackle issues including linearity, interaction, saturation points, media decay and observations in time, which show how predictive models are more than ‘just regression models’. I will show you how predictive models can be an ever-changing toolkit that adjusts to your business and delivers the insights you care about most.

You will also see that these various statistical treatments, while intimidating to the layperson, are quite manageable with the right team in place. Remember, it takes a village to deliver on the promise of predictive modeling, so don’t get intimidated if things get ‘quant geeky’ for a while. It will all come back to the business insights in the end. So stay tuned, my friends, and before you know it you will be discussing media inflection points AND what they mean to your business.


Steve Kerho is the SVP, Analytics, Media and Marketing Optimization at Organic (


About the author

Steve has over 24 years of agency and client-side experience leading CRM, interactive marketing, sales and media practices for brands including Nissan, Bank of America, Visa and Procter & Gamble, to name a few. In 2011, he was named an Adweek Media All-Star for his innovative work measuring earned and owned media content and developing predictive analytics models to optimize digital ecosystems.