After spying on myself to find my secrets before the NSA, I thought I would try to apply similar machine learning technology to headlines. Online publishers have long known that a good headline can have a big impact on web traffic, which is why the best editors I know are the ones who pride themselves on their headline writing. But editors are still human, and so the most savvy outlets ask their editors to write multiple headlines and test them against each other, mostly using social networks.
This style of testing works well and can result in significantly higher click-through rates, but it’s still only as good as the editor writing the two headlines to test against each other. I’m more interested in seeing if we can use a statistically driven tool to optimize headlines before we ever publish the article, and conduct A/B testing on top of that optimization. To do this, I’m planning to feed Fast Company’s trove of analytics data to machine learning algorithms to try to predict how many pageviews an article is likely to get based only on its headline.
To be blunt, I don’t know if it’s even possible to do this accurately and easily. I suspect it isn’t. The main problem is that headlines are far from the only thing that influences the number of pageviews an article sees. I suspect that the quality of an article and its newsworthiness have a lot to do with it, too. I also haven’t been able to find other people or companies trying to do this, which isn’t a terribly good sign.
Despite these concerns, I put together a quick model, again using the Google Prediction API, as a proof of concept. I fed the model an entire year’s worth of headlines and associated unique pageviews from Google Analytics, and asked it to fit a regression model to that data. It returned a model with a mean squared error of 7,323,430. Taking the square root gives a typical error of about 2,706 pageviews in either direction, and individual predictions can miss by more than that. That’s utterly useless for us, because it means the error of most predictions will be in the same ballpark as the total number of unique pageviews that some Co.Labs articles are likely to get.
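To make the arithmetic concrete, here is a minimal sketch of how a mean squared error translates into a typical error in pageviews. Only the 7,323,430 figure comes from the model above; the `rmse` helper and the sample numbers passed to it are made up for illustration and are not part of the Google Prediction API.

```python
import math

# Mean squared error reported by the model (from the text above).
mse = 7_323_430

# The root of the MSE is in the same units as the target (pageviews),
# so it reads as a typical, not maximum, prediction error.
typical_error = math.sqrt(mse)
print(round(typical_error))


def rmse(actual, predicted):
    """Root mean squared error between two equal-length lists of numbers."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical pageview counts and predictions, for illustration only.
sample_error = rmse([100, 200], [110, 180])
```

Because the error is averaged over every article, a handful of unusually popular stories can inflate it considerably, which is one more reason raw pageviews are a hard target.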
But merely having something interactive to play with was insightful. Accuracy aside, I found that plugging lots of full headline ideas into a box wasn’t very useful, because you fall into the time-wasting trap of trying to beat yourself by a couple of hits with a new configuration of a very similar headline. Instead, what editors need is something that simply tells them that a headline is good enough (maybe in the top quartile or so), similar to the password strength meter you see when signing up for a new social network. Systems like these do exist, but they’re mostly based on optimizing headlines for search engines.
What’s the next step? Convert my model from regression analysis to Bayesian classification and convert unique pageviews from raw numbers into performance quartiles. I’ll let you know how it goes—and ping me on Twitter @gabestein if you have ideas, rants, or suggestions.
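As a rough sketch of that next step, and not the Google Prediction API itself, the snippet below buckets pageviews into quartiles and trains a tiny word-count naive Bayes classifier on headline text. Everything here is an assumption for illustration: the `quartile_labels` helper, the `NaiveBayes` class, and the headlines and pageview numbers are invented, and a real version would need far more data and better text features.

```python
import math
import statistics
from collections import Counter, defaultdict


def quartile_labels(pageviews):
    """Map each pageview count to a quartile label, 1 (lowest) to 4 (highest)."""
    q1, q2, q3 = statistics.quantiles(pageviews, n=4)
    return [1 + (v > q1) + (v > q2) + (v > q3) for v in pageviews]


class NaiveBayes:
    """Multinomial naive Bayes over headline words, with add-one smoothing."""

    def __init__(self):
        self.class_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, headlines, labels):
        for text, label in zip(headlines, labels):
            self.class_counts[label] += 1
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each word.
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in text.lower().split():
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data, invented for this example.
headlines = [
    "quarterly budget memo", "budget meeting notes",
    "office memo update", "weekly office notes",
    "new phone review", "phone app review",
    "secret robot hack", "robot hack revealed",
]
pageviews = [100, 200, 300, 400, 500, 600, 700, 800]

labels = quartile_labels(pageviews)
model = NaiveBayes()
model.fit(headlines, labels)
prediction = model.predict("robot hack")  # predicted quartile for a new headline
```

Predicting a quartile instead of a raw number fits the "good enough" idea: an editor only needs to know whether a headline lands in the top bucket, not how many hits separate two near-identical variants.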
[Image: Flickr user Garry Knight]