As web publishers we're obsessed with analytics. Any strategy that moves the needle gets our attention. So you can imagine my surprise when a new way of writing articles blew up my assumptions about how to drive traffic.
Before I go on—looking at analytics is tricky. Because we have imperfect information about our audience, it can be difficult to translate aggregate reader behavior into real insights. We may see trends or correlations in the data, but they do not necessarily imply causation, and trying to analyze traffic too closely can coax editors and writers into an unhealthy diet of red herrings.
Sometimes, though, a trend is so big and consistent that it's impossible not to engage in a little educated speculation. The last month of our traffic at FastCo.Labs has presented us with one such scenario.
In mid-April, we went live with a half dozen articles which we call "stubs." The idea here is to plant a flag in a story right away with a short post—a "stub"—and then build the article as the story develops over time, rather than just cranking out short, discrete posts every time something new breaks. One of our writers refers to this aptly as a "slow live blog."
Stub stories work like this: You write the first installment like any other story. But when more news breaks, you go back to the article, insert an update at the top, and change the headline and subheadline (known in our business as the "hed" and "dek") to reflect the update. Our system updates the story "slug" when the headline changes—check the URL of this story, and you'll see words from the headline in the URL: /this-is-what-happens-when-publishers-invest-in-long-stories. But the number preceding the slug—on this article, it's 3009577—is a unique node ID which never changes. So every time we update an article, we get a fresh URL and a fresh headline, both pointing back to the same (newly updated) article: many URLs and many headlines leading to one big, multi-faceted story. We called these "Tracking" stories.
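The routing scheme described above can be sketched in a few lines. This is a hypothetical illustration, not our actual CMS code: the point is that lookup keys off the permanent numeric node ID alone, so the slug can change with every headline update without breaking old links.

```python
import re

# Hypothetical article store keyed by the permanent node ID.
ARTICLES = {
    3009577: {"headline": "This Is What Happens When Publishers Invest In Long Stories"},
}

def resolve(path):
    """Resolve a URL path of the form /<node_id>/<slug>.

    Only the leading numeric node ID matters; the slug portion is
    ignored, so URLs minted under old headlines still resolve to the
    current version of the article.
    """
    m = re.match(r"/(\d+)", path)
    if not m:
        return None
    return ARTICLES.get(int(m.group(1)))
```

Any slug after the ID resolves to the same story, which is what lets every update effectively mint a fresh shareable URL.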
Before we dive into the results of this little experiment, I'll explain the origin of the hypothesis. Our top editors had long felt that the discrete article format was insufficient for covering really big, unwieldy topics like the death of the file system, or the frustrating lack of women in software, or how to think like an engineer. When we launched Co.Labs, it was the natural proving ground for the concept. We hoped the "slow live blog" approach would give us more flexibility and speed in writing and producing news. Instead of starting a fresh article every time we wanted to cover something on a regular beat—which might require a long catch-up introduction, context, background, and so forth—we could simply put fresh news at the top and let readers scroll down through previous updates if they hadn't been following the story.
The stub theory accounted for handling shorter news posts, but how did longer reported pieces fit in? Our strategy was to still produce feature stories as discrete articles, but then to tie them back to the stub article with lots of prominent links, again taking advantage of the storyline and context we had built up there, making our feature stories sharper and less full of catch-up material. We use big headings to call out the connection, like so:
Our interview with a former Frog Design strategist about what's wrong with today's wearable computing devices, for example, branched off the original stub, where people can read back through news around the topic and ultimately see why we decided that interview was apropos. Providing easy access to this sort of context isn't just reader-friendly; it's also transparent—it gives the reader insight into why we cover the stories we do, and why they're timely.
Soon we realized there were all sorts of opportunities to try this format. Other stubs we started: how to price your software product; the slow growth of the Internet of things; the rise of bitcoin as a legitimate currency; the hype around big data; and even broader topics like the future of the user interface. We even thought the stub format might be useful for more introspective topics like how software culture is changing news.
We prepared about a half-dozen of these stubs in draft and scheduled them to go live the second week of April. As soon as we did, our analytics changed markedly. The first thing that caught my eye was a big drop in daily bounce rate:
Encouraged, we switched to the hourly view, where the change was even more marked:
Whoa! Next we checked out the average visit duration, which reflected similar changes:
Using Google Analytics customizable graphs, we were able to plot visit duration against pages per visit, which stayed stable even as duration jumped:
For fun, we also plotted bounce rate against average time on page, on a daily basis:
And finally, this screenshot tells the full story. Bounce rate vs. average visit duration actually inverted like an X, and has stayed stable since:
Big-time disclaimer here—it's too early to tell how permanent these effects will be, and we can't know for sure that the changes are attributable to these stub articles. But we've racked our brains to think of other factors at work here—some big boost in inbound links? Some external event? A technical change? But after about a month, we've seen these changes stabilize, and we haven't been able to isolate any other contributing factors. We're not saying this is causation, because there's no way to be sure. But it sure as hell looks like it's working.
Last year, on a trip to the Bay Area, I had a conversation on background with an entrepreneur and angel investor about the future of the content business. He was bearish. Insisting that people didn't want to read long stuff online, he told me that we'd be wiser to find ways to express ourselves in as few words as possible—going so far as to suggest that, someday, we would all communicate primarily via emoji-like symbols. (I'm not kidding.) Incredulous, I limped back to New York worried I might be working in a terminally ill business.
But some of the brighter minds in media don't think that's the case. In fact, we're not the only organization betting on long-form quality. Here's Vox Media CEO Jim Bankoff talking at TechCrunch Disrupt on May 2, 2013 (emphasis mine):
We know some things as a fact. Globally there is a $250 billion advertising market of which 70 percent is really built on brand building… the top of the funnel, to use the marketing jargon. If you look at the web, which is a $25 billion slice of that pie, 80 percent of it is direct response—it's search… it's bottom of the funnel stuff. So there's a big market opportunity there that hasn't been captured. Where is all the brand building going [...] that we had seen previously in magazines and newspapers and even in broadcast going to go, as consumers turn their attention to digital media? We believe there's a big opportunity there, but someone has to actually go after it—someone has to bring the quality back.
These stories have presented us with all sorts of technical challenges. Over the last month, as these stories accreted updates, they've gotten so long that they actually overran our server cache by a factor of 10 and took down our site for a short time; the cache was built to handle articles around 10 kilobytes, not 100 kilobytes.
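The failure mode above is easy to see in miniature. The class and numbers below are hypothetical—a toy sketch, not our actual caching layer—but they show how a cache built around a fixed per-entry ceiling breaks once articles grow tenfold: oversized pages can't be cached at all, so every request for them falls through to the origin servers.

```python
class PageCache:
    """Toy page cache with a per-entry size ceiling.

    The 10 KB default mirrors the assumption described above; real
    caching systems and limits will differ.
    """

    def __init__(self, max_entry_bytes=10 * 1024):
        self.max_entry_bytes = max_entry_bytes
        self._store = {}

    def put(self, key, html):
        if len(html.encode("utf-8")) > self.max_entry_bytes:
            # An article that outgrows the ceiling never gets cached,
            # so every request for it hits the origin servers.
            raise ValueError("entry too large for cache: %s" % key)
        self._store[key] = html

    def get(self, key):
        return self._store.get(key)
```

A 5 KB article caches fine; a 100 KB Tracking story is rejected, and under heavy traffic those uncached requests are what can take a site down.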
Another issue is that our CMS interface, on the back end, isn't really built for this sort of story production. We have to interlink our features and stubs manually, which drives up production time (and therefore cost). There is also far more opportunity for human error, because updating a stub means manually inserting new text at the top of the body and cutting and pasting older updates "below the fold," like so:
It's a tedious process for our web producer, and we're working on ways to alter the article format and the composer UI to be more hospitable to this sort of reporting. Ultimately, automation will cure all these small ills, and we'll be (hopefully) left with just the upside.
It's been about nine months since we did this experiment, and I still get questions about what we ultimately learned from it. I'll try to summarize the lessons here, but feel free to ask me follow-up questions on Twitter @chrisdannen.
Overall, we learned a lot from this experiment—mostly in the way of instructive mistakes. But we also learned something about how to incentivize our writers, and how readers interpret "signals" of quality, and while we don't do these Tracking posts anymore, parts of their DNA live on. Still, I'll organize this summary by mistakes, since that feels most logical.
Our alleged drop in bounce rate was an artifact of an event-tracking adjustment one of our developers made, coincidentally, the same week we started this experiment. Naively, I had not thought to ask the developers, when planning the experiment, whether any tickets in their queue touched our analytics. Lesson learned: always ask the dev team what they're working on before running an experiment.
Our rise in engagement was real, but I made another big mistake here. We were still a fairly new site, and as I was launching this experiment, I was also hiring new writers. I hired them as I found them, rolling admissions-style.
But that meant that in the background of our Tracking story experiment, there was a slow but steady rise in high quality sub-2000-word pieces which were helping buttress engagement. So while we can assume that some of our increase in visits and engagement came from the Tracking posts, some of it may also be owed to a general rising tide of coverage (both quality and quantity) on the site.
Author motivation was a big part of the impetus for these stories. Contributing to a Tracking story required writers to keep tight focus on certain stories, which ended up being advantageous when breaks in that story begged for new feature articles. The writers didn't have to "catch up" on a story or validate the pitch nearly as much when they had been maintaining a Tracking post on a given topic.
In practice, we could get many of the same benefits by freeing writers from other busywork so they can spend more time on Twitter, Reddit, and Hacker News figuring out what people are going to be talking about tomorrow. I like the idea of giving writers license to be experts on a certain story, but I'd like to find a way to incentivize them toward this goal without the busywork of these Tracking posts.
"Quality versus length" was a topic some commenters broached after reading this piece. I actually think this discussion is moot, because as the top editor of a site, my "quality" heuristics don't change (indeed, probably can't change) from writer to writer or post to post. In fact, the sole job of a top editor (if any) is to do overall quality assurance on each story that publishes.
That said, Tracking stories do exploit a mistaken correlation in our readers' brains: namely that length implies quality. More than anything, I was interested in Tracking stories as an indicator of depth and expertise—quality by another name. Since we varied the formatting of these stories, this "quality" signal was only successful some of the time.
When we formatted stories with dated, reverse-chronological entries, they felt more like long live blogs, which don't imply quality. More substantive entries with less exact dates ("week of" instead of day and time) gave these stories more of a mini-blog feel, which (to me) suggests a higher-quality edit.
When we set out on this experiment, we should have created some KPIs to determine whether these stories were a success. But more accurately, we should have focused on another question: if these stories are successful, for whom do they work?
In other words: which writers are most efficient at bringing in visitors for every given word they "spend" on the page? When you're setting out to do "long" posts, the goal is obviously to make the added words and depth return more visits. Puzzling over the results of this experiment has led me to experimenting with cost per visit, cost per word, visits per word and other author metrics that may reveal who our winningest writers really are—regardless of sheer pageview counts.
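A metric like visits-per-word is simple to compute once you have per-post data. The field names and figures below are hypothetical, but the sketch shows the basic aggregation: total each author's words and visits across their posts, then divide.

```python
def author_efficiency(posts):
    """Compute visits-per-word for each author.

    `posts` is a list of dicts with hypothetical fields:
    "author", "words" (word count), and "visits" (visit count).
    Returns {author: visits / words} aggregated over all their posts.
    """
    totals = {}
    for post in posts:
        t = totals.setdefault(post["author"], {"words": 0, "visits": 0})
        t["words"] += post["words"]
        t["visits"] += post["visits"]
    return {author: t["visits"] / t["words"] for author, t in totals.items()}
```

By this measure, a writer whose 500-word post draws 1,000 visits "outscores" one whose 3,000 words draw 1,500—exactly the kind of comparison raw pageview counts hide.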
Unless you assign them strictly to one or two writers, Tracking stories quickly begin to suffer from a "cooks in the kitchen" effect. Since the topics we chose were often big and broad enough to engage several writers' interest, writers frequently shared ownership of them, drifting in and out with the occasional update. That balkanized the voice and required an extra step every time I found a piece of relevant news to contribute: which writer should grab this hot potato?
Tracking our experiments in one central location is a project I haven't yet figured out how to execute. We run two or three publishing experiments at any given time, and I still don't know where they should live. People following these experiments should have a centralized dashboard to see ongoing data or results of past experiments, but building something like that from scratch is too time- and cost-intensive. If anyone has any suggestions, let me know.
[Bouncing Tennis Ball: MarcelClemens via Shutterstock]