The war over our eyeballs is heating up, and that’s mostly a good thing for people who like to look at screens. Just as Meerkat, Periscope, and Snapchat elbow each other in a battle over appointment viewing, and digitally native companies like Netflix and Hulu duke it out over the future of TV and movies, an old-school competitor shows up and throws down a shiny new gauntlet: Last month, HBO Now launched just in time for cord-cutters to catch the debut of the new season of Game of Thrones. There’s never before been such an abundance of quality, readily accessible television fighting for our attention.
YouTube, however, has found itself in this new golden age of video with a serious handicap: While its overall viewership was growing, most of that growth was happening across the web at large, outside of its own site and apps. That made it harder to capture eyes and ad dollars, and to appeal to cherished mobile users. When Susan Wojcicki became CEO last year, she continued efforts to transform the way YouTube served ads and engaged with its creators, pouring money into certain channels and promoting them heavily.
But there was a more pressing, underlying issue for the video giant: The way it built and tested its code was rusty. The outdated pipes running beneath the world’s largest video site were making it trickier to build the kind of new features that could keep users watching more, rather than browsing over to, say, Netflix or Hulu.
“As a business, it’s kind of a vulnerable situation to be in,” says Cristos Goodrow, a director of engineering at YouTube. “We’re not building as strong of a relationship with viewers. We have very little leverage to try to make the experience better for users.”
So in 2012, the company undertook a massive, cross-department initiative to fix those problems. Code-named InnerTube, the project, which Google has not previously discussed, would tackle everything from its development platform to its machine learning algorithms, a retooling that would enable engineers and designers to more quickly test and craft a more engaging, addictive experience on more screens. “This is a change that we had to make if YouTube was going to continue as an important thing on the Internet,” says Goodrow of the ongoing InnerTube project. “We had to become a destination.”
It all began with an infographic. In 2012, Kerry Rodden, a user experience researcher at the company, created a data visualization breaking down YouTube viewership activity that soon started to show up in meetings throughout the company’s headquarters. It showed that rather than turning to YouTube.com or one of the service’s many apps on phones, TVs, and tablets, most of us were watching YouTube videos in other places: in blog posts, news articles, links on Facebook and Twitter, and the like, by way of the millions of YouTube embeds and links woven into the web. Having its user base fractured across the web meant that YouTube’s ability to retain, learn about, and monetize users would be limited.
The importance of YouTube’s ability to earn revenue from its users can hardly be overstated. While it boasts over 1 billion monthly viewers, the still-unprofitable site raked in a mere $4 billion in revenue last year, according to The Wall Street Journal. Compare that to Netflix’s $5.5 billion of 2014 revenue on only 50 million subscribers and it becomes easier to understand why corralling its users into the same place became such a big priority.
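A quick back-of-the-envelope calculation puts those figures on a per-user footing. The sketch below uses only the numbers cited above and treats YouTube's monthly viewers and Netflix's subscribers as comparable head counts, which is a simplifying assumption. The function name is illustrative, and because YouTube's audience is "over 1 billion," its figure is an upper bound.

```python
def revenue_per_user(annual_revenue_usd: float, users: float) -> float:
    """Annual revenue divided by the number of users."""
    return annual_revenue_usd / users


# Figures as cited in the article; user counts are rounded.
youtube = revenue_per_user(4e9, 1e9)       # $4 billion over 1 billion viewers
netflix = revenue_per_user(5.5e9, 50e6)    # $5.5 billion over 50 million subscribers
ratio = netflix / youtube

print(f"YouTube: ${youtube:.2f} per viewer")
print(f"Netflix: ${netflix:.2f} per subscriber")
print(f"Netflix earns about {ratio:.1f}x more per user")
```

By this rough measure, YouTube earns about $4 per viewer a year against roughly $110 per Netflix subscriber.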
Fixing this problem would also help YouTube future-proof itself against a growing list of competitors. With the launch of YouTube Music Key, the site is officially a competitor to Spotify, which is reportedly plotting to move further into YouTube’s territory by adding video to its music streaming service.
The project would begin by piecing YouTube itself back together. The proliferation of new devices, and the accompanying demand for YouTube on all those screens, had advanced more quickly than the company’s software development infrastructure could keep up with. As a result, YouTube’s development process became fractured: Its Xbox app was built on one track, while its iOS app was built on another, and so on.
“Over time, the friction of having these different systems and ways of development was starting to slow us down,” says YouTube engineering director Andrew Berkheimer, referring to the disjointed developer platforms that had Frankensteined their way into existence at YouTube. “So we recognized that was going to be a big problem, especially as mobile is going to become the dominant part of what we do.”
Even worse, and the team seems embarrassed to admit this now, YouTube was unable to collect detailed analytics from user behavior on mobile devices. That meant that if I lay in bed with my phone all night binging on Louis CK clips and chuckling to myself in the darkness, the desktop version of YouTube would have had no idea.
“It’s crazy that we weren’t using information from mobile,” admits Goodrow. “When we finally started doing it, it reinforced the fact that there are a huge number of users who are mobile-only,” he says. As the amount of time U.S. adults spend watching video on their phones continues to increase, a blind spot like this can become a serious competitive liability.
Indeed, by 2012 a majority of YouTube’s viewership was happening on mobile devices. And although its engineers had assembled a fleet of apps for every imaginable screen, the rotting development infrastructure and fragmented process had made it hard to do simple things on mobile, like run A/B tests on users to study new design elements and functions, and to generally measure user behavior.
The new system is designed to undo those technical hurdles. “InnerTube will let us test much more significant changes to the UI in a controlled way,” says Goodrow, who hints at major enhancements that are “in the works.”
Even without dated infrastructure, mobile development has one major disadvantage compared to building for the web: It’s slow.
“Every new version of the app took six weeks, which led to a very long iteration cycle and learning cycle,” says Berkheimer of the old system. This meant that if a new feature wasn’t performing as expected, it would be weeks before they could swap it out. Similarly, any new design ideas and functionality were trapped on that painfully drawn-out development cycle, a headache for engineers used to pushing products to the web, where changes are instantaneous and iteration is constant.
This is another thing InnerTube aimed to fix. Thanks to a new, more flexible API, engineers can now push changes across platforms more easily than before, and make certain updates to apps on the fly without needing to resubmit the app each time. This system also allows changes to be easily undone, which lets developers experiment more freely without fear of having a busted app sitting in an app store for days or even weeks.
“In the old world, you had to be very cautious about what you put out there, because you couldn’t turn it off,” says Berkheimer. “Now we can be bolder about what we put out there, because we have ways to turn it off if we need to. And if it does work, we can just as easily turn it on for everyone as well.”
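The article does not describe how InnerTube implements this on/off capability. One common pattern that behaves the way Berkheimer describes is a server-side feature flag with a percentage rollout. What follows is a minimal sketch of that pattern, not YouTube's actual mechanism: the `FlagStore` class, the flag name, and the hashing scheme are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Flag:
    enabled: bool = False      # master switch: False turns the feature off for everyone
    rollout_percent: int = 0   # share of users (0-100) who see the feature when enabled


def bucket(flag_name: str, user_id: str) -> int:
    """Deterministically map a (flag, user) pair to a bucket from 0 to 99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


class FlagStore:
    """Hypothetical server-side store that apps query instead of hard-coding features."""

    def __init__(self):
        self._flags = {}

    def set(self, name: str, enabled: bool, rollout_percent: int) -> None:
        self._flags[name] = Flag(enabled, rollout_percent)

    def is_on(self, name: str, user_id: str) -> bool:
        flag = self._flags.get(name)
        if flag is None or not flag.enabled:
            return False
        return bucket(name, user_id) < flag.rollout_percent


store = FlagStore()
store.set("new_watch_page", enabled=True, rollout_percent=10)    # cautious launch
store.set("new_watch_page", enabled=True, rollout_percent=100)   # it works: everyone
store.set("new_watch_page", enabled=False, rollout_percent=100)  # it breaks: turn it off
```

Because the decision lives on the server, changing who sees a feature requires no new app release, and hashing the user ID keeps each user in the same bucket from one session to the next.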
From tiny tweaks to major app updates, the mobile development process at YouTube is now much smoother and more efficient than before. Indeed, post-InnerTube, that six-week lag has disappeared and YouTube engineers are able to push new versions of its apps into production within the span of one week.
It’s also easier to build new things now. The YouTube Kids app, which launched in late February, was considerably easier to build once the foundational groundwork was relaid by the InnerTube project.
YouTube Kids takes aim directly at Netflix’s own efforts to capture the attention spans of younger audiences, a key part of carving out future marketshare in streaming video. This is something Netflix started focusing on in earnest nearly four years ago. With its development infrastructure freshly rehabbed, YouTube hopes it will be able to respond to competitors a bit more nimbly.
Why does it matter that I love to watch Louis CK’s bit on getting freaked out when he’s high? Details like these are a crucial piece of the YouTube user experience puzzle: Recommendations.
As part of the InnerTube project, YouTube’s engineers totally overhauled the way its video recommendation engine works. It may seem like a minor detail to the end user, but there’s a lot of technical plumbing behind those little boxes under the “Recommended” heading that’s now so prominent on YouTube’s homepage. The relevance of these videos has enormous value for the company: The better the recommendations, the more addictive the experience. The more time you spend going down video rabbit holes, the more revenue for YouTube and its creators. (It’s also less time you could spend staring into the app of a competitor.)
To perfect these recommendations, YouTube tapped into the Google Deep Learning Project, formerly known as Google Brain. This is the artificial intelligence project that focuses on deep learning and neural networks, an area in which Google (like Facebook) has shown an intense and growing interest (most recently by acquiring an artificial intelligence startup called DeepMind). One day, the Google Deep Learning Project—one of the things that futurist Ray Kurzweil is working on at Google—might replicate or outdo the complexity of the human brain. In the meantime, its neural networks are ideal for things like speech recognition—or finding new videos to watch on YouTube.
“We’re using neural networks that are ten times larger than the neural networks I remember working on 20 years ago,” says Goodrow. “They have thousands of nodes and we’ve trained them on trillions of observations. Because of that, we can have systems that better learn the kinds of things that you’re interested in.”
In this case, the input data is the YouTube activity of millions of users—not just which videos we all watch, but details like how long we watch them, which ones we favorite, and which ones we skip past. It’s a bit like the Amazon-style, collaboratively filtered recommendation engine (“if you liked that, you’ll like this”), but supercharged with artificial intelligence. They also added a button that lets users dismiss recommendations, something that, incredibly, wasn’t there to begin with.
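For readers curious what the "if you liked that, you'll like this" baseline looks like, here is a minimal sketch of item-to-item collaborative filtering. It is plain collaborative filtering, not the neural network approach the article describes, and it is not YouTube's code. The viewing data, the single engagement score per video (standing in for signals like watch time and favorites), and the `dismissed` parameter are all hypothetical.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical data: user -> {video: engagement score between 0 and 1}.
history = {
    "ann": {"v1": 1.0, "v2": 0.8},
    "bob": {"v1": 0.9, "v2": 0.7, "v3": 0.9},
    "cat": {"v2": 0.6, "v3": 1.0},
}


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse vectors keyed by user."""
    dot = sum(a[u] * b[u] for u in set(a) & set(b))
    norm_a = sqrt(sum(x * x for x in a.values()))
    norm_b = sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(user: str, history: dict, dismissed=frozenset(), k: int = 3) -> list:
    # Invert the data so each video is described by the users who engaged with it.
    by_video = defaultdict(dict)
    for u, videos in history.items():
        for video, score in videos.items():
            by_video[video][u] = score

    seen = history.get(user, {})
    scores = defaultdict(float)
    for candidate, audience in by_video.items():
        if candidate in seen or candidate in dismissed:
            continue  # skip what the user already watched or explicitly dismissed
        for video, score in seen.items():
            # Weight similarity by how much the user engaged with the watched video.
            scores[candidate] += score * cosine(by_video[video], audience)
    return sorted(scores, key=scores.get, reverse=True)[:k]


print(recommend("ann", history))                    # suggests the one unseen video
print(recommend("ann", history, dismissed={"v3"}))  # nothing left to suggest
```

A scheme like this can only score videos that already have an audience in the data. It has nothing to say about a brand-new upload.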
The new model is a huge improvement on the old, regression-based algorithm YouTube used for recommendations, Goodrow explains. While that method was good at memorizing relationships between videos, it would struggle in unfamiliar situations. The neural network approach is much better at inferring things about videos that are new, or about video-to-video relationships that otherwise lack historical data. “It does a much better job of predicting what’s important about this user and their watch history that might say they’re going to like this new video,” says Goodrow.
Surprisingly, the YouTube recommendation algorithm doesn’t draw inputs from far beyond the confines of YouTube itself. You might think that mining our Google search histories for clues about what videos we’d like would pay off. Nope, Goodrow says.
“The challenge is that web search history is very very broad.” Just because you Googled for help with your taxes doesn’t mean you want to watch YouTube videos about the ins and outs of U.S. tax law.
To further beef up video discovery, YouTube’s engineers gutted the site’s search functionality and replaced its proprietary video search engine with the same technology that powers Google search. After running some A/B experiments—again, thanks to refinements enabled by InnerTube—the team was able to confirm that, indeed, Google’s core search algorithm was more effective than what YouTube’s (much smaller) team had cobbled together over the years. “It was like magic,” says Goodrow.
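The article does not say which metric or statistical test the team used to compare the two search engines. As an illustration of how an A/B experiment like this can be read, the sketch below applies a standard two-proportion z-test to a hypothetical "search led to a watched video" rate. The function name, the metric, and all the counts are invented for the example.

```python
from math import erfc, sqrt


def two_proportion_z(success_a: int, n_a: int, success_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference between two success rates."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail area of the standard normal
    return z, p_value


# Hypothetical counts: searches that ended in a watched video, out of all searches.
z, p = two_proportion_z(success_a=100, n_a=1000,   # arm A: old search engine
                        success_b=150, n_b=1000)   # arm B: new search engine
print(f"z = {z:.2f}, p = {p:.4f}")
```

A positive z with a small p-value would suggest the new arm's higher success rate is unlikely to be chance alone. A real experiment would also need adequate sample sizes and a metric agreed on in advance.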
All of the things that InnerTube has enabled (faster iteration, improved user testing, mobile user analytics, smarter recommendations, and more robust search) have paid off in a big way. As of early 2015, YouTube is finally becoming a destination: On mobile, 80% of YouTube sessions now originate from within YouTube itself. Even on the desktop, where users have been sprinkling YouTube embeds for more than a decade, 55% of views are happening on YouTube.com. YouTube says it’s seeing 50% year-over-year growth in total viewership, and a majority of those views now originate from within YouTube’s own apps or website.
As eager as the engineering team is to flex their muscles, technology isn’t the only factor at play here. YouTube has been investing millions in original content for years and even if the service has yet to see Netflix-sized success on that front, it’s clearly driving at least some of the new traffic to YouTube. The company has also been spending big ad dollars offline. Anyone who’s taken the New York City subway in the last year has seen the display ads for YouTube’s official partner channels (brands like Vice News and “Epic Rap Battles of History”) plastered everywhere. Or perhaps you’ve seen the billboards in other markets. Or the TV commercial during the World Cup. And in an effort to grow revenue and compete with subscription-based services, YouTube indicated last month it would be launching an ad-free subscription service at some point this year.
Still, however impactful these huge, non-technology-related investments may be, they’d only go so far if the user experience itself wasn’t as polished—and easily adapted—as possible. Now that more than half of YouTube’s rising view counts are happening within the company’s own properties, the site is better positioned to entice people further down those late-night video rabbit holes. Or grab their attention with a new web series they didn’t even know about but that algorithms figure they’d dig.
That isn’t to say that it’s all smooth sailing from here. Have you ever heard people at work chat about a YouTube channel the way they obsess over The Jinx, Game of Thrones, or House of Cards? Better code under the hood won’t change that overnight, but it could help YouTube figure out sooner how to get there.