Why Startup Founders Shouldn’t Lose (Too Much) Sleep Over Scaling Problems

Reliability is overrated. Focus instead on utility, speed, and aesthetic appeal.

[Image: Flickr user Benson Kua]

Let’s say you run a mobile app startup that people love–millions of users, spectacular growth. Then, one unfortunate weekend, your service goes offline for a couple of days. Your engineers do their best to make sure it never happens again, but you’re growing fast. Is your company doomed?


That was the scenario several prominent startups faced earlier this month, prompting a Wired column arguing that the outages could have been perilous:

“On one level, a dust-up like this is just part of life as a startup. Things go wrong, people get upset, problems are solved, lessons are learned. But the stakes are higher when you’re Dropbox–or any other tech startup that has ascended to the misty heights of the billion-dollar club. This weekend’s Dropbox outage, along with recent problems for Uber and Snapchat, show just how close such companies skate to complete disaster–not because of anything they necessarily did wrong, but because of the very nature of their businesses.”

The argument goes: When you’re dependent on doing one thing well, ceasing to do that one thing well even for a short period will sow mistrust in your users. Should your servers go down at the wrong time, fickle customers will flee en masse to another service that also does that one thing. The cardinal sin is being unreliable.

But let’s look at the prominent technology companies that had reliability problems for years, and whose users stuck with them anyway.

Twitter is the obvious archetype here. A recent feature over at Time chronicles the company’s history of downtime:

In 2007, when Twitter first came to prominence at the South by Southwest interactive festival, the social network became notorious for its downtime. Between server overload and scheduled maintenance, the site was offline for almost six days total that year, according to Pingdom, a service that tracks website performance. Things were not much better in 2008, when the site crashed during Steve Jobs’s keynote at MacWorld in January. Cofounder Biz Stone described the site as being in constant “emergency maintenance mode” at the time. By May the quickly growing site had created a standalone blog just to tell users when Twitter was down.

Remember how popular the Fail Whale was? Incidentally, Twitter actually went down while I was researching this article.

Just a few others:

  • Year after year, the tech world heaps praise upon Evernote. It’s growing and it seems to have a great internal culture. But user forums are rife with complaints about poor sync performance on almost every platform for which Evernote has an app–which is all of them. Still, Evernote remains popular, most likely due to a responsive support team and the fact that the service becomes more useful the more wrapped up in it you become.
  • When Apple Maps first launched in the fall of 2012, it was immediately panned, and for good reason: It was useless at finding places. Over a year later, the application hasn’t shaken its reputation for being unreliable, but it’s managed to justify its continued existence thanks to its useful developer tools.
  • While Rdio’s apps are elegantly designed, users frequently find them to be a buggy mess. For a long time, its Windows Phone app hardly worked, and even desktop clients sometimes struggle. Even in their complaints, users remain fond of the service, though. Perhaps being more aesthetically pleasing than the competition has its benefits.
  • Box is well regarded as a secure alternative to Dropbox or Google Drive in spite of a persistent issue that keeps it from working for lots of high-volume users. The user forums are full of reports of user-permission problems, some of which prevent users from syncing properly at all.

When something is valuable to lots of people and benefits from network effects, users stay locked in despite things like outages. Uber, Dropbox, and Snapchat all qualify. That’s why reliability isn’t the most important thing for consumer services. Enterprise, however–that’s a different game.