It was the worm that turned Microsoft. In early July 2001, a malicious piece of computer code squirmed into a Web server running on Microsoft’s Internet Information Server software. It quickly propagated across the Internet and into at least a quarter million other servers, knocking an untold number of Web sites offline. The computer worm, dubbed Code Red, slowed Internet traffic to a crawl and cost companies billions in fixes and lost productivity. Some computer-security analysts called it the most damaging Web-server virus ever. As it turned out, its symptoms were fleeting; within a month, the fever had broken. But because it had exploited a known flaw in Microsoft’s Internet server program, Code Red caused the Redmond, Washington, giant to turn red with embarrassment, especially when it was revealed that Microsoft’s own MSN Hotmail servers were also infected.
Code Red proved to be a defining moment for Microsoft. Coupled with other quality concerns and computer viruses, it prompted some industry analysts to renew their periodic warnings against using newly released (and therefore risky) Microsoft products. Those warnings came at a time when Microsoft was going head to head with IBM and Oracle in the rich server-software market for large corporations, where security and reliability are deal-breaking priorities. “Customers were concerned whether Microsoft products could be trusted to be available whenever they needed them,” says Michael Cherry of Directions on Microsoft, a firm that analyzes the company’s products and strategies. “These trust-related issues threatened to damage Microsoft’s OS and Office business, hamper the company’s drive to increase server-product sales, and impede its expansion into consumer services.”
The worries were crystallized in a January 2002 memo that chairman and chief software architect Bill Gates emailed to Microsoft’s 50,000 employees. Gates declared that a companywide initiative, Trustworthy Computing, would unfold over the next 10 years. The goal: to make computing as reliable, dependable, and secure as a phone’s dial tone. The days of rushing out imperfect software were over. Cool features and tight deadlines would no longer drive product rollouts; quality and security were now the top priorities. “Flaws in a single Microsoft product . . . not only affect the quality of our platform and services overall, but also our customers’ view of us as a company,” Gates wrote. “We can and must do better.”
Better late than never, Microsoft is discovering that we are living in a quality economy, where companies are pouring vast amounts of craftsmanship into their wares. As it seeks to sign up large corporations for its server software and shake the bad publicity that comes with chronic attacks from computer viruses, Microsoft has concluded that its decades-long practice of putting out “good enough” software is no longer good enough.
This fall, Microsoft is rolling out Exchange Server 2003 and Office 2003. The new and supposedly improved versions of two of its biggest workhorse products will be an acid test of whether Gates can make good on his vow to develop dependable code — and of whether his quest for quality is for real. “The joke inside Microsoft has been that quality is job 1.1 — the real bugs don’t get fixed until after the release, when the service patches come out,” says Greg DeMichillie, a Directions on Microsoft analyst who spent nine years at the software maker. “It’s not yet clear that quality really is a top priority across all of Microsoft’s lines of business.”
The results so far are indeed mixed. This past January, Microsoft suffered a major setback in its quality effort when Slammer, another pernicious computer worm, attacked a vulnerability in Microsoft’s SQL Server database software and spread through network connections, crashing more than 100,000 databases around the world and shutting down at least 15,000 ATMs in the United States. And in mid-August, thousands of computers were infected by the Blaster worm, which exploited a hole in Windows operating systems. To skeptics, Slammer and Blaster prove that Gates’s talk of improving the security and stability of Microsoft’s products amounts to little more than a sales pitch.
Not long ago, I traveled to Redmond to see for myself. I met with CIO Rick Devenuti and key leaders on the Exchange and Office teams to get a closeup view of how they’re grappling with Gates’s new mandate. I found that Microsoft’s quality effort is for real, and that it rests on four simple propositions. But be forewarned: This initiative is still very much in development. Call it quality, first release. Patches will arrive over the next decade.
The Process Is in the Plan
For most of its history, Microsoft wanted to be the anti-IBM — a nimble, creative place where programmers could have a neat idea one day and get it into the product the next, without cumbersome approvals and documentation. But there was a problem with that loose-limbed approach. Says DeMichillie: “As its products grew more complicated, Microsoft discovered that even though IBM had its failings, its processes were there for a reason — to ensure predictability, stability, and quality.”
Microsoft’s development teams are now adopting some of the IBM-like processes that they once abhorred. Case in point: the product blueprint — detailed design documents that programmers now work from. The blueprint for Office 2003 consists of roughly 4,000 product specs, each of which amounts to a 30-to-50-page document that describes in rigorous detail a spec’s features, how it works, and how it must be tested. In theory at least, such rock-solid product design results in cleaner code and fewer bugs. “The blueprint keeps us from getting random,” says Eric Levine, group program manager and an 11-year veteran of the Office team. “We create off of it, we code off of it, we cool down, and we test. That’s the process.”
But it’s still unclear whether developers rigorously stick to the plan. On the one hand, in the latter stages of Exchange’s development, programmers must file a change request when they want to rejigger the code, and managers meet daily in a war room to make their calls on the requests — just as at IBM. A deep database underlies Exchange’s staggering 6 million lines of code, and records every single alteration — just as at IBM. But programmers are still encouraged to get creative and take risks. “We give a lot of freedom to our developers for a reason — they’re smart people,” says Betsy Speare, the release manager for Exchange. “We’re willing to make a mistake and catch it, so long as we don’t miss an opportunity.”
In other words, Microsoft is trying to be both rigorous and nimble. “They want to be able to react quickly to competitors and market changes,” says DeMichillie. “So they’re trying to have it both ways. Can they succeed? No one has before.”
Make Your Company Your Customer
Given his combination of business smarts and deep technical knowledge, Bill Gates is uniquely positioned to run a hard-hitting product critique. And he does so in something called a Bill Review — a two-to-three-hour meeting with the leaders of a product team in which he weighs in on the design, features, and strategy for an upcoming release. Still, the agenda for a Bill Review is at least partially driven by what the team expects will catch his interest, such as the overall product architecture. The blocking and tackling of a launch — testing and tracking bugs — have never been high on Gates’s radar, and a Bill Review can miss potential traps, such as the flaws that were ultimately exploited by Code Red. But his review of Exchange 2003 was different; Gates hammered away at the need to ensure quality and security. As Speare walked out of the meeting, she recalled that Gates’s challenge to the team seemed to hang in the air: “Make it work, and make it great.”
It’s doubtful that anyone takes that mission more seriously than Devenuti, a quietly intense Microsoft lifer. As the boss of Microsoft’s own worldwide IT operations, Devenuti must deal with some of the most demanding, technoliterate people on the planet. IT isn’t Devenuti’s only worry. He also heads up the 3,000-person operations and technology group, which is the first to set up, launch, and weigh in on a new product before it’s shipped to the outside world. All of which makes Devenuti Microsoft’s first adopter — and at times, the most unpopular man in Redmond.
Here’s why: When Exchange 2003, code-named Titanium and still in beta, rolled out of the development and testing lab in September 2002, it rolled right into Devenuti’s shop. His team put the not-ready-for-prime-time software up on a group of servers that were isolated, or “forested,” from the rest of Microsoft’s global network. The wobbly code crashed servers, temporarily shutting off many users’ email. At Microsoft, as elsewhere, tempers go up when servers go down. “People start screaming, ‘What the hell’s wrong with you guys?’ ” says Devenuti.
When new code runs amok, a beta version is pushed back to the lab, where its bugs are zapped, and it’s then redeployed to an ever widening number of servers. The process is called “dog fooding” — as in eating your own dog food. Before Exchange is shipped to more than 100 million users, Microsoft will use it on itself. Says Devenuti: “Microsoft’s employees are Microsoft’s first and best customers.”
Other high-tech companies dog-food their wares, but it’s unlikely that any company does it on Microsoft’s scale. “Right now, I’ve got 50,000 people using the next version of Office, which lets us say to customers that we’ve deployed it and we’ve lived on it,” Devenuti says. “Dog-fooding gives the product a level of proof and validity that testing alone could never attain.”
Dog-fooding isn’t a perfect solution, however. After all, the dog food is consumed in a homogeneous environment at Microsoft. Each piece of software is up-to-date and compatible with every other piece of software, which is rarely the case in the real world. And when Exchange 2003 crashes one of Microsoft’s mail servers, Devenuti can summon an Exchange developer to debug it. You can’t.
Triage Your Failures
Do we ship, or do we slip? It’s a question that preys on every product manager as the coding approaches its final months of development and testing and the do-or-die deadline looms large. What causes a product release to slip? In a word, bugs. It’s simple probability: Bugs will inevitably infest a product like Exchange, with its 6 million lines of code. During the final stretch, when Exchange 2003 was being readied for shipment to manufacturers and was supposedly at its most stable, testers were fixing roughly 500 bugs a month. DeMichillie reports that during a product’s three-year development cycle, “tens of thousands” of bugs are typically discovered in testing and deployment.
Not all bugs are created equal. When a bug is discovered, it is dropped into a database called Product Studio (previously known by the more colorful moniker “RAID”), which enables program managers to track it and ensure that it’s routed back to the appropriate developer. Once it’s entered into the database, a bug is sorted and evaluated. Bugs that have a high likelihood of crashing the product are given the highest priority rating.
The goal is to zap every single “priority one” bug and kill off as many low-priority bugs as possible. But the bug count will never hit zero. Not every bug that’s found can be fixed because there’s not enough time. “As the ship date looms, you’ve got to stop making changes and let the product settle down,” says DeMichillie. “The pressure builds to postpone bugs. You start to tell yourself, ‘This bug’s not that bad. We’ll get it the next time.’ ”
The Data Does the Talking
On June 30, the 500 members of the Exchange development team threw themselves a party in the parking lot behind Building 34 on the Redmond campus. Exchange 2003 was out of the building and on its way to the manufacturers, the culmination of a three-year effort. Gates dropped by and thanked people for doing a great job; the pervasive feeling was one of exhilaration, not exhaustion. “We can say with tremendous confidence that this is the highest-quality release of Exchange to date,” says Missy Stern, a product manager for the Exchange marketing team. “The reliability of this product is unparalleled. We’re feeling extremely good about it.”
Stern is staking her claim on the data. Microsoft had over 50 partner companies run Exchange 2003, deploying the software on 170,000 workstations over five-week periods. Within Microsoft, Exchange nailed the “three nines” before it was shipped: 99.95% uptime for six weeks, running on 100% of Microsoft’s mail servers. The company ran weekly surveys of its dog-fooders, the first time it had polled users during development. The operations and technology group continued testing right up until the code was released to manufacturing. As far as Microsoft is concerned, Exchange 2003 has passed its quality test.
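To put that uptime figure in concrete terms, here is a rough back-of-the-envelope sketch. It assumes the 99.95% is measured against wall-clock time over the full six weeks, which the figures above do not spell out. The function name and its parameters are illustrative only and do not come from Microsoft. By convention, “three nines” means 99.9% availability, so that level is shown for comparison.

```python
def allowed_downtime_minutes(availability_pct: float, weeks: float) -> float:
    """Minutes of downtime permitted at a given availability over a period.

    Assumes availability is measured against total wall-clock time.
    """
    total_minutes = weeks * 7 * 24 * 60  # weeks -> days -> hours -> minutes
    return total_minutes * (1 - availability_pct / 100)


# Six weeks at the reported 99.95%: roughly half an hour of downtime in total.
print(round(allowed_downtime_minutes(99.95, 6), 2))

# Six weeks at a conventional "three nines" (99.9%): roughly an hour.
print(round(allowed_downtime_minutes(99.9, 6), 2))
```

Under those assumptions, the reported number allows about 30 minutes of total outage across the six-week window, about half of what a plain 99.9% target would permit.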
Or has it? Just 16 days after the release party, Microsoft issued a patch for a critical flaw that was discovered in Exchange 2003. Windows XP, 2000, and NT 4 were also affected, which led analysts to conclude that the squirrelly code went undetected for years, despite the most intensive round of testing in Microsoft’s history. Then came mid-August’s Blaster, another computer pox that attacked a different vulnerability in Windows and proceeded to slow worldwide Internet traffic. When security sleuths dug into the code, they discovered a taunt: “billy gates why do you make this possible? Stop making money and fix your software!”
Taken together, the flaws show that “Microsoft’s marketing claims might be setting up false expectations in people,” says DeMichillie. “There’s no silver bullet for eliminating every mistake, and Microsoft’s developers would be the first to admit this.” Still, there are signs that Microsoft’s drive for quality, even at the expense of deadlines, is in earnest. In late April, the company said it would delay the launch of Office 2003, originally set for this past summer. It now comes out October 21. The reason: The folks in Redmond had taken an extra three months to finish testing the software with users.
Bill Breen is a Fast Company senior writer.