Like most of us, John M. Lervik has a favorite number — but his isn't anything as simple as 3, 7, or 9. From his harborside office in Oslo, Norway, Lervik admits a fondness for the figure 2,112,188,990.
It's not a random number: It's the colossal number of Web pages currently indexed by his search-engine company, FAST (Fast Search & Transfer). It's big enough to nudge FAST ahead of Google's 2,073,418,204, earning FAST the title of the Web's most comprehensive search service. And, if Lervik's devotion to numbers continues, he hopes to use a formula of bigger, fresher, and closer to put FAST ahead of its better-known search-engine rivals.
In the era of the post-dotcom meltdown, it's fashionable to downplay the Web's role in the economy. But with little fanfare, search engines continue to flourish as an online hot spot for several key reasons. First, information proliferation: More data has been produced in the past two years than in the previous 2,000, and the race to catalog it poses an extraordinarily complex, significant, and ultimately rewarding mathematical and linguistic challenge. Second, Web habit: For most daily users of the Web, relying on a search engine as a first step in navigation has become a matter of routine. Third, advertising: Advertisers have warmed to the idea of flashing their brands in ads that accompany relevant search results.
The upshot is a fast-paced race to the future among the best-known search engines. If counting eyeballs is the metric that matters, Google holds the pole position. Its successful combination of powerful technology and common-sense usage pulls in 58 million users a month. Nearly half of all searches globally go through Google, which, according to some sources, has the strongest brand on the Web. And Google enjoys almost fanatical support from its users — all of which should make its lead unassailable. But coming up on the blind side is Lervik's low-profile, aerodynamic Norwegian machine, with a supercharged search engine and a well-tuned business model that may just leave Google in the dust.
"While others have been building pretty cars with shiny wheels and nice paintwork, we've been building the strongest engine," says Lervik, CEO of the $36 million firm he cofounded in 1997 with fellow research students at the Norwegian University of Science and Technology — Norway's version of MIT. It bothers Lervik little that most users have never heard of Alltheweb.com, the not-so-public face of FAST. In fact, letting others put their stickers over his brand is part of Lervik's distinctive strategy: By forging behind-the-scenes partnerships with some of the biggest online players, such as Dell, IBM, and Lycos, Lervik hopes to win more users than Google, without having them know that they're a FAST user.
The FAST Formula
Lervik's formula for overtaking Google is a simple three-part equation: bigger plus fresher plus closer. First, bigger. Simply put, when it comes to search engines, size matters. Lervik is determined to make FAST the Web's biggest engine, which is why FAST searches pages in 49 languages and 225 different file formats, including more than 180 million multimedia files, 132 million FTP files, and 2 million MP3s.
Second, fresher. One of the biggest frustrations of Web searchers is stale or dead-end results. Lervik's team completely refreshes FAST's index every 7 to 11 days — which makes its results much fresher than Google searches, which operate on a 28-to-30-day cycle.
Third, closer. The real test of a search engine: How closely can it deduce the real intent of a user's query? When it comes to relevance, Lervik acknowledges, FAST trails Google — and that is one area where he's been pouring in resources. For the past 18 months, FAST researchers have been working on what Lervik calls "the third generation of search."
Here's what's at stake: Take the query "Where is Saturn?" A first-generation search engine looks only for pages that contain the word "Saturn." A second-generation search engine culls documents linked from pages about Saturn.
To perform a third-generation search, says Lervik, means getting closer to the user's real intention — applying rules of grammar, syntax, and semantics to computer linguistics. "It's a morphological challenge," Lervik says, "to understand that words can be written in many different forms." About one in 10 queries, for example, is misspelled, so FAST's latest algorithms consider whether the spelling of "Saturn" is correct. Query analysis recognizes "Where is" as a request for a location, while dynamic clustering groups results together based on whether the user wants information on Saturn the planet or Saturn the car.
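The three steps Lervik describes (spelling correction, query analysis, and dynamic clustering) can be illustrated with a toy sketch. This is not FAST's algorithm, which the article does not detail; the vocabulary, the sense cues, and every function name below are hypothetical, and Python's standard `difflib` stands in for whatever spelling model a real engine would use.

```python
import difflib
from collections import defaultdict

# Hypothetical data: a real engine would derive these from its index.
VOCABULARY = ["saturn", "jupiter", "planet", "car"]
SENSE_CUES = {
    "planet": {"orbit", "rings", "solar"},
    "car": {"dealer", "sedan", "engine"},
}

def correct_spelling(word):
    """Return the closest vocabulary word, or the word unchanged if none is close."""
    matches = difflib.get_close_matches(word.lower(), VOCABULARY, n=1, cutoff=0.75)
    return matches[0] if matches else word.lower()

def analyze_query(query):
    """Classify the query's intent and spell-correct its remaining terms."""
    text = query.lower().strip(" ?")
    prefix = "where is "
    if text.startswith(prefix):
        intent, subject = "location", text[len(prefix):]
    else:
        intent, subject = "general", text
    return intent, [correct_spelling(w) for w in subject.split()]

def cluster_results(titles):
    """Group result titles by the first sense whose cue words they mention."""
    clusters = defaultdict(list)
    for title in titles:
        words = set(title.lower().split())
        sense = next((s for s, cues in SENSE_CUES.items() if words & cues), "other")
        clusters[sense].append(title)
    return dict(clusters)
```

Calling `analyze_query("Where is Satrun?")` treats the query as a location request and repairs the misspelling, while `cluster_results` separates planet pages from car pages by cue words. A production system would rely on far richer linguistic models than a fixed word list.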
The Unbranded Brand
Alltheweb.com is already a hit among scientists, librarians, and researchers. And it gives FAST a working laboratory where Lervik and his team can test-drive new tools and techniques. But when it comes to building the Alltheweb.com brand, Lervik is clear about his company's course. "Google is trying to leverage its brand and become a portal," he says. "We are so technology focused that even if we tried to be consumer focused, we couldn't be."
The difference for FAST is in its business model. While rivals like Google remain reliant on advertising sales, FAST harvests revenue in two ways: by powering Internet search services on behalf of portals like Lycos, which pays FAST a fee for each query served, and by selling software licenses to enterprise customers such as Dell, Ericsson, FirstGov, IBM, Reed Elsevier, and Reuters, which want FAST search technology for their intranets and e-commerce sites. According to Lervik, Alltheweb.com is deliberately starved of PR so that it will not compete with these partners. In fact, working closely with its partners brings FAST other benefits. "Around half of our product road map comes from working with partners," says Lervik.
FAST Recovery
FAST's rapid rise is all the more remarkable given that just a couple of years ago, the company nearly ran off the road, following an ill-fated stab at a Nasdaq listing. When FAST realized that a 2000 IPO was futile and that the company had expanded into too many noncore areas, one-third of the 270 people FAST then employed lost their jobs, including the CEO who had been recruited to handle the listing. Lervik, who had been FAST's chief technology officer, moved into the role of CEO. Among his key decisions: calling a halt to FAST's peripheral activities in bioinformatics and image compression. Alliances with some of Europe's biggest portals (Freeserve, Tiscali, and T-Online) have propelled FAST to leadership as the continent's number-one search engine. But to challenge Google for overall leadership, Lervik says, FAST must snag one of the big-three portals in the United States.
AOL has just signed with Google; MSN seems content with Inktomi. In September 2002, however, Yahoo's contract with Google was up — but at press time, it was unclear whether FAST was ready to compete for and win that prize. But Lervik is clear about what he has in mind for FAST's overall future: Having grown up in the fjords and mountains of western Norway, Lervik, an expert cross-country skier, is building his company for stamina as well as speed.
Sidebar: Under the Hood
In its race with Google, FAST is betting the future on its own architecture. Like Google, FAST uses patented algorithms and computer linguistics to search and retrieve. But while both search engines may use the same components, the difference is in how they perform when combined. The real difference, says Danny Sullivan, editor of searchenginewatch.com, is that FAST searches are performed on a core architecture that is infinitely scalable to any amount of data or number of users.
While other search engines use large, expensive multiprocessor computers, FAST queries whiz along a distributed network of search-and-dispatch "nodes," making more-efficient use of existing CPUs. The result: FAST can process a larger number of queries on a smaller number of servers.
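The general idea of dispatching one query across many small nodes and merging their answers can be sketched as follows. This is only an illustration of the scatter-gather pattern under simple assumptions, not FAST's actual design: the `SearchNode` class, the `dispatch` function, and the word-count scoring are all hypothetical, and threads on one machine stand in for separate servers.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq

class SearchNode:
    """Hypothetical node holding one partition of the index (doc_id -> text)."""

    def __init__(self, documents):
        self.documents = documents

    def search(self, term):
        """Score each local document by how often the term appears in it."""
        term = term.lower()
        hits = []
        for doc_id, text in self.documents.items():
            score = text.lower().split().count(term)
            if score:
                hits.append((score, doc_id))
        return hits

def dispatch(nodes, term, top_k=3):
    """Send the query to every node in parallel, then merge the best hits."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = pool.map(lambda node: node.search(term), nodes)
        merged = [hit for hits in partials for hit in hits]
    # Keep only the top_k highest-scoring documents across all nodes.
    return [doc_id for _score, doc_id in heapq.nlargest(top_k, merged)]
```

Because each node searches only its own slice of the documents, adding data means adding nodes rather than buying a bigger machine, which is the scaling property the sidebar attributes to FAST.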
Not only is FAST's system more scalable, but FAST's unique architecture also delivers a faster search: A typical query on FAST races through 300 million documents in less than one second.