Google is flexing its artificial intelligence muscle to help users of its search engine research complex tasks that would normally involve multiple queries.
Many of the Google searches we do are just a single query, such as “file a request for extension federal tax.” But other tasks are complex enough to require several searches covering different aspects of the problem. You might, for example, want to know how to prepare for a river rafting trip in Montana in August, and how those preparations might differ from the ones you made before your Colorado River rafting trip last fall.
If you asked a local rafting expert how to prepare, you might get an extended answer that covers a range of relevant questions. Will the weather be hotter than it was in Colorado? What clothing and gear will we need? Where can we rent the raft? That’s the kind of expertly curated answer Google wants to deliver to search users, with the help of some leading-edge natural language processing.
Google researchers shook the natural language world in 2018 with the development of a natural language model called BERT (Bidirectional Encoder Representations from Transformers). BERT was trained in a then-novel way. Instead of feeding the neural network only text examples labeled with their meaning, Google researchers started by feeding BERT huge quantities of unannotated text (11,038 digitized books and 2.5 billion words from Wikipedia). The researchers randomly masked certain words in the text and challenged the model to work out how to fill them in. The neural network analyzed the training text and found patterns of words and sentences that often appeared in the same context, helping it understand the basic relationships between words. And, in the process, it learned a good deal of basic knowledge about the world and how it works.
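That masking step can be illustrated with a toy sketch. This is a simplification, not Google’s code: real BERT masks subword pieces rather than whole words and uses a more elaborate 80/10/10 replacement scheme, and the function name `mask_tokens` is ours.

```python
import random

def mask_tokens(tokens, mask_rate=0.3, seed=1):
    """Randomly replace a fraction of tokens with [MASK], returning
    the masked sequence and the hidden answers the model must recover."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            masked.append("[MASK]")
            targets[i] = tok  # training target: predict this word from context
        else:
            masked.append(tok)
    return masked, targets

sentence = "the raft drifted down the river at dawn".split()
masked, targets = mask_tokens(sentence)
print(" ".join(masked))  # some words replaced by [MASK]
print(targets)           # position -> original word
```

During pretraining, the model sees only the masked sequence and is scored on how well it predicts the words in `targets`; no human-written labels are needed, which is what lets BERT learn from billions of words of raw text.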
Now Google AI has built a new model based on BERT to handle complex searches: the Multitask Unified Model (MUM). Google says MUM is a thousand times more powerful than BERT–that is, it contains a thousand times as many parameters, the adjustable connection weights in a neural network whose design is loosely inspired by the nerve junctions in the human brain. Google says MUM is trained using data crawled from the open web, with low-quality content removed.
So for a complex query like the rafting trip example above, MUM might deliver more than just a list of links that match the keywords in the query. MUM can generate language, so the user might get a narrative resembling something a human subject expert would say. MUM is also multimodal, so the narrative may come with visual aids–images and videos from around the web, for example. And it would include links to other relevant content (information on what kind of training to do before the rafting trip, perhaps).
Google says it’ll use human raters to carefully oversee the search results MUM generates, watching for signs that bias may have been introduced through the training data. Large neural networks like the one underpinning MUM require massive amounts of computing power to train; Google says it will apply its latest techniques for reducing the carbon footprint of all the servers involved.
You won’t see the full package of AI-curated search results any time soon. MUM is still in its experimental stages. Right now, Google is doing internal pilots to better understand the types of queries MUM might be able to solve. But Google also says MUM will begin powering certain search features in the coming months.
If MUM is as transformative as the company suggests, the very definition of search may evolve as natural language processing and other forms of AI play bigger roles. “Search” might begin to look more like “research.” Instead of just a smart gofer that knows where everything is on the web, Google may act more like a research assistant with subject matter expertise.
Google announced its work on MUM at its annual I/O developer event on Tuesday.