If you are translating text or speech, it seems obvious that you should read a whole sentence before figuring out what it means. But this hasn’t been so easy for computers, in part because the work sucks up so many resources. So Google Translate has had to get by with breaking sentences into words and phrases and translating those pieces individually. On Tuesday, Google announced a new system called Google Neural Machine Translation (GNMT) that works on whole sentences and improves accuracy by about 60% on average over the old phrase-based machine translation (PBMT), including on notoriously difficult Chinese-to-English translations. Chinese-to-English is the first translation to roll out to the Google Translate mobile and web apps, and it is available now.
The science behind Google’s news gets pretty complicated pretty quickly in the search giant’s full research paper. But the gist is: Google’s new GNMT system speeds things up by getting a tad sloppy with the math and running the work in parallel over the many cores in computer graphics chips (GPUs). Neural networks roughly mimic how the brain works by refining data through successive layers of processing units (neurons). With the efficiency improvements, GNMT can stack up eight layers to work on interpreting each sentence in one language and eight layers to compose a new sentence in the other language. It tolerates a limited amount of error in each neural layer in order to support more layers and ultimately get better results.
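To make the "sloppy math" idea concrete, here is a minimal sketch that assumes it refers to reduced-precision arithmetic: storing numbers as small integers instead of full floating-point values and accepting a bounded rounding error in return. The function names, the 8-bit width, and the sample weights are illustrative assumptions, not details taken from Google's system.

```python
def quantize(values, bits=8):
    """Map floats to signed integers of the given width; return (ints, scale)."""
    max_int = 2 ** (bits - 1) - 1                    # 127 for 8 bits
    largest = max(abs(v) for v in values) or 1.0     # avoid a zero scale
    scale = largest / max_int                        # float value of one integer step
    return [round(v / scale) for v in values], scale


def dequantize(ints, scale):
    """Recover approximate floats from the quantized integers."""
    return [i * scale for i in ints]


# Hypothetical weights for one layer (made-up numbers for illustration).
weights = [0.8125, -0.3301, 0.0457, -1.27]

ints, scale = quantize(weights)
restored = dequantize(ints, scale)

# The rounding error per value is at most half of one integer step.
worst_error = max(abs(a - b) for a, b in zip(weights, restored))
print(ints)
print(worst_error, "<=", scale / 2)
```

Each value loses a little precision, but the loss is capped at half a step, which is the kind of small, controlled error a deep stack of layers can often absorb.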
No matter how well you train artificial intelligence translation on a language, it won’t learn every esoteric word or understand all names. In the past, Google just copied an unfamiliar word over to the translation, in the hope that it would be the same in each language. But now it breaks words it doesn’t know into pieces that it might be able to figure out, which Google says makes it less likely to flub difficult words.
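The word-splitting idea can be sketched in a few lines. This is a simplified greedy longest-match splitter, not Google's actual method: a real system learns its vocabulary of pieces from data, while the `VOCAB` set and the `split_into_pieces` function here are hypothetical and hand-written for illustration.

```python
def split_into_pieces(word, vocab):
    """Greedily split `word` into the longest pieces found in `vocab`.

    A single character is always accepted as a fallback, so the split
    never fails even when no known piece matches.
    """
    pieces = []
    start = 0
    while start < len(word):
        # Try the longest candidate first, shrinking until a known piece
        # is found or only one character is left.
        end = len(word)
        while end > start + 1 and word[start:end] not in vocab:
            end -= 1
        pieces.append(word[start:end])
        start = end
    return pieces


# Hypothetical toy vocabulary of known word pieces.
VOCAB = {"un", "trans", "lat", "able", "ion", "s"}

print(split_into_pieces("untranslatable", VOCAB))
# ['un', 'trans', 'lat', 'able']
```

A word the system has never seen as a whole still breaks down into fragments it does know, which gives the translator something to work with instead of a blank.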
That said, Google admits that the new system still makes plenty of mistakes, like dropping words from the original language. Also, reading whole sentences is an improvement, but Google still isn’t that good at seeing how sentences relate to each other, so whole translated paragraphs might still read as pretty clunky.