Google Translate is the world’s most popular web translation platform, but one Stanford University researcher says it doesn’t really understand sex and gender. Londa Schiebinger, who runs Stanford’s Gendered Innovations project, says Google’s choice of source databases causes a statistical bias toward male nouns and verbs in translation. In a paper on gender and natural language processing, Schiebinger offers convincing evidence that the source texts used with Google’s translation algorithms lead to unintentional sexism.
Machine Translation And Gender
In a peer-reviewed case study published in 2013, Schiebinger showed that Google Translate tends to render gender-neutral English words (such as the, or occupational names such as professor and doctor) in the male form in other languages. However, certain gender-neutral English words are translated into the female form . . . but only when they comply with certain gender stereotypes. For instance, the gender-neutral English terms a defendant and a nurse translate into German as ein Angeklagter and eine Krankenschwester. Defendant translates as male, but nurse auto-translates as female.
Where Google Translate really trips up, Schiebinger claims, is in the lack of context for gender-neutral words in other languages when they are translated into English. Schiebinger took an article about her work from the Spanish-language newspaper El Pais and ran it through Google Translate and rival platform Systran to get an English version. Both Google Translate and Systran translated the gender-neutral Spanish words “suyo” and “dice” as “his” and “he said,” despite the fact that Schiebinger is female.
These sorts of words cause specific problems in Bing Translate, Google Translate, Systran, and other popular machine translation platforms. Google engineers working on Translate told Co.Labs that the translation of all words, including gendered ones, is determined primarily by statistical patterns in translated document pairs found online. Because “dice” can translate as either “he said” or “she said,” Translate’s algorithms look at combinations of “dice” with neighboring words to see which translations of those combinations are most frequent. If “dice” is rendered more often as “he said” in the translations Google obtains, then Translate will usually render it male rather than female. In addition, Google Translate’s team added that their platform only uses individual sentences for context. Gendered nouns or verbs in neighboring sentences aren’t taken into account when establishing context.
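The frequency-based choice the engineers describe can be sketched in a few lines of Python. This is a toy illustration, not Google’s implementation: the sentence pairs are invented, the function names (build_counts, choose_rendering) are hypothetical, and “neighboring words” is simplified to just the single word before “dice,” with a fallback to overall frequency when that word has never been seen.

```python
from collections import Counter, defaultdict

# Toy sentence pairs, invented for illustration (not real training data).
# Each entry: (Spanish source sentence, how "dice" was rendered in English).
TOY_PAIRS = [
    ("el profesor dice que sí", "he said"),
    ("el profesor dice que no", "he said"),
    ("la profesora dice que sí", "she said"),
    ("dice que llegará tarde", "he said"),
    ("dice que está cansada", "she said"),
    ("dice que vendrá mañana", "he said"),
]

START = "<s>"  # marker used when the target word opens the sentence


def previous_word(words, index):
    """Return the word before position `index`, or START at sentence start."""
    return words[index - 1] if index > 0 else START


def build_counts(pairs, target="dice"):
    """Count renderings of `target` overall and per preceding word."""
    overall = Counter()
    by_context = defaultdict(Counter)
    for sentence, rendering in pairs:
        words = sentence.split()
        if target not in words:
            continue
        context = previous_word(words, words.index(target))
        overall[rendering] += 1
        by_context[context][rendering] += 1
    return overall, by_context


def choose_rendering(sentence, overall, by_context, target="dice"):
    """Pick the most frequent rendering for this sentence's context.

    Only words inside `sentence` are consulted; anything said in earlier
    sentences is invisible, mirroring the single-sentence limit above.
    """
    words = sentence.split()
    context = previous_word(words, words.index(target))
    counts = by_context.get(context) or overall  # unseen context: fall back
    return counts.most_common(1)[0][0]


overall, by_context = build_counts(TOY_PAIRS)
print(choose_rendering("la profesora dice que no", overall, by_context))
print(choose_rendering("Schiebinger dice que no", overall, by_context))
```

With this toy data, the first call prints “she said,” because “profesora” has only ever appeared next to the female rendering. The second prints “he said”: the name is unseen, so the sketch falls back to the overall counts, which lean male. That is the same mechanism, in miniature, by which a skewed set of source texts produces a skewed translation.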
Source Material, Cultural Context, And Gender
Schiebinger told Co.Labs that the project evolved out of a paper written by a student who was working on natural language-processing issues. In July 2012, a workshop with outside researchers was held at Stanford University, and its findings were turned, after peer review, into the machine translation paper.
Google Translate, which pursues the near-impossible goal of accurately translating the world’s languages in real time, has faced gender issues for years. To Google’s credit, Mountain View regularly tweaks Google Translate’s algorithms to fix translation inaccuracies. Language translation algorithms are infamously tricky. Engineers at Google, Bing, Systran, and other firms don’t only have to take grammar into account; they also have to account for context, subtext, implied meanings, cultural quirks, and a million other subjective factors . . . and then turn them into code.
Nonetheless, those inaccuracies exist, especially for gender. In one instance last year, users discovered that translating “Men are men, and men should clean the kitchen” into German produced “Männer sind Männer, und Frauen sollten die Küche sauber,” which means “Men are men and women should clean the kitchen.” Another German-language Google Translate user found job bias in multiple languages: the gender-neutral English-language terms French teacher, nursery teacher, and cooking teacher all showed up in Google Translate’s French and German editions in the feminine form, while engineer, doctor, journalist, and president were translated into the male form.
