In a separate chat, Bing Chat correctly identified Thailand as the rumored location for the next setting of the TV show White Lotus, but provided “somewhere in Asia” when the query was translated into Spanish, says Solis, who runs Orainti, a consulting firm that helps websites drive more visits from search engines.
Executives at Microsoft, OpenAI and Google who work on chatbots have said users can counter bad answers by adding more detailed instructions to their questions. Without explicit guidance, the tendency for chatbots to fall back on English speech and perspectives can be strong. Just ask Veruska Anconitano, another search engine optimization expert who splits her time between Italy and Ireland. She found that asking Bing Chat questions in Italian yielded answers in English unless she specified “Answer me in Italian.” In another chat, Anconitano says, Bing assumed she wanted the Japanese prompt 元気ですか (“How are you?”) translated into English instead of continuing the conversation in Japanese.
Recent research papers have validated the anecdotal findings of people who have run into the limits of Bing Chat and its brethren. Zheng-Xin Yong, a doctoral student at Brown University who also studies multilingual language models, says he and his collaborators found in one study that getting better answers to Chinese questions requires asking them in English rather than in Chinese.
When Fung in Hong Kong and her collaborators asked ChatGPT to translate 30 sentences, it correctly rendered 28 from Indonesian into English but only 19 in the other direction, suggesting that monoglot Americans who turn to the bot to make deals with Indonesian merchants would struggle. The same limited fluency in one direction turned up in at least five other languages.
The language problems of large language models make them hard to trust for anyone venturing beyond English, and maybe Chinese. When I tried translating ancient Sanskrit hymns via ChatGPT as part of an experiment in using AI to speed up wedding planning, the results seemed plausible enough to add to a ceremony script. But I had no idea whether I could rely on them or would be laughed off the stage by the elders.
Researchers who spoke to WIRED do see some signs of improvement. When Google created its PaLM 2 language model, released this month, it attempted to increase non-English training data for more than 100 languages. The model recognizes idioms in German and Swahili and jokes in Japanese, and cleans up grammar in Indonesian, Google says, and it recognizes regional variations better than previous models.
But in consumer services, Google keeps PaLM 2 in a cage. The Bard chatbot is powered by PaLM 2, but only works in US English, Japanese, and Korean. A writing assistant for Gmail using PaLM 2 only supports English. Officially supporting a language takes time, because it involves running tests and applying filters to ensure the system doesn’t generate toxic content. Google did not invest heavily in launching many languages from the start, though it’s working to add more soon.