On Monday, OpenAI held its “spring update” product reveal, generating the kind of excitement that’s usually reserved for Apple or Tesla launches. The hope was that the Sam Altman-helmed AI powerhouse would reveal its next-generation model, GPT-5 — but it was not to be. Instead, we got a smaller update called GPT-4o that makes the current model faster and easier to access. 

Among other things, that means GPT-4o is much better at extracting data from pictures, audio, and video, but the star of the show is voice control, which lets you talk to GPT-4o the way you’d talk to Apple’s Siri or Amazon’s Alexa. GPT-4o goes beyond those assistants with more realistic vocal inflections and a level of emotional expressiveness that drew a lot of comparisons to the Scarlett Johansson-voiced AI assistant in 2013’s Her. Altman himself encouraged the comparisons, although not everyone thought it was a good look.

But to my eyes, the most interesting moment was a live translation demo, in which chief technology officer Mira Murati spoke to GPT-4o in Italian, and the model provided a real-time summary of what she was saying, translated into English. It was a stunning demo; this kind of real-time Babel Fish translation has been a holy grail in tech for a long time, and the appeal isn’t hard to grasp. Google even demoed a similar product at the I/O developer conference in 2022. But that product was bundled into a set of smart glasses and never quite materialized. If OpenAI can build a universal translator — and make it cheap, reliable, and freely available — it will be a company-defining product.

The question is whether OpenAI can actually deliver the universal translator we saw on stage. As the name suggests, large language models are trained by analyzing a large data set of content in a particular language. That’s easy enough in English, the dominant language of the internet. But a universal translator needs to be proficient in every language, and finding training data in other languages is a long-standing problem. Even popular languages like Spanish are — compared to the number of people who speak the language — relatively underrepresented on the internet. If national writers’ groups start to refuse access to their work, as we saw recently in Singapore, the problem could get even worse.

We’ve already seen ChatGPT struggle with global languages. When Rest of World tested ChatGPT in September, it had trouble with basic math problems in low-resource languages like Tamil, Bengali, and Kurdish. As of March 2023, OpenAI’s own benchmarks showed a similar issue with GPT-4, with test scores dropping noticeably as it moved into more obscure languages. That bias can do real harm, particularly when machine translations are used in high-stakes settings like conflict moderation or asylum law. The models that handle low-resource languages the best, like Ethiopia’s Lesan, tend to take a language-specific approach, unlike the multi-language models built by OpenAI.

To OpenAI’s credit, the company seems to be aware of the problem and is doing its best to fix it. Translation is a simpler task than doing calculations in a foreign language, and in my own informal testing, GPT-4o did a pretty good job translating and summarizing a Rest of World article in Nepali. But recent studies show that even the most advanced models make significantly more mistakes when translating from Chinese to English than the other way around. And the scarcity of training data means foreign-language skills may improve more slowly than functions like data extraction or summarization.

Before Monday, those language gaps weren’t a particularly urgent problem for OpenAI. But now the company is actively framing GPT-4o as a translation service, which raises real questions about how it’ll hold up in widespread use. It also comes at an unusually high-stakes time for OpenAI: Within 24 hours of the demo, co-founder Ilya Sutskever announced he was leaving the company and Google released its own AI voice assistant (sans translation features). Monday’s demo was stunning, but it’s one OpenAI may regret if the technology can’t deliver.

Source: https://restofworld.org/2024/exporter-openai-translation-gpt4o/