Meta develops a simultaneous translator for up to 101 languages with greater precision than current systems | Technology | EUROtoday

Meta wants to revive the ambition behind the biblical tower that, according to the account in Genesis (11:1-9), humanity sought to build in order to reach heaven. “They are one people and they all have the same language. (…) Now, nothing they propose will be impossible. Come on, let’s go down and confuse their language there, so that no one understands the other’s language,” Jehovah reacted. Mark Zuckerberg’s company, the multinational behind Facebook, Instagram and WhatsApp, wants to avert this curse and maintain its leadership in personal communications. To that end it has developed, according to a paper published today, Wednesday, in the journal Nature, an artificial intelligence (AI) model capable of instantly translating speech-to-speech or text-to-speech communications, and vice versa, in up to 101 languages, imitating the voice and tone of the speakers.

The model, called SEAMLESSM4T, “outperforms existing systems,” according to lead researcher Marta Costa-Jussà, from Meta’s artificial intelligence division (FAIR, Foundational AI Research), and will be made available to the public as long as it is not used for commercial purposes.

SEAMLESSM4T can recognize up to 101 languages (written or spoken) and can translate them into 36 languages as speech and 96 as text. According to Costa-Jussà’s results, “it translates with between 8% and 23% more precision [according to the Bilingual Evaluation Understudy metric] than existing systems, can filter out background noise [between 42% and 66% more] and it adjusts to the variation of voices.”

The opposite direction from its social networks

On the other hand, while Meta has eliminated fact-checking and content moderation on its communication platforms, opening the door to hoaxes, biases and hate speech, it has opted for the opposite strategy with the simultaneous translation system and has focused on the “mitigation of toxicity” that can be introduced into the system during machine learning or translation. In this regard, Olga Koreneva Antonova, professor at the Faculty of Translation and Interpreting at the Pablo de Olavide University (UPO), warns that, for example, current machine translators “do not consider gender equality” and tend to substitute the masculine for the feminine because the sources on which they are trained already include that bias.

Meta considers toxicity to be profanity or output that may incite hatred, violence, or abuse against a person or group (defined, for example, by religion, race, or gender). To mitigate it, the company has developed a tool called Etox, specifically trained on toxic elements in speech.

Another limitation that the new system tries to overcome is the scarcity of supported languages. Although more than half of humanity mainly speaks half a dozen languages, the diversity is so great that the more than 7,000 languages that exist in the world are left unserved. Meta’s model attempts to alleviate this deficiency by incorporating up to 101 languages, despite the scarcity of audio data and of models for incorporating them into AI.

Tanel Alumäe, from the language technology laboratory at Tallinn University (Estonia), highlights in Nature the system’s high capacity to translate speech simultaneously thanks to data from 4.5 million hours of multilingual spoken audio. “This type of training helps the model learn patterns from the data, making it easier to tune for specific tasks without needing large amounts of custom training data,” he explains.

However, in his opinion, “the greatest virtue of this work is not the idea or the proposed method, but the fact that all the data and code to execute and optimize this technology are publicly available, although the model itself can only be used for non-commercial purposes.”

Allison Koenecke, from the Department of Information Sciences at Cornell University, warns, also in Nature, of the limitations of these translation systems, despite their progress, in environments where precision is essential, such as medical or legal settings: “Models such as the one devised by SEAMLESS are accelerating progress in this area, but users of these models (doctors and court officials, for example) must be aware of the fallibility of speech technologies.”

In this regard, she adds: “This type of machine-induced error could induce real harm, such as wrongly prescribing a medication or accusing a person. And the harm disproportionately affects marginalized populations, who are likely to be poorly heard.”

Koenecke welcomes efforts to eliminate “toxicity” from translations, but advocates “expanding the scope of the linguistic biases studied” and warning users of the possibilities of error.

Criticism

Despite the progress it represents for translation systems, the model raises misgivings among some researchers. One of the most critical is Víctor Etxebarria, professor of Systems Engineering and Automation at the University of the Basque Country (UPV/EHU). “It does not contribute to scientific progress, since, based on what is published, independent specialists do not have permission to reproduce, verify or even improve its technological bases. They only have access to connect to the translator to carry out superficial translations. This software does not comply with the principles of open source AI, as defined by the Open Source Initiative: use, study, modify and share for any purpose. This translator does not allow this and, therefore, it is not consistent with the principles of open science,” he tells Science Media Center (SMC) Spain.

While acknowledging some usefulness as a support tool, the researcher adds: “The product does not prevent translation delays or errors, which it does not correct in real time, as human translators do. Another limitation is that it can only be used online through the API (Application Programming Interface) imposed by the company. Overall, the translator is an advanced technological product and probably very useful, but closed to the principles of open science and with multiple technological and legal limitations.”

Maite Martín, professor of Computer Science at the University of Jaén and researcher in the SINAI group (Intelligent Information Access Systems), highlights the incorporation of low-resource languages (those with fewer speakers), although at the cost of a higher error rate. “This effort not only improves the accessibility of translation technologies for these communities, but also marks progress in linguistic inclusion by democratizing access to advanced communication tools,” she explains.

Unlike Etxebarria, the researcher does consider that access for the scientific community is guaranteed, and praises “the interaction in real time, the expressiveness of the translated voice and the mitigation of gender biases and toxicity.” “Although SEAMLESSM4T represents a significant advance, there is still work to do to optimize its implementation in practical scenarios,” she concludes in SMC.

On toxicity, Andreas Kaltenbrunner, lead researcher of the AI and Data for Society group at the UOC, points to the contradiction between Meta’s recent strategy of scrapping content moderation and its promotion of such safeguards in the translator. “It is commendable that the study includes an analysis of whether the translations increase the toxicity of the texts or how they address possible gender biases. However, it is unfortunate that Meta, the employer of the researchers in this study, appears to have recently decided to abandon efforts in this regard with its new content moderation policy.”

Kaltenbrunner notes in SMC that the development is a version of one presented in August 2023, but with improvements in the unification of the usage environment, the languages included, the noise filters and the variety of accents.

https://elpais.com/tecnologia/2025-01-15/meta-desarrolla-un-traductor-simultaneo-de-hasta-101-idiomas-con-una-mayor-precision-que-los-sistemas-actuales.html