Latam-GPT: The Free, Open Source, and Collaborative AI of Latin America | EUROtoday
Latam-GPT is a new large language model being developed in and for Latin America. The project, led by the nonprofit Chilean National Center for Artificial Intelligence (CENIA), aims to help the region achieve technological independence by developing an open source AI model trained on Latin American languages and contexts.
“This work cannot be undertaken by just one group or one country in Latin America: It is a challenge that requires everyone’s participation,” says Álvaro Soto, director of CENIA, in an interview with WIRED en Español. “Latam-GPT is a project that seeks to create an open, free, and, above all, collaborative AI model. We’ve been working for two years with a very bottom-up process, bringing together citizens from different countries who want to collaborate. Recently, it has also seen some more top-down initiatives, with governments taking an interest and beginning to participate in the project.”
The project stands out for its collaborative spirit. “We’re not looking to compete with OpenAI, DeepSeek, or Google. We want a model specific to Latin America and the Caribbean, aware of the cultural requirements and challenges that this entails, such as understanding different dialects, the region’s history, and unique cultural aspects,” explains Soto.
Thanks to 33 strategic partnerships with institutions in Latin America and the Caribbean, the project has gathered a corpus of data exceeding eight terabytes of text, the equivalent of millions of books. This information base has enabled the development of a language model with 50 billion parameters, a scale that makes it comparable to GPT-3.5 and gives it a medium to high capacity to perform complex tasks such as reasoning, translation, and associations.
Latam-GPT is being trained on a regional database that compiles information from 20 Latin American countries and Spain, with an impressive total of 2,645,500 documents. The distribution of data shows a significant concentration in the largest countries in the region, with Brazil leading at 685,000 documents, followed by Mexico with 385,000, Spain with 325,000, Colombia with 220,000, and Argentina with 210,000 documents. The numbers reflect the size of these markets, their digital development, and the availability of structured content.
“Initially, we’ll launch a language model. We expect its performance in general tasks to be close to that of large commercial models, but with superior performance in topics specific to Latin America. The idea is that, if we ask it about topics relevant to our region, its knowledge will be much deeper,” Soto explains.
The first model is the starting point for developing a family of more advanced technologies in the future, including ones with image and video, and for scaling up to larger models. “As this is an open project, we want other institutions to be able to use it. A group in Colombia could adapt it for the school education system or one in Brazil could adapt it for the health sector. The idea is to open the door for different organizations to generate specific models for particular areas like agriculture, culture, and others,” explains the CENIA director.
https://www.wired.com/story/latam-gpt-the-free-open-source-and-collaborative-ai-of-latin-america/