HiTZ has found an innovative way to create chatbots in Basque and other small languages
The HiTZ Centre of the University of the Basque Country/Euskal Herriko Unibertsitatea has created a new way of building chatbots for small languages such as Basque, based on Llama, the open multilingual language model built by Meta's research center.
The usual way would be to feed Llama with texts and examples in Basque, but producing them by hand is very expensive. "Only big companies have been able to do it so far," explains Eneko Agirre, director of the HiTZ research center. To avoid this work, HiTZ members have found an "innovative and efficient way" to adapt the chatbot to Basque. With the new method, it is enough to continue training Llama on a large body of Basque text, but the key is applying techniques that deal with the problem known as "catastrophic forgetting", in which a model loses abilities it had already learned while it is trained on new data.
The work opens up new avenues. On the one hand, the method itself can be applied to open models more powerful than Llama; on the other, it can be used for other languages with a similar volume of text.
In fact, in terms of the number of documents on the open Internet, there are 1,000 times more documents in English than in Basque, and 100 times more in Spanish. Until now, the open question has been whether chatbots in small languages can achieve results as good as those in English or Spanish.