It has now been a year since the launch of OpenEuroLLM, one of Europe's most ambitious AI initiatives. By uniting 20 leading research institutions and companies, the project has, during its first year, laid the foundation for a new generation of open language models aimed at stronger European digital sovereignty and competitiveness.
The project has reached crucial milestones in infrastructure, data practices, and model development. Its purpose is clear: to develop next-generation open-source language models that advance European AI capabilities.
OpenEuroLLM is proof that cutting-edge technical expertise combined with a strong European network is a prerequisite for our success in large-scale AI development. For Sweden and our partners, this represents a unique opportunity to build on an open and transparent foundation that strengthens our shared innovative power.
Nina Ökvist
Head of NLU at AI Sweden
One of the project's most significant successes is the launch of the MixtureVitae dataset. It is the first dataset that is free to use, even for commercial purposes, with performance that matches or exceeds the leading restrictive alternatives on the market. MixtureVitae is particularly strong in code and mathematical reasoning, which is critical for the next step in industrial AI applications.
To meet the challenge of data scarcity for smaller European languages, the project, together with EuroLLM, has developed the first comprehensive multilingual synthetic dataset for pre-training. Within the subproject, called MultiSynt, AI Sweden's NLU team has worked on translating high-quality English data into languages such as Swedish, Icelandic, Hungarian, and Spanish. The work aims to overcome the shortcomings in current data collection methods that limit the accurate representation of many languages.
By creating an open multilingual dataset, the project aims both to enable the training of language models within OpenEuroLLM and to drive research on multilingual models forward. By making these resources available on European supercomputer systems like LUMI and Leonardo, duplication of effort is avoided, and resources can be maximized throughout the ecosystem.
In December 2025, OpenEuroLLM became the first AI project to be granted strategic access to several of EuroHPC's supercomputers simultaneously, including LUMI, Leonardo, Jupiter, and MareNostrum 5. However, the project's coordinator, Jan Hajič, emphasizes that additional computational resources will be required to supplement these allocations.
Creating an open source multilingual LLM in the public space and within a large consortium is a challenging task. I am proud that thanks to the expertise, enthusiasm, commitment and hard work of especially the core partners the project has achieved its first-year goals. However, significant challenges, especially in securing more compute for creating the final models, still remain.
Jan Hajič
Charles University
Over the coming year, AI Sweden will gear up its work to create the conditions for the European models to become practically useful. Within the framework of post-training - the critical phase where models are fine-tuned for their specific purposes - AI Sweden's NLU team is focusing on equipping the models with the capabilities and behaviors required for advanced use. Specifically, this involves optimizing the models' ability to handle long contexts, improving instruction following and chat interaction, and strengthening their capacity for reasoning and function calling.
During the fall, the first language models developed within the OpenEuroLLM framework are scheduled to be published.
European High Performance Computing (EuroHPC) comprises large-scale computational infrastructure across Europe. The computational systems are primarily intended for use in academic research. The EuroHPC framework also supports research and innovation through calls for proposals in all areas related to large-scale computing, and it invests in competence centers across Europe designed to facilitate knowledge exchange, innovation, and new research collaborations.