Models & Resources

AI Sweden’s strategy for large language models follows two complementary paths: collaborating with other leading European organizations to develop large-scale open models from scratch, and rapidly fine-tuning existing open models to meet specific application needs at Swedish organizations.

Models

We are excited to share that all our models can be found in AI Sweden’s model library on HuggingFace! We invite you to explore and discover the models we have developed. Take a moment to learn more about each one below. Plus, stay tuned for more models that we will be releasing soon.

Llama 3

This is a variant of Meta’s open language model Llama 3. Using the Nordic Pile training material compiled during the development of GPT-SW3, the 8-billion-parameter version of Llama 3 has been trained to better handle Nordic languages.

Translation

The translation model is based on GPT-SW3 and handles translations between Swedish and English. It has been trained on a DGX machine from Aixia using translation data developed by AI Sweden. This model is especially useful for contexts requiring the translation of large volumes of text.

Tyr

Tyr is an innovative model in the legal field and the first of its kind for the Swedish language. Named after the Norse god of justice, Tyr was created through "model merging": a Swedish Mistral model was merged with the English legal language model Saul.

This merging has resulted in a model that can answer basic legal questions in Swedish, even though it has not been specifically trained on Swedish legislation. 

With further fine-tuning, Tyr has the potential to offer more precise answers within the Swedish legal context, paving the way for AI-supported legal advice and system use.
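To make the idea of "model merging" more concrete, here is a minimal sketch of one of the simplest merging schemes: linear interpolation of two models' parameters. This is only an illustration. The text above does not say which merging method was used for Tyr, and the function name, parameter names, and toy weights below are all hypothetical.

```python
def merge_linear(weights_a, weights_b, alpha=0.5):
    """Blend two models' parameters as alpha * a + (1 - alpha) * b.

    Both arguments map parameter names to flat lists of floats.
    This toy format is an assumption made for illustration only.
    """
    if weights_a.keys() != weights_b.keys():
        raise ValueError("models must share the same parameter names")

    merged = {}
    for name, values_a in weights_a.items():
        values_b = weights_b[name]
        if len(values_a) != len(values_b):
            raise ValueError(f"size mismatch for parameter {name!r}")
        merged[name] = [
            alpha * a + (1 - alpha) * b
            for a, b in zip(values_a, values_b)
        ]
    return merged


# Hypothetical toy "models" with identical architectures.
swedish_model = {"layer.weight": [1.0, 3.0], "layer.bias": [2.0]}
legal_model = {"layer.weight": [3.0, 1.0], "layer.bias": [0.0]}

merged = merge_linear(swedish_model, legal_model, alpha=0.5)
print(merged)
```

Merging of this kind requires that both models share the same architecture and parameter names, which is why the sketch checks for matching keys and sizes before blending.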

RoBERTa

The RoBERTa model is an enhancement of Meta's RoBERTa-large and has been trained on Intel’s Gaudi accelerator. AI Sweden’s team used the Nordic Pile dataset, which was developed during the work on GPT-SW3. Despite its relatively modest size of 335 million parameters, RoBERTa is a powerful model that can be tailored for specific tasks like sentiment analysis, named entity recognition (NER), and semantic search (for example, as the encoder in a RAG system). As of mid-May 2024, AI Sweden's Swedish RoBERTa holds a top ranking on ScandEval for encoder models.
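As a rough illustration of the semantic search use case, the sketch below ranks documents by cosine similarity between a query vector and document vectors. The vectors are hand-written placeholders; in a real system they would come from an encoder model such as the one described above. The function names and document IDs are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder embeddings; a real encoder would produce these vectors.
doc_vecs = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.7, 0.7, 0.0],
}
query_vec = [1.0, 0.1, 0.0]

ranking = rank_documents(query_vec, doc_vecs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

In a RAG setup, the top-ranked documents would then be passed to a generative model as context.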

GPT-SW3

AI Sweden, together with RISE and WASP WARA Media & Language, has developed a large-scale generative language model for the Nordic languages, primarily Swedish.

On using these models

AI Sweden does not offer technical support. Organizations using these models must implement any necessary guardrails and customizations for their specific applications. AI Sweden does not claim that the models are without flaws: they may hallucinate and may misinterpret information. AI Sweden does not warrant or guarantee that the results from the models are factually true or can be used as advice, legal or otherwise.

More Resources

Handbook for prompting

With this handbook, we want to give an overall explanation of what a language model is and demonstrate its great potential. The handbook can be used as a tool to successfully create prompts on your own and develop and improve specific applications.
The handbook is a work in progress and will be updated during the course of the GPT-SW3 Validation project.

NLP Seminar Series

The NLP Seminar Series is a bi-weekly forum for people who work with or are interested in Natural Language Processing (NLP) and language technologies. The seminars are organized by the RISE NLP group and AI Sweden. Head over to our YouTube channel to watch the latest uploads.

Status update: Region Halland on the GPT-SW3 validation project: GPT-SW3 applications in healthcare

Presentation at the reference group meeting “Offentlig sektor och tillämpad språkteknologi” at AI Sweden
Niclas Hertzberg & Anna Lokrantz (AI Sweden & Region Halland)

June 8, 2023

The Digital Assistant of the Future for the Public Sector (Framtidens Digitala Assistent för Offentlig Sektor)

Presentation at the reference group meeting “Offentlig sektor och tillämpad språkteknologi” at AI Sweden
Jonatan Permert, AI Sweden

June 8, 2023

Training Material for the Interdisciplinary Expert Pool for NLU

This training material provides introductory resources for anyone who would like to learn about Large Language Models, with special attention to the ethical aspects of this technology. The material is aimed at both technical and non-technical audiences. It consists of both essential and more advanced resources in the form of readings, podcasts, open courses, videos, and books.

The material is available on our online community MyAI. Create a free account to gain access to the material.

Alternatively, the material is available as a presentation via the "To the presentation" link.

This list has been compiled for the Interdisciplinary Expert Pool for NLU project.

Status Update: Data Readiness Lab, June 2022

Presentation at the reference group meeting "Offentlig sektor och tillämpad språkteknologi" at AI Sweden
Felix Stollenwerk, AI Sweden

June 8, 2022

Status Update: Language Models for Swedish Authorities, June 2022

Presentation at the reference group meeting "Offentlig sektor och tillämpad språkteknologi" at AI Sweden
Magnus Sahlgren, AI Sweden

June 8, 2022