Models & Resources
AI Sweden’s strategy for large language models follows two complementary paths: collaborating with other leading European organizations to develop large-scale open models from scratch, and rapidly fine-tuning existing open models to meet specific application needs at Swedish organizations.
Models
We are excited to share that all our models can be found in AI Sweden’s model library on HuggingFace! We invite you to explore and discover the models we have developed. Take a moment to learn more about each one below. Plus, stay tuned for more models that we will be releasing soon.
Llama 3-8B Instruct
AI-Sweden-Models/Llama-3-8B: The NLU team has developed the model Llama 3-8B Instruct. The main difference from the previous model is that the Instruct model can follow instructions and engage in dialogue with the user. The model has ranked highly on ScandEval's list of Swedish models and, despite its size, is on par with GPT-4.
The model has been trained on the LUMI supercomputer in Finland, which is one of the fastest supercomputers in the world. Access to the supercomputer comes from the EU-funded project DeployAI, where AI Sweden's role is to experiment with and validate services for large language models.
The training ran for 1.5 epochs on 8 nodes and took about 12 hours, roughly one complete tidal cycle. The dataset was developed by 42labs, community members at AI Sweden.
Versions:
https://huggingface.co/AI-Sweden-Models/Llama-3-8B-instruct
https://huggingface.co/AI-Sweden-Models/Llama-3-8B-instruct-bf16-gguf
https://huggingface.co/AI-Sweden-Models/Llama-3-8B-instruct-q8-gguf
https://huggingface.co/AI-Sweden-Models/Llama-3-8B-instruct-Q4_K_M-gguf
https://huggingface.co/AI-Sweden-Models/Llama-3-8B-instruct-Q3_K_M-gguf
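Like other instruct-tuned Llama 3 variants, the model expects conversations in the Llama 3 chat format. As a minimal sketch of how such a prompt is assembled (in practice you would load the tokenizer for the model and call `tokenizer.apply_chat_template` rather than build strings by hand):

```python
# Minimal sketch of the Llama 3 chat template used by instruct-tuned
# variants. In practice, load the tokenizer for
# AI-Sweden-Models/Llama-3-8B-instruct and use apply_chat_template
# instead of hand-assembling strings like this.

def build_llama3_prompt(messages):
    """Assemble a Llama 3-style chat prompt from role/content messages."""
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        parts.append(f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n")
        parts.append(msg["content"] + "<|eot_id|>")
    # Trailing assistant header cues the model to generate its reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama3_prompt([
    {"role": "system", "content": "Du är en hjälpsam assistent."},
    {"role": "user", "content": "Vad är huvudstaden i Sverige?"},
])
```

The GGUF versions listed above follow the same template; tools such as llama.cpp typically apply it automatically when a chat mode is used.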
Llama 3
This is a variant of Meta’s open language model Llama 3. Using the Nordic Pile training material compiled during the development of GPT-SW3, the 8-billion-parameter version of Llama 3 has been trained to better handle Nordic languages.
Translation
The translation model is based on GPT-SW3 and handles translations between Swedish and English. It has been trained on a DGX machine from Aixia using translation data developed by AI Sweden. This model is especially useful for contexts requiring the translation of large volumes of text.
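The model card does not prescribe a batching strategy for large volumes of text; as a purely illustrative sketch, a long document could be split into sentence-aligned chunks so that each piece fits comfortably within the model's context window before translation:

```python
# Hypothetical helper for translating large volumes of text: split a long
# document into chunks of at most max_chars characters, breaking on
# sentence boundaries so each chunk can be translated independently.
# The chunking strategy is an illustration, not part of the model card.
import re

def chunk_for_translation(text, max_chars=400):
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks

doc = "Detta är en mening. " * 50
pieces = chunk_for_translation(doc)
```

Each chunk can then be sent to the translation model separately and the outputs concatenated, keeping individual requests small.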
Tyr
Tyr is an innovative model in the legal field and the first of its kind for the Swedish language. Named after the Norse god of justice, Tyr is a "model merging" of a Swedish Mistral model with the English legal language model Saul.
This merging has resulted in a model that can answer basic legal questions in Swedish, even though it has not been specifically trained on Swedish legislation.
With further fine-tuning, Tyr has the potential to offer more precise answers within the Swedish legal context, paving the way for AI-supported legal advice and system use.
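Model merging of this kind typically combines the weights of two architecturally identical models parameter by parameter. The actual recipe behind Tyr is not documented here, so the equal 50/50 linear blend below is only an assumption, shown on toy weight vectors:

```python
# Minimal sketch of linear model merging: interpolate matching parameters
# of two models layer by layer. Tyr's actual merge recipe is not stated
# in the text; the 0.5/0.5 weighting below is purely illustrative.

def merge_state_dicts(model_a, model_b, alpha=0.5):
    """Blend two parameter dicts: alpha * a + (1 - alpha) * b."""
    assert model_a.keys() == model_b.keys(), "architectures must match"
    return {
        name: [alpha * wa + (1 - alpha) * wb
               for wa, wb in zip(model_a[name], model_b[name])]
        for name in model_a
    }

# Toy 'models' with one flat weight vector per layer.
swedish_model = {"layer.0.weight": [1.0, 2.0], "layer.1.weight": [3.0, 4.0]}
legal_model   = {"layer.0.weight": [3.0, 0.0], "layer.1.weight": [1.0, 0.0]}
merged = merge_state_dicts(swedish_model, legal_model, alpha=0.5)
```

The appeal of the approach is that neither parent model needs retraining: capabilities from both (Swedish fluency, legal reasoning) are combined directly in weight space.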
RoBERTa
The RoBERTa model is an enhancement of Meta's RoBERTa-large and has been trained on Intel’s Gaudi accelerator. AI Sweden’s team used the Nordic Pile dataset, which was developed during the work on GPT-SW3. Despite its relatively modest size of 335 million parameters, RoBERTa is a powerful model that can be tailored for specific tasks like sentiment analysis, named entity recognition (NER), and semantic search (for example, as the encoder model in a RAG system). As of mid-May 2024, AI Sweden's Swedish RoBERTa holds a top ranking on ScandEval's leaderboard for encoder models.
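As the encoder in a RAG system, a model like this maps texts to vectors that are compared by similarity at retrieval time. A minimal sketch of that step, using mean pooling and cosine similarity; the toy vectors below merely stand in for the model's hidden states:

```python
import math

# Minimal sketch of the retrieval step in a RAG system. An encoder such
# as the Swedish RoBERTa would produce the embeddings; the toy vectors
# below stand in for mean-pooled hidden states.

def mean_pool(token_embeddings):
    """Average per-token vectors into a single sentence embedding."""
    dim = len(token_embeddings[0])
    return [sum(tok[i] for tok in token_embeddings) / len(token_embeddings)
            for i in range(dim)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# A query and two candidate documents, already encoded (toy values).
query = mean_pool([[1.0, 0.0], [0.8, 0.2]])
docs = {
    "doc_a": [0.9, 0.1],   # points in roughly the same direction as the query
    "doc_b": [0.0, 1.0],   # nearly orthogonal to the query
}
best = max(docs, key=lambda name: cosine(query, docs[name]))
```

The document with the highest cosine score is passed to the generator model as context, which is what makes a strong encoder valuable in such a pipeline.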
On using these models
AI Sweden does not offer technical support, and these models require organizations to implement any necessary guardrails and customizations for their specific applications. AI Sweden does not claim the models are without flaws: they might hallucinate, misconstrue information, or misinterpret it. AI Sweden does not warrant or guarantee that results from the models can be used as advice, legal or otherwise, or are factually true.
More Resources
Handbook for prompting
With this handbook, we want to give an overall explanation of what a language model is and demonstrate its great potential. The handbook can be used as a tool to successfully create prompts on your own and to develop and improve specific applications.
The handbook is a work-in-progress product and will be updated during the course of the GPT-SW3 validation project.
NLP Seminar Series
The NLP Seminar Series was a bi-weekly forum for people who work with or are interested in Natural Language Processing (NLP) and language technologies. While we don't have any upcoming seminars planned, you can head over to our YouTube channel to watch the recordings.
The seminars were organized by RISE NLP group and AI Sweden.
Status update: Region Halland on the GPT-SW3 validation project: GPT-SW3 applications in healthcare
Presentation at the reference group meeting “Offentlig sektor och tillämpad språkteknologi” at AI Sweden
Niclas Hertzberg & Anna Lokrantz (AI Sweden & Region Halland)
June 8, 2023
The Digital Assistant of the Future for the Public Sector
Presentation at the reference group meeting “Offentlig sektor och tillämpad språkteknologi” at AI Sweden
Jonatan Permert, AI Sweden
June 8, 2023
Training Material for the Interdisciplinary Expert Pool for NLU
This training material provides introductory resources for anyone who would like to gain knowledge of Large Language Models, with special attention to the ethical aspects of this technology. The material is aimed at both a technical and a non-technical audience. It consists of both essential and more advanced resources in the form of readings, podcasts, open courses, videos, and books.
The material is available on our online community MyAI. Create a free account to gain access to the material.
Alternatively, the material is available as a presentation via the 'To the presentation' link below.
This list has been compiled for the Interdisciplinary Expert Pool for NLU project.
Status Update: Data Readiness Lab, June 2022
Presentation at the reference group meeting "Offentlig sektor och tillämpad språkteknologi" at AI Sweden
Felix Stollenwerk, AI Sweden
June 8, 2022
Status Update: Language Models for Swedish Authorities, June 2022
Presentation at the reference group meeting "Offentlig sektor och tillämpad språkteknologi" at AI Sweden
Magnus Sahlgren, AI Sweden
June 8, 2022