GPT-SW3 validation project
Can a large-scale Swedish generative language model reduce the threshold for using advanced language models to solve text-processing tasks across a variety of sectors and use cases?
Final Seminar
On September 26, 2024, a final seminar for the validation project was held at AI Sweden's office in Stockholm. A recording of the seminar is available (in Swedish).
Challenges
In a number of projects, especially “Language Models for Swedish Authorities,” it has become clear that there is a tangible need for powerful Swedish language models in both the public and private sectors. However, it is challenging for individual actors to adapt and deploy models for their organizational needs because of difficulties in attracting skilled staff, poor data readiness, and the difficulty of acquiring suitable hardware.
AI Sweden believes that foundation models are a promising way to tackle these challenges. They have the potential to transform traditional ways of meeting text-processing needs.
Project purpose
This project tests the hypothesis that foundation models can reduce the threshold for using advanced language models to solve text-processing tasks in the private and public sectors.
It investigates whether a large-scale Swedish generative language model can serve as a general solution for text-processing tasks that many different stakeholders can use. The project is based on the large-scale generative language model GPT-SW3, which AI Sweden is developing in collaboration with RISE, WASP WARA Media & Language, and NVIDIA. The project will make GPT-SW3 available via an API and a user-friendly web-based interface, develop solutions for text-processing tasks (e.g., through prompting and p-tuning), and validate the use of the model across various use cases with clear need owners from the public sector, industry, and academia.
The project has two key innovations: the API and web-based interface for the model, and the validation of the model on a large number of different text-processing tasks.
How?
The project brings together stakeholders from the public sector, industry, and academia. AI Sweden and RISE will develop the API solution as the first step of the project. In the next step, the project focuses on different use cases in the stakeholders’ organizations.
These use cases focus on text-processing applications such as text generation, summarization, categorization, and information extraction.
The work will take place iteratively: AI Sweden and RISE validate and improve the API solution with the help of results from the various use cases. The stakeholders are responsible for their respective use cases with the support of AI Sweden and RISE.
Expected outcomes
If this project succeeds, it will drastically reduce the threshold for using advanced language models to solve text-processing tasks. This will lead to a disruptive change and an acceleration in the use of AI-driven Swedish language technology. Stakeholders will no longer need to invest in costly processes of collecting training data for their specific needs, nor will they have to purchase and operate expensive hardware resources. The results and resources produced within this project will be applicable in virtually all sectors and industries. It is hoped that the technology can be used as broadly as possible.
- Reduce the threshold for using advanced language models to solve text-processing tasks in the public and private sectors
- Cost and time savings for stakeholders
- Better services for citizens and customers
- Increase AI competence in the field of foundation models, which is valuable for Swedish competitiveness and societal benefit
Handbook for prompting
This handbook gives an overall explanation of what a language model is, introduces prompt engineering, and demonstrates its potential by going through some simple applications. It can be used as a tool for writing effective prompts on your own and, hopefully, for developing and improving specific applications.
The handbook is a work in progress and will be updated during the course of the project.
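As a rough illustration of what prompting for a simple application can look like, the sketch below assembles a few-shot prompt for a categorization task. It is not taken from the handbook or from the GPT-SW3 API: the function name build_few_shot_prompt, the "Text:" / "Category:" template, and the example labels are all assumptions made for this example. The code only builds the prompt text; how that text is sent to a model is left out.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    instruction: str,
    examples: List[Tuple[str, str]],
    query: str,
) -> str:
    """Assemble a few-shot categorization prompt as plain text.

    Hypothetical helper for illustration only; the template layout is an
    assumption, not a format prescribed by the project.
    """
    lines = [instruction, ""]
    # Each worked example shows the model the input/output pattern to follow.
    for text, label in examples:
        lines.append(f"Text: {text}")
        lines.append(f"Category: {label}")
        lines.append("")
    # The prompt ends with an open "Category:" slot for the model to complete.
    lines.append(f"Text: {query}")
    lines.append("Category:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    instruction="Classify each text as 'complaint' or 'question'.",
    examples=[
        ("My parcel arrived damaged.", "complaint"),
        ("When does the office open?", "question"),
    ],
    query="How do I renew my permit?",
)
print(prompt)
```

The point of the pattern is that the task is described entirely in the prompt: changing the instruction and the handful of examples adapts the same model to a different category scheme without collecting a training set.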
Status update: GPT-SW3: a foundational model for Swedish NLP
Presentation at the reference group meeting "Offentlig sektor och tillämpad språkteknologi" at AI Sweden
Magnus Sahlgren, AI Sweden
June 8, 2022
Status update: GPT-SW3 – En svensk basmodell för texthantering / valideringsprojekt (GPT-SW3: a Swedish foundation model for text processing / validation project)
Presentation at the reference group meeting “Offentlig sektor och tillämpad språkteknologi” at AI Sweden
Magnus Sahlgren, AI Sweden
December 15, 2022