With the development of GPT-SW3, AI Sweden, together with its partners, is building the first truly large-scale generative language model for the Swedish language. The model is now ready for partners to test, in a joint effort to validate it in real-life applications.
In December 2021, AI Sweden presented GPT-SW3, a model with approximately 3.5 billion parameters. It builds on 106 GB of Swedish text and was trained in a collaboration between AI Sweden and RISE on Linköping University’s supercomputer, Berzelius, using the Megatron framework from NVIDIA.
Building large-scale language models for Swedish and the other Nordic languages is only possible through a joint effort in which different actors contribute competence, data, and computing power. During summer 2022, AI Sweden, in collaboration with WASP WARA Media & Language and RISE, will train a significantly larger GPT-SW3 model with up to 175 billion parameters on around 1 TB of text data, primarily in Swedish but also in Norwegian, Danish, Icelandic, and English.
Follow the process, the training, and our NLU team’s reflections on our brand new Medium page.
As the size of the model grows, it will become more competent in all kinds of language processing tasks, such as text classification, information extraction, question answering, summarization, translation, text generation, and idea generation. Very large language models have been shown to exhibit zero-shot capability, which means that the model can solve a task without having been specifically trained for it. These models have the potential to revolutionize the ways in which text data can be processed, analyzed, and utilized. They open up completely new ways of creating benefits for society, citizens, and customers.
One important step in the development of very large language models is testing and validation, which also requires a communal effort. This includes exploring the possibilities and limitations of the models; for example, biases in the training data can have negative effects in certain use cases. We therefore now invite AI Sweden’s partners to test and validate the first iteration of GPT-SW3 and to share their feedback with us.
In this way, together we can accelerate the use of AI for the benefit of our society, our competitiveness, and everyone living in Sweden.
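To make the zero-shot idea concrete, here is a minimal sketch of how a task can be posed to a generative model purely as text, with no solved examples and no task-specific training. The function name `build_zero_shot_prompt`, the prompt layout, and the Swedish review are illustrative assumptions, not part of GPT-SW3 or any official interface; sending the prompt to a model is deliberately left out.

```python
def build_zero_shot_prompt(task_instruction: str, text: str, answer_label: str = "Svar") -> str:
    """Compose a zero-shot prompt: an instruction plus the input text.

    No solved examples are included, which is what makes it "zero-shot".
    The layout is a hypothetical convention chosen for illustration.
    """
    return f"{task_instruction}\n\nText: {text}\n{answer_label}:"


# Hypothetical sentiment task phrased in Swedish; the model would be
# expected to continue the text after "Svar:" with its answer.
prompt = build_zero_shot_prompt(
    "Är följande recension positiv eller negativ?",
    "Maten var fantastisk och personalen mycket trevlig.",
)
print(prompt)
```

The same helper could be reused for summarization or translation simply by changing the instruction, which is the practical appeal of zero-shot use: the task is defined by the prompt rather than by a separate training run.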
We need more hands on deck to explore and test GPT-SW3. What challenges would you like to try to solve with GPT-SW3? Partners can get access by filling in this simple form.
For more information, don't hesitate to get in touch with Ariel Ekgren.
→ The NLU Team on Medium
→ What's the latest with the GPT-SWE? From the presentation in December.
→ More technical details on Hugging Face
We also recommend that you join the AI Nordics Discord server. There you'll find a channel dedicated to GPT-SW3 where you can get help and share what you learn.