GPT-SW3
AI Sweden, together with RISE and WASP WARA Media & Language, has developed a large-scale generative language model for the Nordic languages, primarily Swedish.
GPT-SW3 is the first truly large-scale generative language model for the Swedish language. Based on the same technical principles as the much-discussed GPT-4, GPT-SW3 will help Swedish organizations build language applications never before possible.
Do you want access to GPT-SW3?
We are now releasing GPT-SW3 openly. The models (with 126M, 356M, 1.3B, 6.7B, 20B, and 40B parameters) are accessible under an open and permissive license from AI Sweden’s repository on Hugging Face, where we also provide both a model card and a datasheet. Please note that you will need significant computational power to run the models.
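As a minimal sketch of what getting started looks like, the snippet below loads one of the smaller checkpoints with the Hugging Face transformers library and generates a short Swedish continuation. The repository id, prompt, and sampling settings are illustrative assumptions; check the model card on Hugging Face for the exact model names and recommended usage.

```python
# Minimal sketch: load a small GPT-SW3 checkpoint and generate Swedish text.
# The repository id below is an assumption; see AI Sweden's Hugging Face
# organization for the actual model names and sizes.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "AI-Sweden-Models/gpt-sw3-126m"  # assumed id; larger checkpoints need far more memory

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

prompt = "Träd är fina för att"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    output_ids = model.generate(
        input_ids,
        max_new_tokens=50,
        do_sample=True,
        temperature=0.8,
        top_p=0.9,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```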
- Join the conversation on Discord and keep an eye out for workshops and seminars for deep dives into use cases and collective problem-solving
- Want to read more? Check out our blog, Insights
- Sign up to receive updates on our work with GPT-SW3 and NLU
Q&A
What is GPT-SW3?
GPT-SW3 is a collection of large decoder-only pretrained transformer language models developed by AI Sweden in collaboration with RISE and the WASP WARA for Media and Language. Decoder language models are generative: they are built specifically to generate text (GPT stands for Generative Pretrained Transformer). GPT-SW3 is trained on massive amounts of Swedish, Norwegian, Danish, Icelandic, and English text data with the explicit purpose of generating text in Swedish and the other Nordic languages.
Is GPT-SW3 ready to use?
GPT-SW3 is not an off-the-shelf product or service that is ready for immediate use. Developers must build on its capabilities to construct applications such as chatbots or document summarization services.
Organizations can further tailor GPT-SW3 by training it on their own data sets for specific tasks.
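As a rough illustration of such tailoring, the sketch below continues training a small checkpoint on an organization's own plain-text corpus using the Hugging Face Trainer. The file path, hyperparameters, and repository id are hypothetical; a real fine-tuning run would require careful data preparation and substantially more compute.

```python
# Hedged sketch: adapt a GPT-SW3 checkpoint to domain-specific text with the
# Hugging Face Trainer. Paths, hyperparameters, and the model id are
# illustrative assumptions, not recommendations from AI Sweden.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "AI-Sweden-Models/gpt-sw3-126m"  # assumed id
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # GPT-style tokenizers often lack a pad token
model = AutoModelForCausalLM.from_pretrained(model_id)

# One document per line in a plain-text file (hypothetical path).
dataset = load_dataset("text", data_files={"train": "my_domain_texts.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="gpt-sw3-finetuned",
        per_device_train_batch_size=2,
        num_train_epochs=1,
    ),
    train_dataset=tokenized["train"],
    # Causal LM objective: labels are the input ids, no masking.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```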
Who can use GPT-SW3?
With the open release, any individual, company, government agency, or organization can tap into the power of GPT-SW3 to build products and services.
Where can I find the model?
You can find the models in AI Sweden's repository on Hugging Face.
What can GPT-SW3 be used for?
You can use GPT-SW3 for any type of task that is a good fit for an LLM: text analysis, classification, content generation, and content moderation, among a vast array of other language-based applications.
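For instance, a classification task can often be handled with plain prompting rather than fine-tuning. The toy example below frames Swedish sentiment classification as text generation; the prompt wording and model id are assumptions made purely for illustration.

```python
# Toy sketch: prompt-based sentiment classification with a GPT-SW3 checkpoint.
# The model id and prompt format are illustrative assumptions.
from transformers import pipeline

generator = pipeline("text-generation", model="AI-Sweden-Models/gpt-sw3-126m")

prompt = (
    "Avgör om recensionen är positiv eller negativ.\n"
    "Recension: Maten var fantastisk och personalen mycket trevlig.\n"
    "Svar:"
)

result = generator(prompt, max_new_tokens=3, do_sample=False)
print(result[0]["generated_text"])
```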
Does AI Sweden provide support for GPT-SW3?
No. GPT-SW3 is not a product, and AI Sweden doesn't have the resources to provide support. However, there's a vibrant Nordic developer community on Discord's AI Nordics server, with a channel dedicated to GPT-SW3, as well as a developer community on Hugging Face.
Facts
Project partners: AI Sweden, RISE and WASP WARA Media & Language.
The current GPT-SW3 models were trained on Linköping University's supercomputer Berzelius, using the NeMo Megatron framework from NVIDIA.