What is GPT-SWE?
GPT stands for Generative Pre-trained Transformer. GPT-SWE is the first truly large-scale generative language model for the Swedish language.
What is a “generative language model”?
In AI, generative models are models that can create new data. Discriminative models, by contrast, learn to tell different kinds of data apart. A generative language model, then, is a model that can create text in a human language.
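To make the idea concrete, here is a minimal, purely illustrative sketch of a generative language model: a tiny bigram model that counts which word tends to follow which in a toy corpus, then samples new text from those counts. Real models like GPT-SWE are vastly larger neural networks, but the principle of "learn from text, then generate new text" is the same. All names and the corpus here are invented for illustration.

```python
import random
from collections import defaultdict

# Toy "training data" (illustrative only).
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# "Training": record which words follow which in the corpus.
transitions = defaultdict(list)
for current_word, next_word in zip(corpus, corpus[1:]):
    transitions[current_word].append(next_word)

def generate(start="the", length=8, seed=0):
    """Create new text by repeatedly sampling a plausible next word."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(length - 1):
        followers = transitions.get(words[-1])
        if not followers:
            break
        words.append(rng.choice(followers))
    return " ".join(words)

print(generate())
```

Even this toy model is "generative": it produces word sequences that were never in the training data, yet follow its statistical patterns.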
Are there other generative language models around?
Yes, for sure. Last year, OpenAI’s GPT-3 created a lot of buzz around the world. Within and outside the AI community, people were amazed by what GPT-3 could do. Examples ranged from writing poetry and generating HTML code to answering philosophical questions on the meaning of life.
Why do generative language models matter?
The early GPT-based models we see now are the first steps towards a world where you can talk to a computer in natural language and get instantaneous output that would normally require you to hire experts. In concrete terms, with these first models, you will be able to do summarization, classification, and other text-related tasks that you can formulate in natural language, without spending a lot of money on annotating data and training new models.
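What "formulating a task in natural language" means in practice is simply writing the task as a prompt and letting a generative model complete the text. The sketch below shows an illustrative prompt-building helper; `build_prompt` is a made-up function for this example, not part of any real API, and the resulting string would be sent to a generative model whose continuation serves as the answer.

```python
def build_prompt(task_instruction: str, document: str) -> str:
    """Wrap a document in a natural-language task description.

    Instead of annotating data and training a dedicated model,
    the task is stated directly in the prompt (illustrative only).
    """
    return f"{task_instruction}\n\nText: {document}\n\nAnswer:"

prompt = build_prompt(
    "Summarize the following text in one sentence.",
    "GPT-SWE is a large-scale generative language model for Swedish.",
)
print(prompt)
```

Swapping the instruction line turns the same setup into classification, translation, or question answering, which is why a single generative model can replace many task-specific ones.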
Before GPT-3 there was GPT-2, and it’s not a wild guess that there is a GPT-4 in the works. Why do we need a GPT-SWE?
GPT-SWE has been trained specifically to generate Swedish text. While much of what computers do works the same anywhere in the world, language models are different.
Each human language needs its own language models, and the more widely a language is used, the more interest it attracts from the research community and private companies. Swedish is a comparatively small language in those terms.
This means that we cannot expect the international community to provide a large-scale model for Swedish. We need to do this ourselves.