The webinar was organized to share results and learnings from the project, in which AI experts, academia, and the public sector worked in close collaboration to develop language models, use cases, and data sharing insights. The results contribute to the development and application of Swedish NLP and have laid the foundation for AI Sweden’s continued strategic program in applied language technology. Swedish is a relatively small language, so collaborations that contribute to development specifically in Swedish are key to accelerating the application of AI nationwide.
The overall goal of the Swedish Language Data Lab was to create a national knowledge hub for applied language technology, and thereby accelerate innovation, research, and applications in this area. The project was part of Vinnova’s Data-driven innovation funding programme, which aims to increase the level of expertise in reusing data in innovations in Sweden.
During the panel discussion, the importance of the legal aspects of NLP was highlighted, as well as the increased understanding of these aspects gained through the project. It also became clear that guidance and clarification are needed on how language technology and the GDPR should interact.
Moreover, the project enabled a first, investigative step toward sharing Swedish benchmark datasets. This is key for evaluating the quality of NLP-based tools with regard to parameters such as functionality and bias. The benchmark data is now being developed in the SuperLim project.
The insights and knowledge generated from the project will be available at the Knowledge Hub and distributed by project partners. For instance, the NER model and sentiment analysis models are available for use in new applications, and public sector organizations will have access to learnings and inspiration on how to start applying NLP in their work (see 'Resources' below).
Check out our YouTube channel to watch the recorded webinar!
Introduction - Vanja Carlén (AI Sweden)
NER model and sentiment analysis models on fear and violence - Simon Westlund, Danila Petrelli, Fredrik Möller (Recorded Future)
Dialogue perspective - José Miguel Cano Santín, Fredrik Garneij (Talkamatic)
Use cases - Fredrik Carlsson, Frédéric Rambaud (SKR)
Sustainable language models - Peter Ljunglöf (Språkbanken), Jussi Karlgren (Gavagai/Spotify)
Legal insights when sharing data - Josefine Rembsgård (AI Sweden)
Panel discussion on the future of sharing Swedish text data - Staffan Truvé (Recorded Future), Peter Ljunglöf (Språkbanken), Jussi Karlgren (Gavagai/Spotify), Love Börjeson (AI Sweden/National Library of Sweden), and Josefine Rembsgård (AI Sweden)
Summary & key takeaways
AI Sweden will shortly publish a whitepaper on the legal learnings gained in the project.
The NER model and the sentiment analysis models detecting fear and violence are available on Recorded Future's Hugging Face page, here. You can also read a blog post by Recorded Future on the sentiment analysis models, here.
The first version of the SuperLim benchmark data resources is available through Språkbanken Text, here.
The Swedish NLP webinar series is on a summer break, but you can watch the recorded webinars in the YouTube playlist, here.
Read more about the collaboration pilot on federated learning by the National Library of Sweden and Scaleout Systems, here.
Want to know more?