
Summing up the Swedish Language Data Lab

Language models, use cases, and the balance between technical and legal aspects: these were some of the main topics when wrapping up the first project initiated as part of AI Sweden’s strategic program Applied Language Technology. You can watch the recording of the webinar below or on our YouTube channel!

The webinar was organized to share results and learnings from the project, in which AI experts, academia, and the public sector collaborated closely to develop language models, use cases, and data sharing insights. The results contribute to the development and application of Swedish NLP and have laid the foundation for AI Sweden’s continued strategic program in applied language technology. Swedish is a relatively small language, so collaborations that contribute to development specifically in Swedish are key to accelerating the application of AI nationwide.

The overall goal of the Swedish Language Data Lab was to create a national knowledge hub within applied language technology, and thereby accelerate innovation, research, and applications in this area. The project was part of Vinnova’s Data-driven innovation funding programme which aims to increase the level of expertise in reusing data in innovations in Sweden.

The panel discussion highlighted the importance of the legal aspects of NLP, as well as how the project has increased understanding of them. It also became clear that guidance and clarification are needed on how language technology and the GDPR should interplay.

Moreover, the project enabled a first, investigative step toward sharing Swedish benchmark datasets. This is key for evaluating the quality of NLP-based tools with regard to parameters such as functionality and bias. The benchmark data is now being developed in the SuperLim project.

The insights and knowledge generated from the project will be available at the Knowledge Hub and distributed by project partners. For instance, the NER model and sentiment analysis models are available for use in new applications, and public sector organizations will have access to learnings and inspiration on how to start applying NLP in their work (see 'Resources' below).

Check out our YouTube channel to watch the recorded webinar!

Agenda

Introduction - Vanja Carlén (AI Sweden)
NER model and sentiment analysis models on fear and violence - Simon Westlund, Danila Petrelli, Fredrik Möller (Recorded Future)
Dialogue perspective - José Miguel Cano Santín, Fredrik Garneij (Talkamatic)
Use cases - Fredrik Carlsson, Frédéric Rambaud (SKR)
Sustainable language models - Peter Ljunglöf (Språkbanken), Jussi Karlgren (Gavagai/Spotify)
Legal insights when sharing data - Josefine Rembsgård (AI Sweden)
Panel discussion on the future of sharing Swedish text data - Staffan Truvé (Recorded Future), Peter Ljunglöf (Språkbanken), Jussi Karlgren (Gavagai/Spotify), Love Börjeson (AI Sweden/National Library of Sweden), and Josefine Rembsgård (AI Sweden)
Summary & key takeaways

Resources

  • AI Sweden will shortly publish a whitepaper on the legal learnings gained in the project. 

  • The NER model and the sentiment analysis models detecting fear and violence are available on Recorded Future's Hugging Face page, here. You can also read a blog post by Recorded Future on the sentiment analysis models, here.

  • Watch a longer presentation of the SKR use case from “Offentliga rummet”, here, and access the KB-BERT model that they will train for their use case, here

  • The first version of the SuperLim benchmark data resources is available through Språkbanken Text, here

  • The Swedish NLP webinar series is on a summer break, but you can watch the recorded webinars in the YouTube playlist, here

  • Read more about the collaboration pilot on federated learning by the National Library of Sweden and Scaleout Systems, here.

  • Learn more about two of the prototypes that Talkamatic has developed after connecting through the Swedish Language Data Lab, here (with Helsingborgs Stad) and here (with University of Örebro)

Want to know more? 

Vanja Carlén
Project manager

+46 (0)723-96 34 05
vanja.carlen@ai.se