

Swedish Medical Language Data Lab

An extensive amount of medical text data is produced in Swedish healthcare, dental care, and the life science sector. The Swedish Medical Language Data Lab aims to utilize this data to develop language models for the medical domain.


The Swedish Medical Language Data Lab aims to produce and make available medical language data sets and language models designed for Swedish medical terminology. The models will then be used in healthcare, dental care, and data-driven research and innovation within the life science sector.

The establishment of a national knowledge hub in the field of natural language processing (NLP) began with the Swedish Language Data Lab. Data and knowledge specific to the medical domain will be added to the hub during the course of this project. The project is the result of a preliminary study carried out during the autumn of 2019 as part of the Swedish Language Data Lab.

The project focuses on the users and on personal privacy. Its aim is to develop language models in a way that guides the creation of the technical, legal and ethical processes required to manage sensitive medical data in text form.

AI Innovation of Sweden is responsible for investigating and developing the procedures for making data and models available and for distributing the results.

The project is being coordinated by Sahlgrenska Science Park. Peltarion, RISE (Research Institutes of Sweden) and AI Innovation of Sweden are the expert resources for the project. The stakeholders and data owners are Region Halland, the Swedish Public Dental Service in Region Västra Götaland and Sahlgrenska University Hospital. The initial interest group consists of Sahlgrenska Academy at the University of Gothenburg, Essity Hygiene and Health AB and Språkbanken at the University of Gothenburg. A reference group consisting of organisations from the partners’ networks will also be set up.


Sahlgrenska Science Park described the project as “...the first building block for future AI applications using written Swedish medical language” [7]. By making use of the large volume of medical data available nationally in a reliable way, the project can help to increase the efficiency and quality of Swedish healthcare. In the long term, this will lead to shorter treatment times and lower care costs, but also to less suffering and a better quality of life for patients.

Two examples of applications for medical NLP are identifying healthcare injuries in medical text and enabling computers rather than people to read medical text.

Status (March 2020)

The project is still in its start-up phase. AI Innovation of Sweden has started to develop proposals for the activities that will form part of the work packages (WP) we are involved with. We will also share the lessons that we have learned from the legal work already carried out during the start-up of the Swedish Language Data Lab.

Future plans

The project will be completed in December 2021. 

WP2: In-depth requirements specification and analysis of needs

WP3: Inventory and evaluation of existing data and models (participating)

WP4: Technical adaptation of data

WP5: Design of models

WP6: Sharing and distribution of data (responsible)

WP7: Distribution of results to ensure appropriate use (participating)


Related activities

One example of a related, recently published advance in medical NLP is:

  • A project at ePsykiatrienheten (the e-psychiatry unit at Sahlgrenska University Hospital) that aims to “...predict which patients in inpatient care at the dependency clinic will be readmitted soon after being discharged” [8][9].


Project partners: Sahlgrenska Science Park, Region Halland, Folktandvården Västra Götalandsregionen, Sahlgrenska Universitetssjukhuset, Peltarion, RISE

The project is coordinated by Sahlgrenska Science Park. 

Project period: 2019-12-01 - 2021-12-31