Swedish Medical Language Data Lab
Natural Language Processing (NLP) has the potential to provide insights from the large amounts of medical text data that are generated daily in the Swedish healthcare and life science sector. Patterns, insights, and concepts that today seem to be hidden in this unstructured data could be brought to light by applying medical NLP. Searching, analysing, and interpreting these data sets can give us a deeper understanding of healthcare quality, improve methods, and lead to better results for patients.
The Swedish Medical Language Data Lab aims to develop medical language models and to increase the accessibility of medical data, in accordance with the ethical, legal, and technical processes that are required to manage sensitive medical data in text form. The models are being developed for application within healthcare, dental care, and data-driven research and innovation within the life science sector.
The initial use cases are provided by the stakeholders and data owners in the project. A clear focus on needs, end users, and personal integrity constitutes the guiding criteria for the project.
The establishment of a national knowledge node in the field of NLP began with the Swedish Language Data Lab. Models, methods, and lessons that are specific to the medical domain will be shared during the course of this project.
The project is coordinated by Sahlgrenska Science Park. Peltarion, RISE (Research Institutes of Sweden) and AI Sweden are the expert resources for the project. The stakeholders and data owners are Region Halland, Folktandvården Västra Götalandsregionen and Sahlgrenska University Hospital.
The initial interest group consists of Sahlgrenska Academy at the University of Gothenburg, Essity Hygiene and Health AB, and Språkbanken at the University of Gothenburg. A reference group consisting of organisations from the partners’ networks will also be set up.
AI Sweden is responsible for investigating and developing the procedures for making data and models available and for distributing the results.