
Scientific publications


In many areas, including federated learning and NLU, AI Sweden's work is at the absolute forefront. Some of the work carried out together with our partners results in scientific articles. Those publications are collected here.

The publications are divided into two main categories. Click the buttons below to navigate directly to the relevant section.

Selected publications


Distributional Legacy: The Unreasonable Effectiveness of Harris’s Distributional Program

To publication

Decoupled Subgraph Federated Learning

Javad Aliakbari, Johan Östman and Alexandre Graell i Amat. 2024. FedStruct: Federated Decoupled Learning over Interconnected Graphs.

To publication

Distributional Semantics

Alessandro Lenci and Magnus Sahlgren. 2023. Distributional semantics. Cambridge University Press.

To publication

Publishing Neural Networks in Drug Discovery Might Compromise Training Data Privacy

 Fabian P. Krüger, Johan Östman, Lewis Mervin, Igor V. Tetko and Ola Engkvist. Submitted to Nature Machine Intelligence 2024.

To publication

FEDERATED LEARNING & SECURITY

Isaksson, M., Listo Zec, E., Cöster, R., Gillblad, D., Girdzijauskas, S. (2023). Adaptive Expert Models for Federated Learning. In: Goebel, R., Yu, H., Faltings, B., Fan, L., Xiong, Z. (eds) Trustworthy Federated Learning. FL 2022. Lecture Notes in Computer Science, vol 13448. Springer, Cham. https://doi.org/10.1007/978-3-031-28996-5_1

Abstract

Federated Learning (FL) is a promising framework for distributed learning when data is private and sensitive. However, the state-of-the-art solutions in this framework are not optimal when data is heterogeneous and non-IID. We propose a practical and robust approach to personalization in FL that adjusts to heterogeneous and non-IID data by balancing exploration and exploitation of several global models. To achieve our aim of personalization, we use a Mixture of Experts (MoE) that learns to group clients that are similar to each other, while using the global models more efficiently. We show that our approach achieves an accuracy up to 29.78% better than the state-of-the-art and up to 4.38% better compared to a local model in a pathological non-IID setting, even though we tune our approach in the IID setting.
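
To make the mixture-of-experts idea concrete, here is a minimal PyTorch sketch of a client-side gate mixing several frozen global models into a personalized prediction. The dimensions, the gate architecture, and the mixing rule are illustrative assumptions, not the paper's implementation.

    # Hedged sketch: gate-weighted mixture of global models for personalization.
    import torch
    import torch.nn as nn

    n_global, in_dim, n_classes = 3, 20, 10
    global_models = [nn.Linear(in_dim, n_classes) for _ in range(n_global)]  # stand-ins for FL global models
    gate = nn.Sequential(nn.Linear(in_dim, n_global), nn.Softmax(dim=-1))    # trained on the client's own data

    def personalized_forward(x):
        mix = gate(x)                                               # (batch, n_global) mixing weights
        outs = torch.stack([g(x) for g in global_models], dim=-1)   # (batch, n_classes, n_global)
        return (outs * mix.unsqueeze(1)).sum(dim=-1)                # gate-weighted prediction

    x = torch.randn(4, in_dim)
    print(personalized_forward(x).shape)                            # torch.Size([4, 10])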

J. Martinsson, E. L. Zec, D. Gillblad and O. Mogren, "Adversarial representation learning for synthetic replacement of private attributes," 2021 IEEE International Conference on Big Data (Big Data), Orlando, FL, USA, 2021, pp. 1291-1299, doi: 10.1109/BigData52589.2021.9671802.

Abstract

Data privacy is an increasingly important aspect of many real-world applications. Data sources that contain sensitive information may have immense potential which could be unlocked using the right privacy enhancing transformations, but current methods often fail to produce convincing output. Furthermore, finding the right balance between privacy and utility is often a tricky trade-off. In this work, we propose a novel approach for data privatization, which involves two steps: in the first step, it removes the sensitive information, and in the second step, it replaces this information with an independent random sample. Our method builds on adversarial representation learning which ensures strong privacy by training the model to fool an increasingly strong adversary. While previous methods only aim at obfuscating the sensitive information, we find that adding new random information in its place strengthens the provided privacy and provides better utility at any given level of privacy. The result is an approach that can provide stronger privatization on image data while preserving both the domain and the utility of the inputs, entirely independent of the downstream task.

Carrasco Limeros, S. et al. (2023). Assessing GAN-Based Generative Modeling on Skin Lesions Images. In: Biele, C., Kacprzyk, J., Kopeć, W., Owsiński, J.W., Romanowski, A., Sikorski, M. (eds) Digital Interaction and Machine Intelligence. MIDI 2022. Lecture Notes in Networks and Systems, vol 710. Springer, Cham. https://doi.org/10.1007/978-3-031-37649-8_10

Abstract

We explored unconditional and conditional Generative Adversarial Networks (GANs) in centralized and decentralized settings. The centralized setting imitates studies on a large but highly unbalanced skin lesion dataset, while the decentralized one simulates a more realistic hospital scenario with three institutions. We evaluated the models' performance in terms of fidelity, diversity, speed of training, and predictive ability of classifiers trained on the generated synthetic data. In addition, we provided explainability focused on both global and local features. The calculated distance between real images and their projections in the latent space proved the authenticity of the generated samples, which is one of the main concerns in this type of application. The code for the studies is publicly available (https://github.com/aidotse/stylegan2-ada-pytorch).

Rickard Brännvall, Helena Linge, Johan Östman. 2023. Can the use of privacy enhancing technologies enable federated learning for health data applications in a Swedish regulatory context? In: 35th Annual Workshop of the Swedish Artificial Intelligence Society (SAIS 2023).

Abstract

A recent report by the Swedish Authority for Privacy Protection (IMY) evaluates the potential of jointly training and exchanging machine learning models between two healthcare providers. In relation to the privacy problems identified therein, this article explores the trade-off between utility and privacy when using privacy-enhancing technologies (PETs) in combination with federated learning. Results are reported from numerical experiments with standard text-book machine learning models under both differential privacy (DP) and Fully Homomorphic Encryption (FHE). The results indicate that FHE is a promising approach for privacy-preserving federated learning, with the CKKS scheme being more favorable in terms of computational performance due to its support of SIMD operations and compact representation of encrypted vectors. The results for DP are more inconclusive. The article briefly discusses the current regulatory context and aspects that lawmakers may consider to enable an AI leap in Swedish healthcare while maintaining data protection.
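
As an illustration of why CKKS is attractive in this setting, the sketch below uses the open-source TenSEAL library to add encrypted update vectors packed into single ciphertexts (the SIMD-style batching the abstract refers to). The parameters are placeholders, and this is not the experimental code from the article.

    # Hedged sketch: encrypted aggregation of client updates under CKKS (TenSEAL).
    import tenseal as ts

    # Illustrative CKKS parameters; not tuned for security or precision.
    context = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192,
                         coeff_mod_bit_sizes=[60, 40, 40, 60])
    context.global_scale = 2 ** 40

    # Each client encrypts its flattened model update into one packed ciphertext.
    client_updates = [[0.10, -0.20, 0.05], [0.30, 0.00, -0.10]]
    encrypted = [ts.ckks_vector(context, u) for u in client_updates]

    # The aggregator adds ciphertexts without ever seeing individual updates.
    aggregate = encrypted[0] + encrypted[1]
    average = aggregate * (1.0 / len(encrypted))

    print(average.decrypt())  # only the secret-key holder recovers the mean update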

Aleksis Pirinen, Nosheen Abid, Nuria Agues Paszkowsky, Thomas Ohlson Timoudas, Ronald Scheirer, Chiara Ceccobello, György Kovács, and Anders Persson. Creating and Leveraging a Synthetic Dataset of Cloud Optical Thickness Measures for Cloud Detection in MSI. (preprint)

Abstract

Cloud formations often obscure optical satellite-based monitoring of the Earth's surface, thus limiting Earth observation (EO) activities such as land cover mapping, ocean color analysis, and cropland monitoring. The integration of machine learning (ML) methods within the remote sensing domain has significantly improved performance on a wide range of EO tasks, including cloud detection and filtering, but there is still much room for improvement. A key bottleneck is that ML methods typically depend on large amounts of annotated data for training, which is often difficult to come by in EO contexts. This is especially true when it comes to cloud optical thickness (COT) estimation. A reliable estimation of COT enables more fine-grained and application-dependent control compared to using pre-specified cloud categories, as is commonly done in practice. To alleviate the COT data scarcity problem, in this work we propose a novel synthetic dataset for COT estimation, which we subsequently leverage for obtaining reliable and versatile cloud masks on real data. In our dataset, top-of-atmosphere radiances have been simulated for 12 of the spectral bands of the Multispectral Imagery (MSI) sensor onboard Sentinel-2 platforms. These data points have been simulated under consideration of different cloud types, COTs, and ground surface and atmospheric profiles. Extensive experimentation of training several ML models to predict COT from the measured reflectivity of the spectral bands demonstrates the usefulness of our proposed dataset. In particular, by thresholding COT estimates from our ML models, we show on two satellite image datasets (one that is publicly available, and one which we have collected and annotated) that reliable cloud masks can be obtained. The synthetic data, the collected real dataset, code, and models have been made publicly available.
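
The thresholding step mentioned in the abstract is straightforward; a minimal NumPy sketch follows, where both the COT map and the cutoff are assumed placeholders rather than values from the paper.

    # Hedged sketch: turning predicted cloud optical thickness into a cloud mask.
    import numpy as np

    cot_pred = np.random.rand(256, 256) * 10.0           # stand-in for a model's COT map
    THRESHOLD = 0.5                                       # assumed application-dependent cutoff

    cloud_mask = (cot_pred > THRESHOLD).astype(np.uint8)  # 1 = cloudy, 0 = clear
    print(f"cloudy fraction: {cloud_mask.mean():.2%}")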

“Decentralised semi-supervised onboard learning for scene classification in low-earth orbit”. J Östman, P Gomez, VM Shreenath, G Meoni. Proceedings of the 14th IAA Symposium on Small Satellites for Earth Observation, SSEO, 2023.

Abstract

Onboard machine learning on the latest satellite hardware offers the potential for significant savings in communication and operational costs. We showcase the training of a machine learning model on a satellite constellation for scene classification using semi-supervised learning while accounting for operational constraints such as temperature and limited power budgets based on satellite processor benchmarks of the neural network. We evaluate mission scenarios employing both decentralised and federated learning approaches. All scenarios achieve convergence to high accuracy (around 91% on EuroSAT RGB dataset) within a one-day mission timeframe.

Johan Östman, Ather Gattami, and Daniel Gillblad. 2023. Decentralized Online Bandit Optimization on Directed Graphs with Regret Bounds. arXiv preprint arXiv:2301.11802. (preprint) https://arxiv.org/pdf/2301.11802.pdf

Abstract

We consider a decentralized multiplayer game, played over T rounds, with a leader-follower hierarchy described by a directed acyclic graph. For each round, the graph structure dictates the order of the players and how players observe the actions of one another. By the end of each round, all players receive a joint bandit-reward based on their joint action that is used to update the player strategies towards the goal of minimizing the joint pseudo-regret. We present a learning algorithm inspired by the single-player multi-armed bandit problem and show that it achieves sub-linear joint pseudo-regret in the number of rounds for both adversarial and stochastic bandit rewards. Furthermore, we quantify the cost incurred due to the decentralized nature of our problem compared to the centralized setting.
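
For readers unfamiliar with the single-player building block, the sketch below is a textbook EXP3-style update with importance-weighted reward estimates; the paper's multiplayer, graph-structured algorithm is considerably more involved.

    # Textbook EXP3-style bandit update (illustration only, not the paper's algorithm).
    import numpy as np

    rng = np.random.default_rng(0)
    K, T, eta = 5, 2000, 0.05
    log_w = np.zeros(K)                                   # log-weights for numerical stability

    for t in range(T):
        probs = np.exp(log_w - log_w.max())
        probs /= probs.sum()
        arm = int(rng.choice(K, p=probs))
        reward = float(rng.random() < 0.2 + 0.1 * arm)    # stand-in Bernoulli bandit reward
        log_w[arm] += eta * reward / probs[arm]           # importance-weighted, unbiased estimate

    print("estimated best arm:", int(np.argmax(log_w)))   # arm 4 has the highest mean reward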

“Efficient Node Selection in Private Personalized Decentralized Learning”. EL Zec, J Östman, O Mogren, D Gillblad. Northern Lights Deep Learning Conference, 2024.

Abstract

Personalized decentralized learning is a promising paradigm for distributed learning, enabling each node to train a local model on its own data and collaborate with other nodes to improve without sharing any data. However, this approach poses significant privacy risks, as nodes may inadvertently disclose sensitive information about their data or preferences through their collaboration choices. In this paper, we propose Private Personalized Decentralized Learning (PPDL), a novel approach that combines secure aggregation and correlated adversarial multi-armed bandit optimization to protect node privacy while facilitating efficient node selection. By leveraging dependencies between different arms, represented by potential collaborators, we demonstrate that PPDL can effectively identify suitable collaborators solely based on aggregated models. Additionally, we show that PPDL surpasses previous non-private methods in model performance on standard benchmarks under label and covariate shift scenarios.

“FedGT: Identification of Malicious Clients in Federated Learning with Secure Aggregation”. M Xhemrishi, J Östman, A Wachter-Zeh, AG i Amat. arXiv preprint arXiv:2305.05506, 2023.

Abstract

We propose FedGT, a novel framework for identifying malicious clients in federated learning with secure aggregation. Inspired by group testing, the framework leverages overlapping groups of clients to identify the presence of malicious clients in the groups via a decoding operation. The clients identified as malicious are then removed from the model training, which is performed over the remaining clients. By choosing the size, number, and overlap between groups, FedGT strikes a balance between privacy and security. Specifically, the server learns the aggregated model of the clients in each group - vanilla federated learning and secure aggregation correspond to the extreme cases of FedGT with group size equal to one and the total number of clients, respectively. The effectiveness of FedGT is demonstrated through extensive experiments on the MNIST, CIFAR-10, and ISIC2019 datasets in a cross-silo setting under different data-poisoning attacks. These experiments showcase FedGT's ability to identify malicious clients, resulting in high model utility. We further show that FedGT significantly outperforms the private robust aggregation approach based on the geometric median recently proposed by Pillutla et al. in multiple settings.
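
A toy version of the group-testing intuition: under the classical COMP decoding rule (used here purely for illustration; FedGT's decoder is more sophisticated), every client that appears in at least one group testing negative is cleared, and only the rest remain suspect.

    # COMP-style group-testing decode over overlapping client groups (toy example).
    n_clients = 6
    groups = [[0, 1, 2], [2, 3, 4], [4, 5, 0], [1, 3, 5]]
    truly_malicious = {3}                                  # ground truth, unknown to the server

    # A group "tests positive" if it contains at least one malicious client.
    positive = [any(c in truly_malicious for c in g) for g in groups]

    # COMP: clients appearing in any negative group are declared benign.
    suspects = set(range(n_clients))
    for group, pos in zip(groups, positive):
        if not pos:
            suspects -= set(group)

    print("flagged as potentially malicious:", suspects)   # {3}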

“FedStruct: Federated Decoupled Learning over Interconnected Graphs”. J Aliakbari, J Östman, AG i Amat. arXiv preprint arXiv:2402.19163, 2024.

Abstract

We address the challenge of federated learning on graph-structured data distributed across multiple clients. Specifically, we focus on the prevalent scenario of interconnected subgraphs, where interconnections between different clients play a critical role. We present a novel framework for this scenario, named FedStruct, that harnesses deep structural dependencies. To uphold privacy, unlike existing methods, FedStruct eliminates the necessity of sharing or generating sensitive node features or embeddings among clients. Instead, it leverages explicit global graph structure information to capture inter-node dependencies. We validate the effectiveness of FedStruct through experimental results conducted on six datasets for semi-supervised node classification, showcasing performance close to the centralized approach across various scenarios, including different data partitioning methods, varying levels of label availability, and number of clients.

Valadi, Viktor, et al. FedVal: Different good or different bad in federated learning. In: 32nd USENIX Security Symposium (USENIX Security 23). 2023.

Abstract

Federated learning (FL) systems are susceptible to attacks from malicious actors who might attempt to corrupt the training model through various poisoning attacks. FL also poses new challenges in addressing group bias, such as ensuring fair performance for different demographic groups. Traditional methods used to address such biases require centralized access to the data, which FL systems do not have. In this paper, we present a novel approach FedVal for both robustness and fairness that does not require any additional information from clients that could raise privacy concerns and consequently compromise the integrity of the FL system. To this end, we propose an innovative score function based on a server-side validation method that assesses client updates and determines the optimal aggregation balance between locally-trained models. Our research shows that this approach not only provides solid protection against poisoning attacks but can also be used to reduce group bias and subsequently promote fairness while maintaining the system's capability for differential privacy. Extensive experiments on the CIFAR-10, FEMNIST, and PUMS ACSIncome datasets in different configurations demonstrate the effectiveness of our method, resulting in state-of-the-art performances. We have proven robustness in situations where 80% of participating clients are malicious. Additionally, we have shown a significant increase in accuracy for underrepresented labels from 32% to 53%, and increase in recall rate for underrepresented features from 19% to 50%.
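
Below is a hedged sketch of the server-side validation idea in the spirit of FedVal: client updates that validate poorly on held-out server data receive little aggregation weight. The score function, the softmax weighting, and the toy validation loss are our illustrative choices, not the paper's.

    # Hedged sketch: validation-scored aggregation of client updates.
    import numpy as np

    def aggregate(updates, server_weights, val_loss, temperature=0.1):
        """Weight each client update by how well the updated model validates."""
        scores = np.array([-val_loss(server_weights + u) for u in updates])
        w = np.exp((scores - scores.max()) / temperature)   # softmax over scores
        w /= w.sum()
        return sum(wi * ui for wi, ui in zip(w, updates))

    server = np.zeros(3)
    target = np.array([1.0, -1.0, 0.5])                      # toy "ideal" weights
    val_loss = lambda w: float(np.linalg.norm(w - target))   # toy server-side validation loss
    updates = [0.10 * target, -0.50 * target, 0.12 * target] # the middle update is poisoned
    print(aggregate(updates, server, val_loss))              # poisoned update gets ~zero weight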

“On Local Mutual-Information Privacy”. Khac-Hoang Ngo, Johan Östman, Alexandre Graell i Amat. In: ITW, 2024.

Abstract

Local mutual-information privacy (LMIP) is a privacy notion that aims to quantify the reduction of uncertainty about the input data when the output of a privacy-preserving mechanism is revealed. We study the relation of LMIP with local differential privacy (LDP), the de facto standard notion of privacy in context-independent scenarios, and with local information privacy (LIP), the state-of-the-art notion for context-dependent settings. We establish explicit conversion rules, i.e., bounds on the privacy parameters for an LMIP mechanism to also satisfy LDP/LIP, and vice versa. We use our bounds to formally verify that LMIP is a weak privacy notion. We also show that uncorrelated Gaussian noise is the best-case noise in terms of context-independent LMIP if both the input data and the noise are subject to an average power constraint.
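
For reference, LMIP is usually stated as a bound on the mutual information between a mechanism's input X and output Y; in our paraphrase (notation assumed, not copied from the paper):

    % epsilon-LMIP as a mutual-information bound (our paraphrase).
    I(X;Y) \;=\; \mathbb{E}\!\left[ \log \frac{p_{X,Y}(X,Y)}{p_X(X)\, p_Y(Y)} \right] \;\le\; \varepsilon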

“PASEOS Simulates the Environment for Operating Multiple Spacecraft”. P Gómez, J Östman, VM Shreenath, G Meoni. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (JSTARS), 2024.

Abstract

The next generation of spacecraft technology is anticipated to enable novel applications, including onboard processing, machine learning, and decentralized operational scenarios. Although several of these applications have been previously investigated, the real-world operational limitations associated with actual mission scenarios have been only superficially addressed. Here, we present an open-source Python module called PASEOS, capable of modeling operational scenarios involving one or multiple spacecraft. It considers several physical phenomena, including thermal, power, bandwidth, and communications constraints, and the impact of radiation on spacecraft. PASEOS can be run as a high-performance-oriented numerical simulation and/or in a real-time mode on edge hardware. We demonstrate these capabilities in three scenarios: one in real-time simulation on a Unibap iX-10 100 satellite processor, another in a simulation modeling an entire constellation performing tasks over several hours, and one training a machine learning model in a decentralized setting. While we demonstrate tasks in Earth orbit, PASEOS also allows deep space scenarios. Our results show that PASEOS can model the described scenarios efficiently and thus provide insight into operational considerations. We show this by measuring runtime and overhead as well as by investigating the constellation's modeled temperature, battery status, and communication windows. By running PASEOS on an actual satellite processor, we showcase how PASEOS can be directly included in hardware demonstrators for future missions. Overall, we provide the first solution to holistically model the physical constraints spacecraft encounter in space. The PASEOS module is available online with extensive documentation, enabling researchers to incorporate it into their studies quickly.

“Poisoning Attacks on Federated Learning for Autonomous Driving”, Sonakshi Garg, Hugo Jönsson, Gustav Kalander, Axel Nilsson, Bhhaanu Pirange, Viktor Valadi, Johan Östman. In: SCAI 2024.

Abstract

Federated Learning (FL) is a decentralized learning paradigm, enabling parties to collaboratively train models while keeping their data confidential. Within autonomous driving, it brings the potential of reducing data storage costs, reducing bandwidth requirements, and accelerating the learning process. FL is, however, susceptible to poisoning attacks. In this paper, we introduce two novel poisoning attacks on FL tailored to regression tasks within autonomous driving: FLStealth and Off-Track Attack (OTA). FLStealth, an untargeted attack, aims at providing model updates that deteriorate the global model performance while appearing benign. OTA, on the other hand, is a targeted attack with the objective to change the global model's behavior when exposed to a certain trigger. We demonstrate the effectiveness of our attacks by conducting comprehensive experiments pertaining to the task of vehicle trajectory prediction. In particular, we show that, among five different untargeted attacks, FLStealth is the most successful at bypassing the considered defenses employed by the server. For OTA, we demonstrate the inability of common defense strategies to mitigate the attack, highlighting the critical need for new defensive mechanisms against targeted attacks within FL for autonomous driving.
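
As a generic illustration of untargeted poisoning (explicitly not the paper's FLStealth or OTA attacks), a malicious client can flip and scale an otherwise plausible update. Naive attacks like this are exactly what norm-based defenses catch, which is why stealthier attacks such as those studied in the paper matter.

    # Generic sign-flip poisoning of a model update (illustration only).
    import numpy as np

    honest_update = np.random.randn(100) * 0.01   # stand-in benign update
    poisoned_update = -5.0 * honest_update        # flipped and scaled to degrade the global model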

"Publishing Neural Networks in Drug Discovery Might Compromise Training Data Privacy”, Fabian Krüger, Johan Östman, Lewis Mervin, Igor V. Tetko, and Ola Engkvist. Submitted to Nature Machine Intelligence 2024.

Abstract

This study investigates the risks of exposing confidential chemical structures when machine learning models trained on these structures are made publicly available. We use membership inference attacks, a common method to assess privacy that is largely unexplored in the context of drug discovery, to examine neural networks for molecular property prediction in a black-box setting. Our results reveal significant privacy risks across all evaluated datasets and neural network architectures. Combining multiple attacks increases these risks. Molecules from minority classes, often the most valuable in drug discovery, are particularly vulnerable. We also found that representing molecules as graphs and using message-passing neural networks may mitigate these risks. We provide a framework to assess privacy risks of classification models and molecular representations. Our findings highlight the need for careful consideration when sharing neural networks trained on proprietary chemical structures, informing organisations and researchers about the trade-offs between data confidentiality and model openness.
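
The simplest member of the attack family referenced above is a loss-threshold membership inference attack: training molecules tend to incur lower loss, so a threshold on the model's loss separates members from non-members. The sketch below uses stand-in loss distributions, not results from the paper.

    # Hedged sketch: loss-threshold membership inference with stand-in losses.
    import numpy as np

    def loss_threshold_mia(losses_member, losses_nonmember, tau):
        """Predict 'member' whenever the model's loss is below tau."""
        tpr = (losses_member < tau).mean()
        fpr = (losses_nonmember < tau).mean()
        return tpr, fpr

    rng = np.random.default_rng(1)
    members = rng.gamma(2.0, 0.3, 1000)        # training molecules: lower loss on average
    nonmembers = rng.gamma(2.0, 0.6, 1000)     # held-out molecules
    tpr, fpr = loss_threshold_mia(members, nonmembers, tau=0.8)
    print(f"TPR {tpr:.2f} vs FPR {fpr:.2f}")   # a gap indicates privacy leakage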

Plumridge, M., Maråk, R., Ceccobello, C., Gómez, P., Meoni, G., Svoboda, F., & Lane, N. D. (2024). Rapid Distributed Fine-tuning of a Segmentation Model Onboard Satellites. ArXiv. https://arxiv.org/abs/2411.17831

Abstract

Segmentation of Earth observation (EO) satellite data is critical for natural hazard analysis and disaster response. However, processing EO data at ground stations introduces delays due to data transmission bottlenecks and communication windows. Using segmentation models capable of near-real-time data analysis onboard satellites can therefore improve response times. This study presents a proof-of-concept using MobileSAM, a lightweight, pre-trained segmentation model, onboard Unibap iX10-100 satellite hardware. We demonstrate the segmentation of water bodies from Sentinel-2 satellite imagery and integrate MobileSAM with PASEOS, an open-source Python module that simulates satellite operations. This integration allows us to evaluate MobileSAM's performance under simulated conditions of a satellite constellation. Our research investigates the potential of fine-tuning MobileSAM in a decentralised way onboard multiple satellites in rapid response to a disaster. Our findings show that MobileSAM can be rapidly fine-tuned and benefits from decentralised learning, considering the constraints imposed by the simulated orbital environment. We observe improvements in segmentation performance with minimal training data and fast fine-tuning when satellites frequently communicate model updates. This study contributes to the field of onboard AI by emphasising the benefits of decentralised learning and fine-tuning pre-trained models for rapid response scenarios. Our work builds on recent related research at a critical time; as extreme weather events increase in frequency and magnitude, rapid response with onboard data analysis is essential.

Ngo, Khac-Hoang, Johan Östman, Giuseppe Durisi, and Alexandre Graell i Amat. "Secure Aggregation is Not Private Against Membership Inference Attacks." in ECML PKDD (2024).

Abstract

Secure aggregation (SecAgg) is a commonly-used privacy-enhancing mechanism in federated learning, affording the server access only to the aggregate of model updates while safeguarding the confidentiality of individual updates. Despite widespread claims regarding SecAgg’s privacy-preserving capabilities, a formal analysis of its privacy is lacking, making such presumptions unjustified. In this paper, we delve into the privacy implications of SecAgg by treating it as a local differential privacy (LDP) mechanism for each local update. We design a simple attack wherein an adversarial server seeks to discern which update vector a client submitted, out of two possible ones, in a single training round of federated learning under SecAgg. By conducting privacy auditing, we assess the success probability of this attack and quantify the LDP guarantees provided by SecAgg. Our numerical results unveil that, contrary to prevailing claims, SecAgg offers weak privacy against membership inference attacks even in a single training round. Indeed, it is difficult to hide a local update by adding other independent local updates when the updates are of high dimension. Our findings underscore the imperative for additional privacy-enhancing mechanisms, such as noise injection, in federated learning.
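
The auditing logic rests on the standard hypothesis-testing view of differential privacy: an ε-LDP mechanism forces TPR ≤ e^ε · FPR for any distinguishing attack, so measured attack rates yield an empirical lower bound on ε. A minimal sketch with stand-in numbers (not the paper's measurements):

    # Empirical epsilon lower bound from attack true/false-positive rates.
    import math

    tpr, fpr = 0.9, 0.2                          # stand-in attack rates
    eps_lower_bound = math.log(tpr / fpr)        # from TPR <= exp(eps) * FPR
    print(f"empirical epsilon >= {eps_lower_bound:.2f}")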

“Tackling the Satellite Downlink Bottleneck with Federated Onboard Learning of Image Compression”. Pablo Gomez and Gabriele Meoni. In: CVPR Workshop 2024.

Abstract

Satellite data transmission is a crucial bottleneck for Earth observation applications. To overcome this problem, we propose a novel solution that trains a neural network on board multiple satellites to compress raw data and only send down heavily compressed previews of the images, while retaining the possibility of sending down selected losslessly compressed data. The neural network learns to encode and decode the data in an unsupervised fashion using distributed machine learning. By simulating and optimizing the learning process under realistic constraints such as thermal, power, and communication limitations, we demonstrate the feasibility and effectiveness of our approach. For this, we model a constellation of three satellites in a Sun-synchronous orbit. We use real raw, multispectral data from Sentinel-2 and demonstrate the feasibility on space-proven hardware for the training. Our compression method outperforms JPEG compression on different image metrics, achieving better compression ratios and image quality. We report key performance indicators of our method, such as image quality, compression ratio, and benchmark training time on a Unibap iX10-100 processor. Our method has the potential to significantly increase the amount of satellite data collected that would typically be discarded (e.g., over oceans) and can potentially be extended to other applications, even outside Earth observation. All code and data of the method are available online to enable rapid application of this approach.
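
A tiny convolutional autoencoder of the kind the abstract describes training onboard might look as follows; the architecture, channel counts, and the 13-band Sentinel-2 input are illustrative assumptions, not the paper's network.

    # Hedged sketch: unsupervised onboard image compression via an autoencoder.
    import torch
    import torch.nn as nn

    class TinyCompressor(nn.Module):
        def __init__(self):
            super().__init__()
            self.encoder = nn.Sequential(          # 13 spectral bands -> compact code
                nn.Conv2d(13, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 8, 3, stride=2, padding=1),
            )
            self.decoder = nn.Sequential(          # reconstruct the full-band patch
                nn.ConvTranspose2d(8, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(32, 13, 4, stride=2, padding=1),
            )

        def forward(self, x):
            return self.decoder(self.encoder(x))

    x = torch.randn(1, 13, 64, 64)                 # stand-in multispectral patch
    model = TinyCompressor()
    loss = nn.functional.mse_loss(model(x), x)     # unsupervised reconstruction objective
    loss.backward()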

Mikołajczyk, A., Majchrowska, S., Carrasco Limeros, S. (2022). The (de)biasing Effect of GAN-Based Augmentation Methods on Skin Lesion Images. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13438. Springer, Cham.

Abstract

New medical datasets are now more open to the public, allowing for better and more extensive research. Although prepared with the utmost care, new datasets might still be a source of spurious correlations that affect the learning process. Moreover, data collections are usually not large enough and are often unbalanced. One approach to alleviate the data imbalance is using data augmentation with Generative Adversarial Networks (GANs) to extend the dataset with high-quality images. GANs are usually trained on the same biased datasets as the target data, resulting in more biased instances. This work explored unconditional and conditional GANs to compare their bias inheritance and how the synthetic data influenced the models. We provided extensive manual data annotation of possibly biasing artifacts on the well-known ISIC dataset with skin lesions. In addition, we examined classification models trained on both real and synthetic data with counterfactual bias explanations. Our experiments showed that GANs inherited biases and sometimes even amplified them, leading to even stronger spurious correlations. Manual data annotation and synthetic images are publicly available for reproducible scientific research.

Limeros, S. C., Majchrowska, S., Johnander, J., Petersson, C., Sotelo, M. Á., & Llorca, D. F. (2024). Towards trustworthy multi-modal motion prediction: Holistic evaluation and interpretability of outputs. CAAI Transactions on Intelligence Technology, 9(3), 557-572.

Abstract

Predicting the motion of other road agents enables autonomous vehicles to perform safe and efficient path planning. This task is very complex, as the behaviour of road agents depends on many factors and the number of possible future trajectories can be considerable (multi-modal). Most prior approaches proposed to address multi-modal motion prediction are based on complex machine learning systems that have limited interpretability. Moreover, the metrics used in current benchmarks do not evaluate all aspects of the problem, such as the diversity and admissibility of the output. The authors aim to advance towards the design of trustworthy motion prediction systems, based on some of the requirements for the design of Trustworthy Artificial Intelligence. The focus is on evaluation criteria, robustness, and interpretability of outputs. First, the evaluation metrics are comprehensively analysed, the main gaps of current benchmarks are identified, and a new holistic evaluation framework is proposed. Then, a method for the assessment of spatial and temporal robustness is introduced by simulating noise in the perception system. To enhance the interpretability of the outputs and generate more balanced results in the proposed evaluation framework, an intent prediction layer that can be attached to multi-modal motion prediction models is proposed. The effectiveness of this approach is assessed through a survey that explores different elements in the visualisation of the multi-modal trajectories and intentions. The proposed approach and findings make a significant contribution to the development of trustworthy motion prediction systems for autonomous vehicles, advancing the field towards greater safety and reliability.

Limeros, S. C., Majchrowska, S., Johnander, J., Petersson, C., & Llorca, D. F. (2022). Towards Explainable Motion Prediction using Heterogeneous Graph Representations. ArXiv. https://arxiv.org/abs/2212.03806

Abstract

Motion prediction systems play a crucial role in enabling autonomous vehicles to navigate safely and efficiently in complex traffic scenarios. Graph Neural Network (GNN)-based approaches have emerged as a promising solution for capturing interactions among dynamic agents and static objects. However, they often lack transparency, interpretability and explainability — qualities that are essential for building trust in autonomous driving systems. In this work, we address this challenge by presenting a comprehensive approach to enhance the explainability of graph-based motion prediction systems. We introduce the Explainable Heterogeneous Graph-based Policy (XHGP) model based on a heterogeneous graph representation of the traffic scene and lane-graph traversals. Distinct from other graph-based models, XHGP leverages object-level and type-level attention mechanisms to learn interaction behaviors, providing information about the importance of agents and interactions in the scene. In addition, capitalizing on XHGP’s architecture, we investigate the explanations provided by the GNNExplainer and apply counterfactual reasoning to analyze the sensitivity of the model to modifications of the input data. This includes masking scene elements, altering trajectories, and adding or removing dynamic agents. Our proposal advances towards achieving reliable and explainable motion prediction systems, addressing the concerns of users, developers and regulatory agencies alike. The insights gained from our explainability analysis contribute to a better understanding of the relationships between dynamic and static elements in traffic scenarios, facilitating the interpretation of the results, as well as the correction of possible errors in motion prediction models, and thus contributing to the development of trustworthy motion prediction systems.
The code to reproduce this work is publicly available at https://github.com/sancarlim/Explainable-MP/tree/v1.1.

Majchrowska, S., Hildeman, A., Teare, P., & Diethe, T. (2023). Unlocking the Heart Using Adaptive Locked Agnostic Networks. ArXiv. https://arxiv.org/abs/2309.11899

Abstract

Supervised training of deep learning models for medical imaging applications requires a significant amount of labeled data. This is posing a challenge as the images are required to be annotated by medical professionals. To address this limitation, we introduce the Adaptive Locked Agnostic Network (ALAN), a concept involving self-supervised visual feature extraction using a large backbone model to produce anatomically robust semantic self-segmentation. In the ALAN methodology, this self-supervised training occurs only once on a large and diverse dataset. Due to the intuitive interpretability of the segmentation, downstream models tailored for specific tasks can be easily designed using white-box models with few parameters. This, in turn, opens up the possibility of communicating the inner workings of a model with domain experts and introducing prior knowledge into it. It also means that the downstream models become less data-hungry compared to fully supervised approaches. These characteristics make ALAN particularly well-suited for resource-scarce scenarios, such as costly clinical trials and rare diseases. In this paper, we apply the ALAN approach to three publicly available echocardiography datasets: EchoNet-Dynamic, CAMUS, and TMED-2. Our findings demonstrate that the self-supervised backbone model robustly identifies anatomical subregions of the heart in an apical four-chamber view. Building upon this, we design two downstream models, one for segmenting a target anatomical region, and a second for echocardiogram view classification.

Yacob, F., Siarov, J., Villiamsson, K., Suvilehto, J. T., Sjöblom, L., Kjellberg, M., & Neittaanmäki, N. (2023). Weakly supervised detection and classification of basal cell carcinoma using graph-transformer on whole slide images. Scientific Reports, 13(1), 1-10. https://doi.org/10.1038/s41598-023-33863-z

Abstract

The high incidence rates of basal cell carcinoma (BCC) cause a significant burden at pathology laboratories. The standard diagnostic process is time-consuming and prone to inter-pathologist variability. Despite the application of deep learning approaches in grading of other cancer types, there is limited literature on the application of vision transformers to BCC on whole slide images (WSIs). A total of 1832 WSIs from 479 BCCs, divided into training and validation (1435 WSIs from 369 BCCs) and testing (397 WSIs from 110 BCCs) sets, were weakly annotated into four aggressivity subtypes. We used a combination of a graph neural network and vision transformer to (1) detect the presence of tumor (two classes), (2) classify the tumor into low and high-risk subtypes (three classes), and (3) classify four aggressivity subtypes (five classes). Using an ensemble model comprised of the models from cross-validation, accuracies of 93.5%, 86.4%, and 72% were achieved on two, three, and five class classifications, respectively. These results show high accuracy in both tumor detection and grading of BCCs. The use of automated WSI analysis could increase workflow efficiency.


NATURAL LANGUAGE UNDERSTANDING (NLU)

Lenci, A., Sahlgren, M., Jeuniaux, P. et al. A comparative evaluation and analysis of three generations of Distributional Semantic Models. Lang Resources & Evaluation 56, 1269–1313 (2022).

Abstract

Distributional semantics has deeply changed in the last decades. First, predict models stole the thunder from traditional count ones, and more recently both of them were replaced in many NLP applications by contextualized vectors produced by neural language models. Although an extensive body of research has been devoted to Distributional Semantic Model (DSM) evaluation, we still lack a thorough comparison with respect to tested models, semantic tasks, and benchmark datasets. Moreover, previous work has mostly focused on task-driven evaluation, instead of exploring the differences between the way models represent the lexical semantic space. In this paper, we perform a large-scale evaluation of type distributional vectors, either produced by static DSMs or obtained by averaging the contextualized vectors generated by BERT. First of all, we investigate the performance of embeddings in several semantic tasks, carrying out an in-depth statistical analysis to identify the major factors influencing the behavior of DSMs. The results show that (i) the alleged superiority of predict based models is more apparent than real, and surely not ubiquitous and (ii) static DSMs surpass BERT representations in most out-of-context semantic tasks and datasets. Furthermore, we borrow from cognitive neuroscience the methodology of Representational Similarity Analysis (RSA) to inspect the semantic spaces generated by distributional models. RSA reveals important differences related to the frequency and part-of-speech of lexical items.
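
Concretely, a "type" vector for a word can be obtained from BERT by averaging its contextualized vectors across occurrences; a minimal sketch using the Hugging Face transformers library follows (the model name and example sentences are our placeholders).

    # Hedged sketch: averaging BERT's contextualized vectors into a type vector.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased").eval()

    contexts = ["the bank approved the loan", "she sat on the river bank"]
    vecs = []
    with torch.no_grad():
        for sent in contexts:
            enc = tok(sent, return_tensors="pt")
            hidden = model(**enc).last_hidden_state[0]   # (tokens, dim)
            idx = enc.input_ids[0].tolist().index(tok.convert_tokens_to_ids("bank"))
            vecs.append(hidden[idx])                     # vector of "bank" in this context

    type_vector = torch.stack(vecs).mean(dim=0)          # static vector for the type "bank"
    print(type_vector.shape)                             # torch.Size([768])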

Fredrik Carlsson, Philipp Eisen, Faton Rekathati, and Magnus Sahlgren. 2022. Cross-lingual and Multilingual CLIP. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 6848–6854, Marseille, France. European Language Resources Association.

Abstract

The long-standing endeavor of relating the textual and the visual domain recently underwent a pivotal breakthrough, as OpenAI released CLIP. This model distinguishes how well an English text corresponds with a given image with unprecedented accuracy. Trained via a contrastive learning objective over a huge dataset of 400M images and captions, it is a work that is not easily replicated, especially for low-resource languages. Capitalizing on the modularization of the CLIP architecture, we propose to use cross-lingual teacher learning to re-train the textual encoder for various non-English languages. Our method requires no image data and relies entirely on machine translation, which removes the need for data in the target language. We find that our method can efficiently train a new textual encoder with relatively low computational cost, whilst still outperforming previous baselines on multilingual image-text retrieval.
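
The teacher-learning recipe can be sketched as regressing a student encoder for the new language onto the frozen CLIP teacher's embeddings of the corresponding (machine-translated) English captions. Everything below (encoder architecture, dimensions, data) is an illustrative stand-in, not the released model.

    # Hedged sketch: cross-lingual teacher learning for a CLIP text encoder.
    import torch
    import torch.nn as nn

    class StudentTextEncoder(nn.Module):
        """Toy stand-in for a text encoder in the target language."""
        def __init__(self, vocab_size=30000, dim=512):
            super().__init__()
            self.emb = nn.EmbeddingBag(vocab_size, dim)  # mean over token embeddings
            self.proj = nn.Linear(dim, dim)

        def forward(self, token_ids):                    # token_ids: (batch, seq_len)
            return self.proj(self.emb(token_ids))

    student = StudentTextEncoder()
    opt = torch.optim.Adam(student.parameters(), lr=1e-4)

    # In the real recipe, teacher_emb comes from the frozen English CLIP text
    # encoder applied to the translated caption; here it is random.
    tokens = torch.randint(0, 30000, (8, 12))
    teacher_emb = torch.randn(8, 512)

    loss = nn.functional.mse_loss(student(tokens), teacher_emb)  # match the teacher
    loss.backward()
    opt.step()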

Sahlgren, M. (2024). Distributional Legacy: The Unreasonable Effectiveness of Harris’s Distributional Program. WORD, 70(4), 246–257. https://www.tandfonline.com/doi/full/10.1080/00437956.2024.2414515

Abstract

This paper gives an overview of the influence that Zellig Harris’s paper “Distributional structure” has had on the research area of distributional semantics, a subfield of natural language processing. We trace the development of the distributional paradigm through three generations of distributional semantics models, arriving at the large language models that currently are at the forefront of public awareness on AI, and that constitute the driving force in the current AI trend. We touch upon the discussion whether the hype around large language models is warranted or not, and we argue that much of the current (philosophical) discussion around the epistemology of distributional models can be resolved by recalling the main arguments in “Distributional structure”.

Alessandro Lenci, Magnus Sahlgren. 2023. Distributional semantics. Cambridge University Press.

Book description

Distributional semantics develops theories and methods to represent the meaning of natural language expressions, with vectors encoding their statistical distribution in linguistic contexts. It is at once a theoretical model to express meaning, a practical methodology to construct semantic representations, a computational framework for acquiring meaning from language data, and a cognitive hypothesis about the role of language usage in shaping meaning. This book aims to build a common understanding of the theoretical and methodological foundations of distributional semantics. Beginning with its historical origins, the text exemplifies how the distributional approach is implemented in distributional semantic models. The main types of computational models, including modern deep learning ones, are described and evaluated, demonstrating how various types of semantic issues are addressed by those models. Open problems and challenges are also analyzed. Students and researchers in natural language processing, artificial intelligence, and cognitive science will appreciate this book.

Fredrik Carlsson, Magnus Sahlgren, Fredrik Olsson, and Amaru Cuba Gyllensten. 2021. GANDALF: a General Character Name Description Dataset for Long Fiction. In Proceedings of the 3rd Workshop on Machine Reading for Question Answering, pages 119–132, Punta Cana, Dominican Republic. Association for Computational Linguistics.

Abstract

This paper introduces a long-range multiple-choice Question Answering (QA) dataset, based on full-length fiction book texts. The questions are formulated as 10-way multiple-choice questions, where the task is to select the correct character name given a character description, or vice-versa. Each character description is formulated in natural text and often contains information from several sections throughout the book. We provide 20,000 questions created from 10,000 manually annotated descriptions of characters from 177 books containing 152,917 words on average. We address the current discourse regarding dataset bias and leakage by a simple anonymization procedure, which in turn enables interesting probing possibilities. Finally, we show that suitable baseline algorithms perform very poorly on this task, with the book size itself making it non-trivial to attempt a Transformer-based QA solution. This leaves ample room for future improvement, and hints at the need for a completely different type of solution.

Magnus Sahlgren, Fredrik Carlsson, Fredrik Olsson, and Love Börjeson. 2021. It’s Basically the Same Language Anyway: the Case for a Nordic Language Model. In Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), pages 367–372, Reykjavik, Iceland (Online). Linköping University Electronic Press, Sweden.

Abstract

When is it beneficial for a research community to organize a broader collaborative effort on a topic, and when should we instead promote individual efforts? In this opinion piece, we argue that we are at a stage in the development of large-scale language models where a collaborative effort is desirable, despite the fact that the preconditions for making individual contributions have never been better. We consider a number of arguments for collaboratively developing a large-scale Nordic language model, including environmental considerations, cost, data availability, language typology, cultural similarity, and transparency. Our primary goal is to raise awareness and foster a discussion about our potential impact and responsibility as an NLP community.

Ariel Ekgren, Amaru Cuba Gyllensten, Evangelia Gogoulou, Alice Heiman, Severine Verlinden, Joey Öhman, Fredrik Carlsson, and Magnus Sahlgren. 2022. Lessons Learned from GPT-SW3: Building the First Large-Scale Generative Language Model for Swedish. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 3509–3518, Marseille, France. European Language Resources Association.

Abstract

We present GPT-SW3, a 3.5 billion parameter autoregressive language model, trained on a newly created 100 GB Swedish corpus. This paper provides insights with regard to data collection and training, while highlighting the challenges of proper model evaluation. The results of quantitative evaluation through perplexity indicate that GPT-SW3 is a competent model in comparison with existing autoregressive models of similar size. Additionally, we perform an extensive prompting study which reveals the good text generation capabilities of GPT-SW3.
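
Perplexity, the quantitative measure used here, is the exponentiated average negative log-likelihood per token; with stand-in numbers:

    # Perplexity from per-token log-probabilities (toy values, not GPT-SW3 outputs).
    import math

    token_logprobs = [-2.1, -0.3, -4.0, -1.2, -0.8]      # stand-in model log-probs
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    print(f"perplexity: {math.exp(avg_nll):.2f}")        # lower is better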

Felix Stollenwerk. 2023. nerblackbox: A High-level Library for Named Entity Recognition in Python. Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023).

Abstract

We present nerblackbox, a Python library to facilitate the use of state-of-the-art transformer-based models for named entity recognition. It provides simple-to-use yet powerful methods to access data and models from a wide range of sources, for fully automated model training and evaluation as well as versatile model inference. While many technical challenges are solved and hidden from the user by default, nerblackbox also offers fine-grained control and a rich set of customizable features. It is thus targeted both at application-oriented developers as well as machine learning experts and researchers.

Aleksandrs Berdičevskis, Gerlof Bouma, Robin Kurtz, Felix Morger, Joey Öhman, Yvonne Adesam, Lars Borin, Dana Dannélls, Markus Forsberg, Tim Isbister, Anna Lindahl, Martin Malmsten, Faton Rekathati, Magnus Sahlgren, Elena Volodina, Love Börjeson, Simon Hengchen, Nina Tahmasebi. 2023. Superlim: A Swedish Language Understanding Evaluation Benchmark. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP).

Abstract

We present Superlim, a multi-task NLP benchmark and analysis platform for evaluating Swedish language models, a counterpart to the English-language (Super)GLUE suite. We describe the dataset, the tasks, the leaderboard and report the baseline results yielded by a reference implementation. The tested models do not approach ceiling performance on any of the tasks, which suggests that Superlim is truly difficult, a desirable quality for a benchmark. We address methodological challenges, such as mitigating the Anglocentric bias when creating datasets for a less-resourced language; choosing the most appropriate measures; documenting the datasets and making the leaderboard convenient and transparent. We also highlight other potential usages of the dataset, such as, for instance, the evaluation of cross-lingual transfer learning.

Sahlgren M and Carlsson F (2021) The Singleton Fallacy: Why Current Critiques of Language Models Miss the Point. Front. Artif. Intell. 4:682578. doi: 10.3389/frai.2021.682578

Abstract

This paper discusses the current critique against neural network-based Natural Language Understanding (NLU) solutions known as language models. We argue that much of the current debate rests on an argumentation error that we will refer to as the singleton fallacy: the assumption that language, meaning, and understanding are single and uniform phenomena that are unobtainable by (current) language models. By contrast, we will argue that there are many different types of language use, meaning, and understanding, and that (current) language models are built with the explicit purpose of acquiring and representing one type of structural understanding of language. We will argue that such structural understanding may cover several different modalities, and as such can handle several different types of meaning. Our position is that we currently see no theoretical reason why such structural knowledge would be insufficient to count as "real" understanding.

“Towards Holistic Disease Risk Prediction using Small Language Models”. Liv Björkdahl, Oskar Pauli, Johan Östman, Chiara Ceccobello, Sara Lundell, Magnus Kjellberg. In: ICMLA, 2024.

Abstract

Data in the healthcare domain arise from a variety of sources and modalities, such as x-ray images, continuous measurements, and clinical notes. Medical practitioners integrate these diverse data types daily to make informed and accurate decisions. With recent advancements in language models capable of handling multimodal data, it is a logical progression to apply these models to the healthcare sector. In this work, we introduce a framework that connects small language models to multiple data sources, aiming to predict the risk of various diseases simultaneously. Our experiments encompass 12 different tasks within a multitask learning setup. Although our approach does not surpass state-of-the-art methods specialized for single tasks, it demonstrates competitive performance and underscores the potential of small language models for multimodal reasoning in healthcare.

Continue exploring


AI Labs

At AI Labs, we shape the future of artificial intelligence (AI). Here, our partners collaborate on developing their capabilities while building value-creating AI solutions and infrastructure for creating AI.

Our current focus areas are Natural language understanding (language technologies), Edge Learning (mobility, health, space, infrastructure), and AI security.

Ongoing projects


AI for Impact

A call to social entrepreneurs and non-profit organizations. With funding from Google.org, AI Sweden runs the AI for Impact project, which aims to strengthen Swedish social entrepreneurs and...

AI i fokus - competence development for folk high schools

The project aims to strengthen employees' knowledge of artificial intelligence, conflict management, and media and information literacy (MIK).

AI-driven honeypots

Honeypots are computers that mimic real computer systems in order to lure attackers and reveal their tools and methods. Can AI improve honeypots' ability to deceive attackers and thereby strengthen...

Data-driven organizations - Best practices for AI implementation in Sweden

How can an organization become truly data-driven? That is what AI Sweden's new MLOps project, formally named Data-driven organizations - Best practices for operationalization of AI in Sweden, aims to...

DeployAI

The DeployAI project will build, deploy, and launch a fully operational AI-on-demand platform (AIoDP) that promotes trustworthy, ethical, and transparent European AI solutions for use in...

Digital Forensics Sweden

AI Sweden is collaborating with the Digital Forensics Sweden network to investigate how AI can be used in digital forensics.

Digital competence development, City of Stockholm

Digital development is accelerating rapidly and affects society as a whole. As competence requirements and roles change, continuous skills development is needed to remain competitive and drive...

A shared digital assistant for the public sector

This is a collaboration between Swedish government agencies, municipalities, regions, and industry, coordinated by AI Sweden. The aim is to promote national collaboration on AI for text tasks and to create...

EuroLingua-GPT

AI Sweden and the Fraunhofer Institute for Intelligent Analysis and Information Systems are jointly training a new large language model covering all the official European languages.

Federated Fleet Learning

As rules and regulations around data sharing, security, and storage change, current approaches to model training are expected to face growing challenges. The goal of this project is to...

Federated machine learning in the banking sector

Money laundering poses a significant threat to society, as it enables criminals to exploit illicit funds, undermines public trust, and damages the financial system. To...

FormAI

FormAI explores the use of generative AI in combination with formal verification to develop safe and reliable software in the automotive domain.

LeakPro: Leakage and risk assessment of machine learning models

Several studies have highlighted the possibility of extracting data from trained machine learning models. However, these demonstrations are typically carried out under idealized conditions, and it is unclear whether the risk...

Multimodal language model

AI Sweden's language team is now taking the next big step by starting the development of Sweden's first large multimodal language model. Like GPT-SW3, the new model is expected to become an important national...

Next generation infrastructure

Within the Next Generation Infrastructure project, AI Sweden is developing the next generation of infrastructure for training, deploying, and iteratively improving foundation models by addressing...

Precision forestry

The concept of precision forestry involves using high-resolution data to make precise decisions based on data from digital twins of forests at the level of individual trees. The goal of the project is to...

Swedish Space Data Lab 3.0

Space data can be used for weather forecasting and for monitoring the climate, forestry, and agriculture. The need to analyze, make available, and coordinate such data is steadily increasing. The Swedish Space Data Lab (Swedish Space...

TrustLLM

TrustLLM will develop European large language models (LLMs) at an unprecedented scale, trained on the largest body of text to date in European AI, covering a range of...

Training packages, Healthcare (VGR)

The project, whose official name is 'Utbildningspaket med olika nivåer - från grundnivå till avancerad kunskap om AI. För stärkt kompetensförsörjning i Västra Götalandsregionen' (training packages at different levels, from basic to advanced knowledge of AI, to strengthen the supply of skills in Region Västra Götaland), aims to strengthen...