
How AI Sweden’s Data Factory enabled data scientists to leapfrog the drug production process

Monday, December 14, 2020

In fall 2020, AI Sweden and the biopharmaceutical company AstraZeneca challenged machine learning experts to find a solution that could avoid toxic preprocessing of cell cultures when testing new drugs. The winning solution and its algorithm will be used to help AstraZeneca accelerate the drug development process. But the foundation of it all, which allowed AstraZeneca to make its data available for the challenge, was AI Sweden's Data Factory. By meeting the necessary legal requirements and providing world-class compute power, the Data Factory was a key factor behind the success of the challenge as a whole.

The Data Factory provides a legal and technical framework for AI Sweden's partners to donate and gain access to data, as well as to use storage and compute power for AI projects. The data that the teams analyzed for the Adipocyte Cell Imaging Challenge was donated by AstraZeneca and consisted of a multitude of cell images.

One tool commonly used to develop systems for targeted drugs is high-resolution microscopy. Changes in cell structures are analyzed to assess the effect of different types of drugs. For the microscope to distinguish between different structures in the cell images, the cells must first be preprocessed. This is both time-consuming and costly. In addition, the preprocessing can affect the cells, which interferes with the end result and makes it difficult to collect reliable data over time, for example when studying the effect of drug intake over a longer period.

This is where machine learning and the Adipocyte Cell Imaging Challenge come into the picture. Eight teams, with Swedish and international participants from both academia and the private sector, took part in the challenge. The task was to use machine learning to accelerate the drug development process by sidestepping the need to preprocess the cells.

“We at AstraZeneca will be able to benefit from the results from the Adipocyte Cell Imaging Challenge immediately. Our partnership with AI Sweden and the teams’ contributions have opened up for both new ideas and new collaborations. It will help us leapfrog the production process, increase our capacity and potentially bring new drugs to the market even quicker.” -Anders Holmén, Vice President and Head of Pharmaceutical Sciences IMED, AstraZeneca

Alan Sabirsh, Principal Scientist at AstraZeneca, explained that one of the driving factors behind AstraZeneca's decision to share its data with AI Sweden and the Data Factory for the hackathon was that the success of AI solutions is often proportional to the quality of the data. Finding efficient and innovative ways to work with data and increase its quality is therefore key.

The winning team, the HASTE team, consisted of researchers from Uppsala University and was selected by a jury of representatives from AstraZeneca, Vinnova and AI Sweden. Their solution uses machine learning to analyze grayscale images of cells that have not been preprocessed. The approach collects data points to build an overview of the cell structure, while specifically searching for information about the cell nucleus, so-called privileged information. Thanks to this, it is possible to skip the preprocessing of cells and immediately gain an understanding of how the cell is structured and how it reacts to new drugs.
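For readers who want a feel for what "privileged information" can mean in practice, here is a minimal sketch in Python. It is an illustrative assumption, not the HASTE team's implementation: it shows one common way to use information that exists only during training, namely adding an extra loss term for it. The function names and the `priv_weight` parameter are hypothetical.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences of numbers."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def training_loss(pred_main, target_main, pred_priv, target_priv, priv_weight=0.5):
    """Loss used during training, when privileged targets are available.

    Hypothetical sketch: priv_weight controls how strongly the
    training-only (privileged) signal influences the model.
    """
    # Main task: what the model must predict from the unprocessed image.
    main_term = mse(pred_main, target_main)
    # Privileged term: extra information available for training images only.
    priv_term = mse(pred_priv, target_priv)
    return main_term + priv_weight * priv_term


def inference_loss(pred_main, target_main):
    """Evaluation at test time: no privileged information is required."""
    return mse(pred_main, target_main)
```

The point of the design is that the extra signal only shapes training. Once the model is trained, it needs nothing but the unprocessed image as input.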

“The main advantage that Data Factory offered us was its ability to train models with a much higher speed. Running on the A100 GPU approximately halved our training time and helped us try novel ideas quickly. Working with Data Factory was a true privilege for us as machine learning researchers. As dataset sizes and model capacities grow larger, Data Factory becomes more and more relevant to use for any data scientist.” -Ankit Gupta, Uppsala University and member of the winning HASTE team

Sheetal Reddy, data scientist at AI Sweden and one of the organisers behind the challenge, underlined that the Data Factory provides the best GPU available on the market. She explained that it is container based, so partners can bring their own packages and still be supported by the Data Factory. “Data Factory’s compute power is state of the art and ensures stability and reliability when running extensive deep learning models, allowing partners to experiment fast,” she said. She stressed that the solutions from all teams were excellent and that it is great to now see the potential for further collaboration.

AstraZeneca echoed the importance of partnership, and Alan Sabirsh emphasized that the company looks forward to collaborating with several of the participating teams going forward. “The Adipocyte Cell Imaging Challenge has given us new avenues for collaboration and we will implement the solutions in our research - it was really impressive,” Sabirsh said.

The data that the teams analyzed for the Adipocyte Cell Imaging Challenge is now available to AI Sweden's partners, and several of the teams will continue to collaborate and learn from each other’s solutions while working in the Data Factory. The code and reports from the hackathon are also available as open source on GitHub.

Read more about the Adipocyte Cell Imaging Challenge.

Listen to Alan Sabirsh introduce the Adipocyte Cell Imaging Challenge in more detail during the AI Sweden Partner Week.

Contact Ebba Josefson Lindqvist to get involved or learn more about the Data Factory!

Project Manager Data Factory

Ebba Josefson Lindqvist

+46 (0)73-254 29 03