

Good data is a key element in the development of AI. AI Sweden and the Data Factory offer partners a number of suitable datasets, and new datasets are added continuously.

Our datasets consist of different categories of data and can be used for different purposes. Most of the datasets are annotated and you will find more information about each dataset below.


Zenseact Edge AnnotationZ Dataset

This dataset is provided by Zenseact with the intent of bringing real-world challenges to academia and startups. It consists of 6,666 image sequences captured by Zenseact developmental vehicles and...

VAMLAV Dataset

The VAMLAV dataset is a large autonomous driving (AD) dataset, created by Zenseact with support from AstaZero and RISE. It was collected over a 2-year period on the rural road at the AstaZero test track...

SMIRK Dataset

The dataset contains 4,928 scenarios of pedestrians crossing, or moving close to, a straight road at different speeds and trajectories in relation to the camera. Moreover, analogous scenarios of basic...

Highway Dataset

The dataset was created to support the development of autonomous vehicles (AVs). Volvo Autonomous Solutions, part of Volvo Group, has collected several highway scenes in the Gothenburg region. The dataset includes...

Baltic Seabird Dataset

The dataset consists of 2,000 hours of video footage of guillemots on a ledge on Stora Karlsö, Sweden. Using AI, SLU has used the dataset to study and automate documentation of the birds' behaviors...

Adipocyte Cell Imaging Dataset

The dataset consists of images of adipocytes (fat cells) captured with transmission light microscopy and stored as TIF files. There are three sets of images corresponding to three different...

AI Sweden requires that any shared data comply with all applicable regulations. We work continuously on legal questions related to data sharing, such as GDPR (General Data Protection Regulation), to address compliance requirements. Read more about the work that is being done related to legal questions here.

Datasets donated or licensed to the Data Factory by our partners are made available in accordance with the terms and conditions for each dataset, which you will find alongside the dataset. A couple of our datasets have their own terms and conditions, and some are released under Creative Commons licenses. These licenses allow others to use, share, and build upon the creator's work, but under certain conditions. There are several types of Creative Commons licenses, each with its own set of conditions. The most common types are:

  • Attribution (CC BY): This license allows others to use the work, even for commercial purposes, as long as they give credit to the original creator.
  • Attribution-ShareAlike (CC BY-SA): This license is similar to the Attribution license, but it also requires that any derivative works be licensed under the same terms.
  • Attribution-NonCommercial (CC BY-NC): This license allows others to use the work, as long as it is not for commercial purposes, and credit is given to the original creator.
  • Attribution-NoDerivs (CC BY-ND): This license allows others to use the work, as long as it is not changed in any way and credit is given to the original creator.
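
As a rough illustration, the conditions listed above can be written down as a small lookup table and checked against an intended use. This is a simplified sketch based only on the summaries in the list; the names `LicenseTerms`, `CC_LICENSES`, and `check_intended_use` are hypothetical and not part of any AI Sweden or Creative Commons tooling, and the full license text always takes precedence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenseTerms:
    """Simplified conditions of a license, as summarized in the list above."""
    attribution_required: bool
    commercial_use_allowed: bool
    changes_allowed: bool
    share_alike_required: bool


# Hypothetical lookup table; it encodes only the four license types listed above.
CC_LICENSES = {
    "CC BY": LicenseTerms(True, True, True, False),
    "CC BY-SA": LicenseTerms(True, True, True, True),
    "CC BY-NC": LicenseTerms(True, False, True, False),
    "CC BY-ND": LicenseTerms(True, True, False, False),
}


def check_intended_use(license_id: str, commercial: bool, modifies_data: bool) -> list[str]:
    """Return the reasons an intended use conflicts with the license (empty if none)."""
    terms = CC_LICENSES[license_id]
    problems = []
    if commercial and not terms.commercial_use_allowed:
        problems.append("commercial use is not permitted")
    if modifies_data and not terms.changes_allowed:
        problems.append("the work may not be changed")
    return problems
```

For example, `check_intended_use("CC BY-NC", commercial=True, modifies_data=False)` returns a one-item list naming the commercial-use restriction, while the same call with `"CC BY"` returns an empty list. All four licenses require attribution, so credit to the original creator is needed in every case.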

When using any of the datasets licensed under a Creative Commons license, it is important to make sure that you understand and comply with the terms of the license. If you are looking for more information about Creative Commons licenses, please consult the document below.

Guide for Creative Commons license


Become a partner and engage in the Data Factory

The datasets are available to all AI Sweden partners. Contact Chiara Ceccobello for instructions on how to access them. If you are interested in becoming a partner of AI Sweden, getting access to partner benefits such as the Data Factory and its datasets, or sharing a dataset or a model, please feel free to reach out.

Chiara Ceccobello
Data Scientist