SMHI IFCB plankton image reference library

Posted on 2024-06-05, 07:30. Authored by Anders Torstensson, Ann-Turi Skjevik, Malin Mohlin, Maria Karlberg, Bengt Karlson

This repository includes three datasets of plankton images manually annotated by phytoplankton experts at the Swedish Meteorological and Hydrological Institute (SMHI). The images can be used to train automatic image classifiers that identify plankton species. They were captured with an Imaging FlowCytobot (IFCB, McLane Research Laboratories) at different locations and in different seasons in the Skagerrak, Kattegat, and Baltic Proper. The three datasets are as follows:

  1. smhi_ifcb_svea_baltic_proper: Images were gathered during monthly monitoring cruises from 2022 to 2024, utilizing an IFCB mounted as part of the underway FerryBox system on the R/V Svea. This collection consists of 22,805 annotated images across 57 different classes.
  2. smhi_ifcb_svea_skagerrak_kattegat: Images were also collected during the regular monitoring cruises from 2022 to 2024. This archive comprises 5,086 annotated images from 83 distinct classes.
  3. smhi_ifcb_tångesund: In 2016, the IFCB was deployed in situ at depths between 3 and 18 meters, near a mussel farm in Tångesund, Mollösund (Skagerrak). This dataset contains 43,634 annotated images from 33 different classes.

Each dataset comprises two zip archives: one (annotated_images) containing .png images organized into subfolders for each class, and another (matlab_files) including raw data files (.roi, .hdr, .adc) and .mat-files for developing a random forest image classifier using the MATLAB code available at

The images in this dataset undergo continuous quality control, and new images are regularly added. Consequently, this dataset will be updated on a regular basis. If you find any mislabeled images, please contact the authors.
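
Because the annotated_images archives store the .png files in one subfolder per class, the class distribution of a dataset can be checked with a few lines of Python before training. The following is a minimal sketch, not part of the published code: it assumes the archive has already been extracted and that the extracted root directory directly contains the class subfolders. The path "annotated_images" and the function name count_images_per_class are hypothetical.

```python
from collections import Counter
from pathlib import Path


def count_images_per_class(root):
    """Return {class_name: number_of_png_files} for a class-per-subfolder layout.

    Assumption: `root` directly contains one subfolder per class.
    """
    counts = Counter()
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files next to the class folders
        # Count .png files directly inside the class subfolder.
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() == ".png"
        )
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical location of an extracted annotated_images archive.
    root = Path("annotated_images")
    if root.is_dir():
        counts = count_images_per_class(root)
        # List classes from most to least frequent.
        for name, n in sorted(counts.items(), key=lambda kv: -kv[1]):
            print(f"{name}: {n}")
        print(f"{len(counts)} classes, {sum(counts.values())} images")
```

Comparing the printed totals with the image and class counts listed for each dataset above is a quick way to confirm that an archive was downloaded and extracted completely.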

Version history

  • Version 2 (2024-06-03): 71,525 annotated images. Updated class names and corrected manual files in the Tångesund dataset. Continued quality control of images in the Tångesund dataset.
  • Version 1 (2024-05-31): 65,435 annotated images


DTO-Bioflow FSTP

Swedish Biodiversity Data Infrastructure

Swedish Research Council


Joint European Research Infrastructure of Coastal Observatories: Science, Service, Sustainability - JERICO-S3



Swedish Meteorological and Hydrological Institute (SMHI)