Single-cell morphological profiling reveals insights into cell death
Dataset description:
Organization of files:
1) Features: features.tar.gz
- singlecell_features_CellProfiler.parquet: This file contains the single-cell profiles extracted with CellProfiler used for the analysis in this publication. Features are normalised and filtered according to Fig. 1B, C in the paper.
- singlecell_features_DeepProfiler.parquet: This file contains the single-cell profiles extracted with DeepProfiler used for the analysis in this publication. Features are normalised and filtered according to Fig. 1B, C in the paper.
- singlecell_features_DINO.parquet: This file contains the single-cell profiles extracted with DINO used for the analysis in this publication. Features are normalised and filtered according to Fig. 1B, C in the paper (number of profiles here is smaller than in the other two approaches as outlined in the paper).
- aggregate_profiles_DINO_adjusted.parquet, aggregate_profiles_CP_aggregated.parquet, aggregated_profiles_DP_adjusted.parquet: Aggregated profile parquets for the three feature extractors.
2) Metadata: metadata_celldeath_paper.csv. This file contains the metadata used in the original Cell Painting experiment. It contains plate, well, site (field of view), compounds, and moa, as well as the concentrations used and the treatment conditions.
3) Grit scores: grit_scores.tar.gz. This compressed archive contains the grit scores for the compound concentrations for all three feature extractors. Scores are provided for all compound concentrations for which grit could be computed. One file each for CellProfiler, DeepProfiler, and DINO.
4) E-distance: edistance.tar.gz. This compressed archive contains the e-distances and e-test results for the compound concentrations for all three feature extractors. Results are provided for all compound concentrations for which they could be computed. One file each for CellProfiler, DeepProfiler, and DINO. The file names indicate the number of samples and permutations used in the permutation test.
5) Splits: splits.tar.gz. This compressed archive contains the splits used in the supervised model training. One file each for CellProfiler, DeepProfiler, and DINO. As described in the paper, splits were performed based on the wells of plates. Each file contains moa, compound, plate, and well, as well as the split and the fraction of cells in each well.
6) map: map.tar.gz. This compressed archive contains map results for the compound concentrations for all three feature extractors. Results are provided for all compound concentrations for which they could be computed. One file each for CellProfiler, DeepProfiler, and DINO. The file names indicate the number of samples and permutations used in the permutation test.
7) Embeddings: embeddings.tar.gz contains AnnData files (.h5ad) with the calculated single-cell embeddings for CellProfiler, DeepProfiler, and DINO features. Each AnnData file contains features and metadata as well as the calculated UMAP and PCA embeddings used in the figures. The archive additionally contains one AnnData file with the DINO apoptosis embedding used to generate Fig. 3.
8) QC: qc_df.csv. This file contains the quality control flags used to filter out images, together with viability values calculated from cell counts.
9) Classification splits: classification_splits.tar.gz contains the specific datasets used to train the supervised models for DINO, CellProfiler, and DeepProfiler at both the aggregated and the single-cell level.
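Example of reading the files:
The archives listed above can be unpacked and read with standard Python tooling. The following is a minimal sketch, not part of the original data release. It assumes that pandas (with a parquet engine such as pyarrow) and anndata are installed; the helper names extract_archive, find_files, load_table, and load_embedding are illustrative, and the directory layout inside each archive is not documented here, so the code searches the extracted files by extension instead of assuming paths.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Unpack a .tar.gz archive and return the sorted paths of the extracted files."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, mode="r:gz") as tar:
        # Record regular files only; directories are created implicitly.
        names = [member.name for member in tar.getmembers() if member.isfile()]
        tar.extractall(dest)
    return sorted(dest / name for name in names)


def find_files(paths, suffix):
    """Keep only the paths with the given extension, e.g. '.parquet' or '.h5ad'."""
    return [path for path in paths if path.suffix == suffix]


def load_table(path):
    """Read a .parquet or .csv table into a pandas DataFrame."""
    import pandas as pd  # imported lazily so the helpers above work without pandas

    path = Path(path)
    if path.suffix == ".parquet":
        return pd.read_parquet(path)
    return pd.read_csv(path)


def load_embedding(path):
    """Read one .h5ad file into an AnnData object."""
    import anndata as ad  # imported lazily; only needed for embeddings.tar.gz

    return ad.read_h5ad(path)


# Hypothetical usage, assuming the archives sit in the working directory:
#   files = extract_archive("features.tar.gz", "data/features")
#   for parquet_path in find_files(files, ".parquet"):
#       profiles = load_table(parquet_path)
#       print(parquet_path.name, profiles.shape)
#   metadata = load_table("metadata_celldeath_paper.csv")
```

The single-cell parquet files may be large, so it can be worth loading one extractor at a time rather than all three at once.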
Publication:
The data in this repository supports the following publication: "Single-cell morphological profiling reveals insights into programmed cell death" by Frey et al.
Abstract:
Analysis at the single-cell level is a powerful approach to study biological processes and responses to perturbations. However, its application in morphological profiling with phenomics remains underexplored. Here, we use the Cell Painting assay to investigate morphological effects of 53 small molecule compounds, associated with six distinct cell death mechanisms, across six concentrations in MCF7 cells. To compare single-cell and aggregated analysis strategies, we conduct both supervised and unsupervised evaluations aimed at identifying features linked to programmed cell death. We apply an energy distance as a metric to quantify morphological perturbation strength, enabling efficient filtering. Among three tested feature extraction methods, self-supervised DINO embeddings applied to single-cell data captured high-resolution morphological patterns. Focused analyses of apoptosis-inducing compounds revealed biological heterogeneity attributable to specific molecular targets and concentration-dependent effects, which were not apparent in aggregated profiles. In contrast, multi-class classification models for the six programmed cell death mechanisms trained on single-cell features achieved F1 scores of 79.86%, while models trained on aggregated features reached F1 scores of up to 89.97%.
Our results highlight the advantages of single-cell data for unsupervised exploration and show that aggregated representations yield more robust and accurate performance in supervised models.
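Note on the energy distance:
The energy distance mentioned in the abstract compares two sets of profiles, for example cells from one treatment against a reference population. The sketch below assumes the standard definition, 2 E|X - Y| - E|X - X'| - E|Y - Y'| with Euclidean distances, and is written in plain Python for readability. The function names are illustrative, and this is not necessarily the estimator or implementation used to produce the files in edistance.tar.gz.

```python
from itertools import product
from math import dist


def mean_pairwise_distance(a, b):
    """Average Euclidean distance over all pairs (x, y) with x in a and y in b."""
    total = sum(dist(x, y) for x, y in product(a, b))
    return total / (len(a) * len(b))


def energy_distance(x, y):
    """Energy distance between two samples of equal-length feature vectors.

    Self-pairs are included in the within-sample terms (V-statistic form),
    so the result is zero when both samples are identical.
    """
    between = mean_pairwise_distance(x, y)
    within_x = mean_pairwise_distance(x, x)
    within_y = mean_pairwise_distance(y, y)
    return 2 * between - within_x - within_y


# Toy one-dimensional example with made-up values:
control = [(0.0,), (2.0,)]
treated = [(1.0,), (3.0,)]
print(energy_distance(control, treated))  # 1.0
```

This version evaluates every pair of cells, so its cost grows with the product of the two sample sizes; on real single-cell profiles, subsampling or a vectorised implementation would be needed.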
Funding
Enabling Systematic Phenotypic Cell Profiling in Safety Pharmacology
Swedish Research Council
Swedish Research Council 2024-04576
Swedish Research Council 2024-03566
SynMix: Improving mechanistic understanding of chemical mixtures using large-scale cell profiling in an automated laboratory
Swedish Research Council for Environment Agricultural Sciences and Spatial Planning
Swedish Cancer Foundation (22 2412 Pj 03 H)
BUILDING A SUSTAINABLE EUROPEAN INNOVATION PLATFORM TO ENHANCE THE REPURPOSING OF MEDICINES FOR ALL
European Commission