Data generated for the study 'Robust decomposition of cell type mixtures in spatial transcriptomics.' This repository also contains data from previous studies that were analyzed in our paper.

The code used to process this data and generate the figures in the paper is available at https://github.com/dmcable/RCTD/tree/dev/AnalysisPaper.

Data files were obtained as follows:

  • Cerebellum_MappedDGEForR.csv and Cerebellum_BeadLocationsForR.csv are the new Slide-seq V2 mouse cerebellum dataset generated by this study.
  • Hippocampus_MappedDGEForR.csv and Hippocampus_BeadLocationsForR.csv are a Slide-seq V2 mouse hippocampus dataset generated by the Slide-seq V2 paper (Sensitive spatial genome wide expression profiling at cellular resolution).
  • F_GRCm38.81.P60Hippocampus.cell_cluster_outcomes.RDS.zip, SCRef_hippocampus.RDS.zip, and scRefSubsampled1000_hippocampus.RDS.zip are from the DropViz single-cell RNA-seq hippocampus dataset (http://dropviz.org/).
  • Consulting the DropViz dataset, we generated the subclusterlabels_interneurons.csv and subclusterlabels_coarse_interneurons.csv cell type annotation files for interneurons.
  • 1000cellsSubsampled_cerebellum_singlecell.RDS.zip is from the DropViz single-cell RNA-seq cerebellum dataset (http://dropviz.org/).
  • scRefSubsampled1000_cerebellum_singlenucleus.RDS.zip is from a recent 10x single-nucleus RNA-seq cerebellum study (A transcriptomic atlas of the mouse cerebellum reveals regional specializations and novel cell types).

We now provide descriptions of the files in this repository (RDS files are designed to be loaded into R):

  • Cerebellum_MappedDGEForR.csv and Hippocampus_MappedDGEForR.csv are digital gene expression (DGE) matrices representing the observed counts for each pixel and for each gene in the Slide-seq datasets.
  • Cerebellum_BeadLocationsForR.csv and Hippocampus_BeadLocationsForR.csv are matrices containing the coordinates of each pixel within the Slide-seq datasets.
  • SCRef_hippocampus.RDS, scRefSubsampled1000_hippocampus.RDS, 1000cellsSubsampled_cerebellum_singlecell.RDS, and scRefSubsampled1000_cerebellum_singlenucleus.RDS are RDS Seurat objects containing single-cell RNA-seq datasets. Several of these objects have been downsampled to at most 1000 cells per cell type. These objects contain cell type annotations, total Unique Molecular Identifier counts per cell, and raw gene counts for each cell.
  • F_GRCm38.81.P60Hippocampus.cell_cluster_outcomes.RDS is an R DataFrame containing the cell type and subtype classification for each cell in the hippocampus single-cell RNA-seq dataset. The file subclusterlabels_interneurons.csv provides a name for each interneuron subtype, and the file subclusterlabels_coarse_interneurons.csv provides the interneuron subclass name (as determined in our study) for each interneuron subtype.
  • puckCropped_cerebellum_slideseq.rds is a SpatialRNA object for the cropped Slide-seq cerebellum dataset.
  • puckCropped_hippocampus.rds is a SpatialRNA object for the cropped Slide-seq hippocampus dataset.
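
As an illustration of what "downsampled to at most 1000 cells per cell type" means for the single-cell reference objects described above, here is a minimal Python sketch on toy data. It is not the procedure used to build the files in this repository: uniform random sampling without replacement, the fixed seed, the function name downsample_per_type, and the toy labels are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def downsample_per_type(cell_types, max_cells=1000, seed=0):
    """Return sorted cell indices, keeping at most `max_cells` per cell type.

    cell_types[i] is the cell type annotation of cell i.
    Sampling is uniform without replacement (an assumption, see lead-in).
    """
    rng = random.Random(seed)
    by_type = defaultdict(list)
    for idx, label in enumerate(cell_types):
        by_type[label].append(idx)

    kept = []
    for indices in by_type.values():
        if len(indices) > max_cells:
            indices = rng.sample(indices, max_cells)
        kept.extend(indices)
    return sorted(kept)


# Toy annotations (hypothetical, not taken from the dataset).
labels = ["Granule"] * 5 + ["Purkinje"] * 2
kept = downsample_per_type(labels, max_cells=3)
print(len(kept))  # 3 Granule cells + 2 Purkinje cells are kept
```

Cell types that already have no more than max_cells cells are kept whole, so rare cell types are not affected by the downsampling.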

To run Robust Cell Type Decomposition (RCTD, https://github.com/dmcable/RCTD), a single-cell RNA-seq reference (as a Seurat RDS file) must be provided along with a spatial transcriptomics dataset. The MappedDGEForR.csv and BeadLocationsForR.csv files in this repository are correctly formatted for RCTD processing.
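
To show how a DGE matrix and a bead location file fit together, the following Python sketch parses a toy pair of tables, sums the counts per pixel, and checks that every pixel in the DGE matrix has coordinates. The column layout is an assumption, not something this description specifies: the DGE table is taken to have gene names in the first column and one column per bead barcode, and the location table is taken to have a barcode column followed by x and y coordinates. The toy barcodes, genes, and values are invented, and the helper names are hypothetical. Please inspect the actual CSV headers before adapting it.

```python
import csv
import io


def read_dge(handle):
    """Parse a gene-by-pixel count table (assumed layout, see lead-in)."""
    reader = csv.reader(handle)
    header = next(reader)
    barcodes = header[1:]  # first header cell labels the gene column
    counts = {row[0]: [int(float(v)) for v in row[1:]] for row in reader}
    return barcodes, counts


def read_locations(handle):
    """Parse barcode, x, y rows into {barcode: (x, y)}."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    return {row[0]: (float(row[1]), float(row[2])) for row in reader}


def total_counts_per_pixel(barcodes, counts):
    """Sum counts over all genes for each pixel (bead barcode)."""
    totals = [0] * len(barcodes)
    for gene_counts in counts.values():
        for j, value in enumerate(gene_counts):
            totals[j] += value
    return dict(zip(barcodes, totals))


# Toy stand-ins for *_MappedDGEForR.csv and *_BeadLocationsForR.csv.
dge_text = "GENE,AAAC,CCGT\nCalb1,3,0\nGabra6,1,4\n"
loc_text = "barcodes,xcoord,ycoord\nAAAC,10.5,20.0\nCCGT,11.0,22.5\n"

barcodes, counts = read_dge(io.StringIO(dge_text))
locations = read_locations(io.StringIO(loc_text))
totals = total_counts_per_pixel(barcodes, counts)

# Pixels present in the DGE matrix but lacking coordinates.
missing = [b for b in barcodes if b not in locations]
print(totals, missing)
```

For the real files, replace the io.StringIO objects with file handles opened with open(path, newline=""). The full Slide-seq matrices are large, so a dense pure-Python representation like this one is only suitable for small excerpts.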

RCTD results for the Slide-seq cerebellum dataset can be found in:

  • myRCTD_cerebellum_slideseq.rds: RCTD R object after cell type assignment on the Slide-seq cerebellum

Related publications
Robust decomposition of cell type mixtures in spatial transcriptomics
https://www.biorxiv.org/content/10.1101/2020.05.07.082750v1