High definition spatial transcriptomic profiling of immune cell populations in colorectal cancer

Michelli F. Oliveira^†, Juan P. Romero^†, Meii Chung^†, Stephen Williams, Andrew D. Gottscho, Anushka Gupta, Sue Pilipauskas, Syrus Mohabbat, Nandhini Raman, David Sukovich, David Patterson, Visium HD Development Team, Sarah E. B. Taylor^‡

^† These authors contributed equally to this work

^‡ Corresponding author

Abstract

A comprehensive understanding of cellular behaviour and response to the tumor microenvironment (TME) in colorectal cancer (CRC) remains elusive. Here, we introduce the high definition Visium spatial transcriptomics technology (Visium HD) and investigate formalin fixed paraffin embedded (FFPE) human CRC samples. We demonstrate the high sensitivity, single cell-scale resolution, and spatial accuracy of Visium HD, generating a highly refined whole transcriptome spatial profile of CRC samples. We identify transcriptomically distinct macrophage subpopulations in different spatial niches with potential pro- and anti-tumor functions via interactions with tumor and T cells. In situ gene expression analysis validates our findings and localizes a clonally expanded T cell population close to macrophages with anti-tumor features. Our study demonstrates the power of high-resolution spatial technologies to understand cellular interactions in the TME, and paves the way for larger studies that will unravel mechanisms and biomarkers of CRC biology, improving diagnosis and disease management strategies.

Data

The full dataset used in this repository and in the manuscript can be downloaded from the following link: Dataset. Raw data has also been deposited at GEO under accession number GSE280318.

Repository

This repository contains the scripts to replicate the findings displayed in the manuscript. It is organized into two folders, Figures and Methods. The Figures folder has the scripts to replicate the figures in the manuscript, and the files are named accordingly. The Methods folder contains the custom methods developed for the manuscript.

The Figures files require specific outputs generated with the Methods scripts.

Methods

In this section we provide a description and a quick-start guide for the different methods used and developed for the manuscript.

AuxFunctions.R

R script with multiple custom R functions used in the manuscript. To load all the functions, we use the source function:

  source("~/HumanColonCancer_VisiumHD/Methods/AuxFunctions.R")

FlexSingleCell.R

R script used to process the FLEX single cell data. It takes the outputs from cellranger aggr.

Given the dataset's large size, we adopted the sketch-based analysis approach in Seurat¹ v5, sampling 15% of the entire dataset (~37,000 cells) for downstream analysis. After completing the analysis on the subsampled data, we extended it to the entire single cell dataset.

The script saves the full processed Seurat object and the Metadata for plotting purposes in the Figure scripts.

saveRDS(ColonCancer_Flex,file='~/Outputs/Flex/FlexSeuratV5.rds') # Full Seurat Object
saveRDS(ColonCancer_Flex@meta.data,file='~/Outputs/Flex/FlexSeuratV5_MetaData.rds') #Meta Data

Deconvolution.R

R script used to run spaceXR² for deconvolution. It requires the UMI count matrix from cellranger aggr and the MetaData generated with the FlexSingleCell.R script to generate the reference. For Visium HD the Space Ranger outs are also required.

Due to the number of barcodes in Visium HD, we modified the source code of spaceXR to improve runtime. The modified version can be found in the following Pull Request. However, the original version can also be used to deconvolve the Visium HD data.

In the script we use sample P1CRC as a template to run the algorithm, but it can also be used for any other sample.

NucleiSegmentation.py

Python script used to run nuclei segmentation on the H&E images of the tissue sections processed with Visium HD. To create the conda environment, please see the yml section.

The script takes an H&E image as input and performs nuclei segmentation on the full section using the stardist³ package. The user can provide a set of coordinates to generate a crop of the image along with the segmentation masks located within that region. The user also provides the path to the Space Ranger outputs directory for a given bin size (e.g. 2 µm), and the script writes a .csv file that maps every barcode located within a segmentation mask to its nucleus.
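The barcode assignment step can be pictured as a lookup of each bin's position in the nuclei label image. The sketch below is a simplified illustration under stated assumptions, not the code in NucleiSegmentation.py: it assumes a label image in which 0 is background and each nucleus has a positive integer ID, and that bin centers have already been converted to pixel (row, column) coordinates. The function name, variable names and toy barcodes are hypothetical.

```python
def map_barcodes_to_nuclei(labels, bin_centers):
    """Assign each barcode to the nucleus whose mask covers its bin center.

    labels      : 2D list of ints (assumed: 0 = background, >0 = nucleus ID)
    bin_centers : dict of barcode -> (row, col) pixel coordinates
    Returns a dict of barcode -> nucleus ID for bins that fall inside a mask.
    """
    n_rows = len(labels)
    n_cols = len(labels[0]) if n_rows else 0
    mapping = {}
    for barcode, (row, col) in bin_centers.items():
        r, c = int(round(row)), int(round(col))
        # Skip bins whose center lies outside the image.
        if 0 <= r < n_rows and 0 <= c < n_cols:
            nucleus_id = labels[r][c]
            # Keep only bins that sit on a nucleus, not on background.
            if nucleus_id > 0:
                mapping[barcode] = nucleus_id
    return mapping


# Toy 4x4 label image with two nuclei (IDs 1 and 2); values are made up.
toy_labels = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 2],
    [0, 0, 2, 2],
]
toy_bins = {"bin_a": (0, 1), "bin_b": (2, 0), "bin_c": (3, 3), "bin_d": (9, 9)}
nuclei_map = map_barcodes_to_nuclei(toy_labels, toy_bins)
print(nuclei_map)
```

In this toy case only the bins whose centers land on a labeled pixel are kept; bins over background or outside the image are dropped.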

The script can be called as follows:

python ./HumanColonCancer_VisiumHD/Methods/NucleiSegmentation.py -i ./PATH_TO_HE_image -r1 rowmin -r2 rowmax -c1 colmin -c2 colmax -s ./PATH_TO_SR_outs/binned_outputs/square_002um/ -o Output_directory

More details on the required inputs:

    parser.add_argument('-i','--image', type=str, help='Path to HE image')
    parser.add_argument('-r1','--rmin', type=int, help='row min for zoom in')
    parser.add_argument('-r2','--rmax', type=int, help='row max for zoom in')
    parser.add_argument('-c1','--cmin', type=int, help='column min for zoom in')
    parser.add_argument('-c2','--cmax', type=int, help='column max for zoom in')
    parser.add_argument('-s','--srdir', type=str, help='Path to spaceranger outs at a given bin size')
    parser.add_argument('-o',"--out", type=str,help="Directory where to save outputs")
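The argument definitions above can be assembled into a standalone parser to check a command line before launching a long segmentation run. The following is a minimal sketch rather than the script's actual code: the `build_parser` helper and the example paths are hypothetical, and only the flag names, types and help strings come from the listing above.

```python
import argparse


def build_parser():
    """Recreate the command-line interface listed above (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Nuclei segmentation inputs")
    parser.add_argument('-i', '--image', type=str, help='Path to HE image')
    parser.add_argument('-r1', '--rmin', type=int, help='row min for zoom in')
    parser.add_argument('-r2', '--rmax', type=int, help='row max for zoom in')
    parser.add_argument('-c1', '--cmin', type=int, help='column min for zoom in')
    parser.add_argument('-c2', '--cmax', type=int, help='column max for zoom in')
    parser.add_argument('-s', '--srdir', type=str,
                        help='Path to spaceranger outs at a given bin size')
    parser.add_argument('-o', '--out', type=str,
                        help='Directory where to save outputs')
    return parser


# The paths and coordinates below are placeholders, not files from the repository.
args = build_parser().parse_args([
    '-i', 'HE_image.tif',
    '-r1', '1000', '-r2', '2000',
    '-c1', '500', '-c2', '1500',
    '-s', 'outs/binned_outputs/square_002um/',
    '-o', 'Output_directory',
])
print(args.rmin, args.rmax, args.cmin, args.cmax)
```

Because the crop coordinates are declared with `type=int`, a non-numeric value is rejected by argparse before any image is loaded.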

Outputs:

Nuclei_Barcode_Map.csv : CSV file with the nuclei and barcode relationship for the full section
labels_FullSection.pckl : Labels of the identified nuclei for the full section
polys_FullSection.pckl : Coordinates of the identified polygons for the full section
img_rois_Stardist_Subset.zip : If coordinates are given, segmented nuclei for the selected region. Can be visualized with QuPath⁴
img_Stardist.tif : If coordinates are given, TIFF file with the selected zoom-in region. Can be visualized with QuPath⁴

environment_nucleisegmentation.yml

yml file to create a conda environment with all the required dependencies for the NucleiSegmentation.py script. To create the environment using the provided file:

conda env create --name NucleiSeg --file=./HumanColonCancer_VisiumHD/Methods/environment_nucleisegmentation.yml

To activate the environment:

conda activate NucleiSeg

MetaData

The MetaData folder contains files with the associated metadata used in the manuscript.

Single Cell

The SingleCell_MetaData.csv.gz contains the following columns:

Barcode : cell barcode
Patient : Patient of origin
BC : Probe barcode to identify sample of origin
QCFilter : Binary column denoting if a cell was kept or removed during QC
Level1 : Level 1 cell type annotation
Level2 : Level 2 cell type annotation
UMAP1 : UMAP dimension 1 coordinates
UMAP2 : UMAP dimension 2 coordinates
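For users working in Python, the column layout above can be explored with the standard library alone. This is a minimal sketch, assuming the file is comma-separated with a header row that matches the column names listed; the `count_cells_by` helper is hypothetical and is not part of the repository.

```python
import csv
import gzip
from collections import Counter


def count_cells_by(path, column="Level1"):
    """Count rows of a gzipped CSV file per value of `column`.

    Assumes a comma-separated file whose header row contains `column`.
    """
    with gzip.open(path, mode="rt", newline="") as handle:
        reader = csv.DictReader(handle)
        return Counter(row[column] for row in reader)


# Example call (path assumed relative to the repository root):
# level1_counts = count_cells_by("MetaData/SingleCell_MetaData.csv.gz")
# print(level1_counts.most_common(5))
```

Passing `column="Patient"` or `column="Level2"` gives the same kind of summary per patient or per finer annotation level.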

Visium HD

The parquet files (e.g. P1CRC_Metadata.parquet) can be opened in R using the following code:

  library(arrow)
  Data<-read_parquet("~/HumanColonCancer_VisiumHD/MetaData/P1CRC_Metadata.parquet")

These parquet files contain the following columns:

barcode : 8um bin barcode
tissue : Binary column denoting if the bin is under tissue or not
X : Spatial X coordinate
Y : Spatial Y coordinate
DeconvolutionClass : Deconvolution class for the bin (singlet, doublet, doublet_certain, doublet_uncertain or reject)
DeconvolutionLabel1 : Gives the first cell type predicted on the bin
DeconvolutionLabel2 : Gives the second cell type predicted on the bin (Not valid for reject or doublet_uncertain)
Periphery : Indicates if the bin is in the 50 micron tumor periphery, in the tumor or rest of the tissue
UnsupervisedL1 : Merged unsupervised clustering annotation (Level 1)
UnsupervisedL2 : Merged unsupervised clustering annotation (Level 2)
MacrophageSubtype : Subtype of macrophage (SELENOP+ or SPP1+) in the tumor periphery
GobletSubcluster : Goblet subcluster used in Figure 5
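As an illustration of how these columns fit together, the sketch below counts singlet bins per region and predicted cell type. It is a hedged example, not code from the repository: it works on plain dictionaries, one per bin, which in Python could be obtained with, for instance, `pyarrow.parquet.read_table(path).to_pylist()` (not exercised here). The function name and the toy values for Periphery and DeconvolutionLabel1 are made up; only the column names and the "singlet" class come from the list above.

```python
from collections import Counter


def singlet_counts_by_region(rows):
    """Count singlet bins per (Periphery, DeconvolutionLabel1) pair.

    rows : iterable of dicts keyed by the metadata column names.
    """
    counts = Counter()
    for row in rows:
        # Keep only bins classified as a single cell type.
        if row["DeconvolutionClass"] == "singlet":
            counts[(row["Periphery"], row["DeconvolutionLabel1"])] += 1
    return counts


# Toy rows; region and cell type values are hypothetical placeholders.
toy_rows = [
    {"DeconvolutionClass": "singlet", "Periphery": "region_A", "DeconvolutionLabel1": "type_1"},
    {"DeconvolutionClass": "singlet", "Periphery": "region_A", "DeconvolutionLabel1": "type_1"},
    {"DeconvolutionClass": "reject", "Periphery": "region_A", "DeconvolutionLabel1": "type_2"},
    {"DeconvolutionClass": "singlet", "Periphery": "region_B", "DeconvolutionLabel1": "type_2"},
]
region_counts = singlet_counts_by_region(toy_rows)
print(region_counts)
```

Restricting to singlets avoids interpreting DeconvolutionLabel2, which is not valid for every class.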

Figures

The Figures folder contains all the scripts to create the figures used in the manuscript. Most of the scripts within this folder require outputs generated from the Methods section.

The required R packages are common across the files:

library(Seurat)
library(scattermore)
library(tidyverse)
library(data.table)
library(wesanderson)
library(patchwork)
library(RColorBrewer)
library(furrr)
library(paletteer)
library(arrow)
library(pheatmap)
library(distances)
library(rhdf5)
library(glue)
library(Matrix)
library(ggpubr)
library(ggeasy)

Each file starts with a data.frame that can be used as a template to generate the output for the different sections. We use P1CRC as a template, but it can be replaced with any other section.

SampleData<-data.frame(Patient = "PatientCRC1", # Name of the Sample
                       PathSR="~/VisiumHD/PatientCRC1/outs/", # Path to space ranger outs folder
                       PathDeconvolution="~/Outputs/Deconvolution/PatientCRC1_Deconvolution_HD.rds") #spaceXR Deconvolution results

References

1. Hao, Yuhan, et al. "Dictionary learning for integrative, multimodal and scalable single-cell analysis." Nature Biotechnology 42.2 (2024): 293-304. Paper
2. Cable, Dylan M., et al. "Robust decomposition of cell type mixtures in spatial transcriptomics." Nature Biotechnology 40.4 (2022): 517-526. Paper
3. Weigert, Martin, and Uwe Schmidt. "Nuclei instance segmentation and classification in histopathology images with stardist." 2022 IEEE International Symposium on Biomedical Imaging Challenges (ISBIC). IEEE, 2022. Paper
4. Bankhead, Peter, et al. "QuPath: Open source software for digital pathology image analysis." Scientific Reports 7.1 (2017): 1-7. Paper

bioRxiv version

Here we list the git tag and link to the preprint initially submitted to bioRxiv
