scMIR Project

Pri-miRNA quantification from single-cell RNA-seq datasets

miRNA genes are transcribed into primary transcripts, called pri-miRNAs, in much the same way as protein-coding genes. We used the annotation developed in Bouvy-Liivrand et al., 2017 and extended in Turunen et al., 2021 to quantify pri-miRNA transcripts from single-cell RNA-seq datasets (Hernández de Sande et al., 2023, bioRxiv, https://doi.org/10.1101/2023.10.09.561173).

You can find code related to the analysis in this repository. The data objects with miRNA gene counts and cell type metadata are hosted at https://kana.rahtiapp.fi/ for interactive analysis.

STEP 0. Custom Reference Genome.

A new GTF file was generated by combining the annotated genes with the pri-miRNAs, and then filtered for the Cell Ranger pipeline with the cellranger mkgtf command. Finally, (i) the custom GTF file and (ii) the FASTA file of the chosen genome were used to build a new custom genome with the cellranger mkref command.
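The reference build described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the exact commands used in the study: every file name, the genome name, and the gene_biotype values passed to mkgtf are hypothetical placeholders, and the attribute key depends on the annotation source. The script only prints the commands so they can be reviewed first.

```shell
# Sketch only: prints the commands for building the custom reference.
# All file names, the genome name and the biotype values are hypothetical.
genes_gtf="genes.gtf"                      # annotated genes (placeholder)
primir_gtf="pri_mirnas.gtf"                # pri-miRNA annotation (placeholder)
combined_gtf="genes_primirs.gtf"           # combined annotation
filtered_gtf="genes_primirs.filtered.gtf"  # output of cellranger mkgtf
genome_fasta="genome.fa"                   # FASTA of the chosen genome (placeholder)
genome_name="genome_primirs"               # name of the output reference folder

reference_commands() {
  # 1. Combine gene and pri-miRNA annotations into one GTF.
  echo "cat $genes_gtf $primir_gtf > $combined_gtf"
  # 2. Filter the combined GTF. The biotype values are assumptions and must
  #    match the attributes actually present in both annotations.
  echo "cellranger mkgtf $combined_gtf $filtered_gtf --attribute=gene_biotype:protein_coding --attribute=gene_biotype:pri_miRNA"
  # 3. Build the reference from the filtered GTF and the genome FASTA.
  echo "cellranger mkref --genome=$genome_name --fasta=$genome_fasta --genes=$filtered_gtf"
}

reference_commands
```

Once the printed commands look right for your files, they can be executed by piping the output to a shell, for example `reference_commands | bash`.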

1. Atlas Dataset Analysis

Tabula Muris Senis dataset (TMS).

This consortium generated single-cell RNA-seq libraries prepared either with 10x Genomics droplet technology or by sorting the cells and capturing RNA-seq profiles with Smart-seq2 technology. In this example analysis we used data from Spleen, Liver, Heart and Aorta, Fat, and Bone Marrow at 1, 3, 18, 21, 24 and 30 months of age, as available.

  1. TMS datasets were downloaded from the Amazon cloud using the following commands:
  1. Quantification of scRNA-seq datasets using the custom genome containing genes and pri-miRNAs.

# Set these variables before running:
cellranger_folder=/cellranger-3.1.0   # Cell Ranger installation folder
# fastqs      comma-separated list of folders containing the FASTQs corresponding to the samples ("sample.id" in TMS_Datasets_Summary.cvs)
# id          sample name; the output folder is named ${id}_primirs
# ref         custom genome
# localcores  CPUs used to run the analysis
# localmem    local memory (in GB) reserved for running the analysis

"${cellranger_folder}/cellranger" count --id="${id}_primirs" --transcriptome="$ref" --fastqs="$fastqs" --sample="$id" --localcores="$localcores" --localmem="$localmem" --disable-ui
  1. Both the 10x and the plate-based (Smart-seq2) datasets were filtered (1_filtering_droplet.ipynb and 1_filtering_plate_seq.ipynb), and cell type markers were detected using Scanpy. Old and young cells were compared by testing for differences in their gene expression distributions with scDD.
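To run the quantification step for several samples, the cellranger count call can be wrapped in a small loop. The sketch below is only an illustration: the sample IDs, the one-folder-per-sample FASTQ layout, the reference path, and the core and memory values are all hypothetical placeholders. It prints one command per sample instead of executing it.

```shell
# Sketch only: prints one `cellranger count` call per sample.
# Paths, sample IDs and resource values below are hypothetical placeholders.
cellranger_folder="/cellranger-3.1.0"
ref="genome_primirs"     # custom reference folder (placeholder)
fastq_root="fastqs"      # assumed layout: one FASTQ folder per sample
localcores=8             # CPUs (placeholder)
localmem=64              # memory in GB (placeholder)

count_command() {
  # $1 = sample ID, as in the "sample.id" column of the summary table
  local id="$1"
  echo "$cellranger_folder/cellranger count --id=${id}_primirs --transcriptome=$ref --fastqs=$fastq_root/$id --sample=$id --localcores=$localcores --localmem=$localmem --disable-ui"
}

# Hypothetical sample IDs; replace with the IDs of the samples to process.
for id in sample_A sample_B; do
  count_command "$id"
done
```

Printing the commands first makes it easy to check the sample names and paths before committing cluster time; the same lines can then be submitted to a scheduler or piped to a shell.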

Repository Contents