Cancer | Literature | PMID | Sequencing platform | Cell number | Data processing | Resource |
---|---|---|---|---|---|---|
Breast cancer | A single-cell and spatially resolved atlas of human breast cancers | 34493872 | Illumina NextSeq 500 | 130,246 | The EmptyDrops method from the DropletUtils package was applied for cell filtering with additional cutoffs for cells with a gene and unique molecular identifier (UMIs) count greater than 200 and 250, respectively, and a mitochondrial percentage less than 20%. |
GSE176078 |
Non-small cell lung cancer | Global characterization of T cells in non-small-cell lung cancer by single-cell sequencing | 29942094 | Illumina Hiseq 2500 or Illumina Hiseq 4000 | 12,346 | Low-quality cells were discarded if the cell library size or the number of expressed genes (counts larger than 0) was smaller than pre-defined thresholds, which were the medians of all cells minus 3 × median absolute deviation. Cells were also removed if their proportions of mitochondrial gene expression were larger than 10%. Only cells with the average TPM of CD3D, CD3E and CD3G larger than 10 were kept for subsequent analysis. |
GSE99254 |
Lung cancer | Integrated single-cell RNA sequencing analysis reveals distinct cellular and transcriptional modules associated with survival in lung cancer | 35027529 | Illumina NovaSeq 6000 | 220,716 | Samples with less than 500 cells were removed. Cells were required to have more than 1000 UMIs and only genes with more than 1000 UMIs across all cells were kept for further analyses. |
http://lungcancer.chenlulab.com/#/download |
Lung adenocarcinoma | Single-cell RNA sequencing reveals distinct tumor microenvironmental patterns in lung adenocarcinoma | 34663877 | Illumina HiSeq 4000 | 114,489 | Transcriptomes were filtered for cells with 500–10,000 genes detected, 1000–100,000 UMIs counted, fraction of mitochondrial reads <30%, and fraction of hemoglobin reads <5%. |
Code Ocean capsule from 10.24433/CO.0121060.v1. |
Lung cancer | Therapy-Induced Evolution of Human Lung Cancer Revealed by Single-Cell RNA Sequencing | 32822576 | Illumina NextSeq or NovaSeq 6000 | 23,261 | Standard procedures for filtering were performed using the Seurat v3 using R, where cells with fewer than 500 genes and 50,000 reads were excluded. DoubletFinder was used to identify potentially sorted doublet cells. |
NCBI BioProject #PRJNA591860 |
Gastric cancer | Single-cell RNA sequencing reveals a pro-invasive cancer-associated fibroblast subgroup associated with poor clinical outcomes in patients with gastric cancer | 34976204 | Illumina HiSeq 4000 | 36,897 | Cells with fewer than 400 expressed genes, as well as genes expressed in less than four cells, were removed. |
wxy@ibms.pumc.edu.cn |
Gastric cancer | Single-Cell Genomic Characterization Reveals the Cellular Reprogramming of the Gastric Tumor Microenvironment | 32060101 | Illumina sequencer | 56,167 | Cells that expressed fewer than 200 genes, had greater than 20% mitochondrial genes or had number of UMI in an outlier range indicative of potential doublets were removed. The authors also excluded genes detected in fewer than three cells. |
genomics_ji@stanford.edu |
Hepatocellular carcinoma | Single-cell landscape of the ecosystem in early-relapse hepatocellular carcinoma | 33357445 | BGISEQ500 | 16,498 | The authors defined genes with TPM > 1 as detected genes. To filter out low-quality cells they set the following criterion: 1). Mapping reads ≥ 1 M; 2). Mapping rate ≥ 30%; 3). 1,500 ≤ detected genes number ≤ 10,000. |
fan.jia@zs-hospital.sh.cn |
Liver cancer | A single cell atlas of the human liver tumor microenvironment | 33332768 | NextSeq 550 | 7,947 | Cells with UMI counts below 200 or higher than 3,000 or mitochondrial content above 35% were removed. |
GSE146409 |
Pancreatic ductal adenocarcinoma | Single-cell RNA-seq highlights intra-tumoral heterogeneity and malignant progression in pancreatic ductal adenocarcinoma | 31273297 | Illumina HiSeq X Ten | 57,530 | Low quality cells (<200 genes/cell, <3 cells/gene and >10% mitochondrial genes) were excluded. |
GSA:CRA001160 |
Prostate cancer | Single-cell analysis of human primary prostate cancer reveals the heterogeneity of tumor-associated epithelial cell states | 35013146 | Seq-Well | 21,743 | Cells with less than 300 genes, 500 transcripts, or a mitochondrial level of 20% or greater, were filtered out. Then, an upper threshold for the number of genes per cell in each individual sample was set in order to filter potential doublets. |
GSE176031 |
Renal cell carcinoma | Identification of a novel cancer stem cell subpopulation that promotes progression of human fatal renal cell carcinoma by single-cell RNA-seq analysis | 33162821 | Illumina Hiseq X | 15,208 | To guarantee the quality of sequencing, the cells with <200 or > 5000 genes were depleted from the original data |
cuixingang@smmu.edu.cn |
Renal cell carcinoma | Single-cell transcriptomics reveals a low CD8+ T cell infiltrating state mediated by fibroblasts in recurrent renal cell carcinoma | 35121646 | Illumina NovaSeq 6000 | 32,073 | Low-quality cells were removed following 3 measurements: 1) cells had either fewer than 200 or over 6000 unique molecular identifiers (UMIs), over 20,000 or less than 200 expressed genes or over 15% UMIs derived from the mitochondrial genome, or over 2.5% UMIs derived from the erythrocytic genome; 2) cells had an average expression level of less than 2 for a curated list of housekeeping genes; 3) cells had a co-expression of EPCAM and PTPRC. 4) Doublets were detected by DoubletFinder R package for single sample and manually detected the doublets in re-clustering the cell types. |
zhangzhl@sysucc.org.cn |
Colorectal cancer | Multiregion single-cell sequencing reveals the transcriptional landscape of the immune microenvironment of colorectal cancer | 33463049 | BGISEQ500 | 15,115 | Cells with less than 500 genes (TPM > 1) or over 20% TPM derived from the mitochondrial genome were removed. |
CNGB Nucleotide Sequence Archive; CNP0000916 |
Head and neck squamous cell carcinoma | Investigating immune and non-immune cell interactions in head and neck tumors by single-cell RNA sequencing | 34921143 | Illumina NextSeq 500/550 | 134,606 | Based on the QC metrics suggested in the Scanpy tutorial, cells with less than 200 genes expressed were filtered out. Cells expressing more than 5000 genes, and more than ten percent mitochondrial genes were also removed. Genes expressed in less than 3 cells were also filtered out of the analysis. |
NCBI Sequence Read Archive: accession ID SRP301444. |
Head and neck squamous cell carcinoma | Immune Landscape of Viral- and Carcinogen-Driven Head and Neck Cancer | 31924475 | Illumina NextSeq 500 | 131,224 | After creation of the gene/barcode matrix, a cell-level filtering step was performed to remove cells with either few genes per cell (<200) or many molecules per cell (>20,000). Next, genes that were lowly expressed (fewer reads than 3 counts in 1% of cells, or genes expressed in fewer than 1% of cells) across all samples were removed. |
GSE139324 |
Nasopharyngeal carcinoma | Tumour heterogeneity and intercellular networks of nasopharyngeal carcinoma at single cell resolution | 33531485 | Illumina HiSeq X Ten | 176,447 | The R package “DoubletFinder” was applied to predict doublets in the data. The authors removed doublets in each sample individually, with an expected doublet rate of 0.05 and default parameters used otherwise. Next, any cells were removed for which had either less than 101 UMIs, or expression of less than 501 genes, or over 15% UMIs linked to mitochondrial genes. |
GSE162025 |
Neuroblastoma | Single-cell transcriptomic analyses provide insights into the developmental origins of neuroblastoma | 33767450 | Illumina NextSeq 500 | 100,337 | The R package Seurat was used to calculate the quality control metrics35. Cells were removed from the analysis if fewer than 500 distinct genes, 1,000 counts or more than 2.5% of reads mapping to mitochondrial genes were detected, for data generated with the Chromium Next GEM Single Cell 3' Kit v.3.1 (10x Genomics). For the Chromium Single Cell 3' Kit v.2 (10x Genomics) data, cells with fewer than 300 distinct genes, 1,000 counts or more than 2.5% of reads mapping to mitochondrial genes were filtered. Doublets were detected and filtered using the R package DoubletFinder with default settings. Genes that were expressed in fewer than three cells were excluded. |
GSE163431 |
Esophageal squamous cell carcinoma | Dissecting esophageal squamous-cell carcinoma ecosystem by single-cell transcriptomic analysis | 34489433 | Illumina HiSeq X Ten | 208,659 | For quality filtering, the authors removed genes whose expressions were detected in <0.1% of all cells and filtered out cells that had gene counts <500 or mitochondrial RNA content >20%. The Seurat package (version 2.3.4) was used for quality filtering. |
GSE160269 |
Esophageal squamous cell carcinoma | Integrated single-cell transcriptome analysis reveals heterogeneity of esophageal squamous cell carcinoma microenvironment | 34921160 | Illumina Hiseq X (PE150) | 62,161 | Potential doublets were detected and filtered using DoubletFinder based on the expression proximity of each cell to artificial doublets. Further, cells with high mitochondrial content (>= 20%) were removed. |
Sequence Read Archive (SRA) under accession number PRJNA777911. |
Cervical cancer | Single-Cell RNA Sequencing Reveals Multiple Pathways and the Tumor Microenvironment Could Lead to Chemotherapy Resistance in Cervical Cancer | 34900703 | Illumina NovaSeq 6000 | 24,371 | The number of unique molecular identifiers (UMIs), the number of genes, and the percentage of mitochondrial genes were examined for quality control. Cells expressing <500 or >4,000 genes (potential cell duplets) and gene expression not detected in fewer than three cells were trimmed from the library. |
shenchao@whu.edu.cn |
Multiple myeloma | Single-cell RNA sequencing infers the role of malignant cells in drug-resistant multiple myeloma | 34918874 | Illumina HiSeq X Ten | 52,793 | To obtain cells with high quality, the ratio of mitochondria lower than 0.2 and cells with genes over 2000 were maintained. |
wangliangtrhos@126.com |
Endometrial carcinoma | Phenotyping of immune and endometrial epithelial cells in endometrial carcinomas revealed by single-cell RNA sequencing | 33429363 | Illumina HiSeq X Ten | 30,780 | Genes detected in < 3 cells and cells where < 100 genes had nonzero counts were excluded. Low-quality cells that had > 5% mitochondrial genes were discarded. |
The SRA accession number is PRJNA650549. |
Osteosarcoma | Single-cell RNA landscape of intratumoral heterogeneity and immunosuppressive microenvironment in advanced osteosarcoma | 33303760 | Illumina HiSeq X | 100,987 | The cells with no. of expressed genes <300 genes or the percent of mitochondrial genes over 10% of total expressed genes were filtered out. Further, the DoubletFinder package of the R was used to remove the potential doublets (and to an even lesser extent of higher-order multiplets) that occurred in the encapsulation step and/or as occasional pairs of cells that were not dissociated in sample preparation. |
GSE152048 |
Ovarian cancer | Identification of grade and origin specific cell populations in serous epithelial ovarian cancer by single cell RNA-seq | 30383866 | Illumina NextSeq 500 | 2,911 | The R software package Seurat was used for further analysis. Genes were initially filtered on expression in at least three cells and each cell needed to have at least 200 genes expressed. |
GSE118828 |
Uveal melanoma | Single-cell analysis reveals new evolutionary complexity in uveal melanoma | 31980621 | Illumina NextSeq 500 | 59,915 | Filtering was conducted by retaining cells that had unique molecular identifiers (UMIs) greater than 400, expressed 100 and 8000 genes inclusive, and had mitochondrial content less than 10 percent. |
GSE139829 |
T-cell lymphoma | Single-cell RNA sequencing reveals markers of disease progression in primary cutaneous T-cell lymphoma | 34583709 | Illumina NovaSeq 6000 | 47,172 | The command “doubletCells” simulates thousands of doublets by adding together two randomly chosen single cell profiles. For each cell the number of simulated doublets in the neighborhood was recorded and used as input to calculate a doublet score. Threshold to filter putative doublets was set to three times the median absolute deviation of the doublet score and all cells with a higher score were discarded. |
GSE173205 |
Thyroid cancer | Characterizing dedifferentiation of thyroid cancer by integrated analysis | 34321197 | Illumina NovaSeq | 46,205 | Several criteria were set to filter low-quality cells and genes: minimal expression of 200 genes per cell, mitochondrial content less than 15%, and genes that are expressed in more than 3 cells. |
Access number: HRA000686, https://bigd.big.ac.cn/gsa-human/browse/. |
On the Browse page, users can browse SCancerRNA by clicking on diagrams related to the categories (RNA type, biological function, clinical application and tissue) listed above. The result page is shown in the figure below.
Biomarker Result
1. The results for non-coding biomarkers are displayed.
2. Each entry includes the name of the non-coding RNA biomarker, the RNA type of the biomarker, the type of the cancer and the testing methods of the non-coding RNA biomarker.
3. Users can explore whether this biomarker is related to biological function and clinical application through T or F. The specific biological functions (cell proliferation, growth, apoptosis, autophagy and epithelial mesenchymal transformation) and clinical applications (migration, metastasis, circulation, survival and recurrence) of biomarkers can be checked by clicking ‘Detail’ button.
T:This biomarker is associated with this listed biological function or clinical application.
F:This biomarker is not associated with this listed biological function or clinical application.
4. Users are allowed to acquire more detailed information in the original literature corresponding to the biomarker by clicking the 'PMID' link.
5. Users can click the ‘more details’ button to check detailed information for the ncRNA biomarker.
6. By clicking on the network logo, the interaction network of different types of ncRNA biomarkers will be shown.
7. Biomarker results can be downloaded in excel or csv format.
8. Input an interested non-coding RNA biomarker for search.
Single cell Result
1. The results of single-cell sequencing analysis for the corresponding genes of the biomarkers are displayed.
2. Each entry includes the name of the gene, the corresponding biomarker, the RNA type of the biomarker and the cancer implicated in single-cell sequencing analysis.
3. Users can explore the average log2 fold change value, adjusted p-value and description of the gene in the differential expression analysis at the single-cell level.
4. Users can obtain sequencing platform information and quality control steps in single-cell sequencing analysis.
5. Users are allowed to acquire more detailed information in the original literature by clicking the 'PMID' link.
6. Single cell results can be downloaded in excel or csv format.
7. Input an interested gene or RNA biomarker for search.
1. Users are allowed to search for non-coding RNA biomarkers by RNA name or gene name.
2. Input an interested cancer type to search for non-coding biomarkers.
3. Select some interested biological functions for advanced search.
4. Select some interested clinical applications for advanced search.
SCancerRNA provides two modules on the ‘single cell’ page, which allows users to easily access biomarkers associated with genes of interest and to discover single-cell expression data associated with specific cancers.
1. By searching for a gene in the search bar on the right side of the module, users are able to obtain the differential expression data for the gene in different cancers and different cell types and the SCancerRNA link of the corresponding biomarker.
2. In the ‘Biomarker in single cell’ module, results can be downloaded in excel or csv format.
3. Users are able to select a cancer type in the cancer drop-down bar on the right to obtain differential expression data for genes associated with the selected cancer at the single-cell level.
4. Input an interested gene or RNA biomarker for search.
5. In the ‘Cancer’ module, results can be downloaded in excel or csv format.
The visualization of detailed statistics of SCancerRNA is provided in the "Statistics" page.
Users can explore the data through visualizations according to their needs.
All the data from the SCancerRNA database can be accessed on the ‘Download’ page.
Users can click the arrow symbol next to the file to download their interested data.
Users need to input their data into corresponding blanks and submit. Users can also select the biological functions and clinical applications of each biomarker to provide more detailed and comprehensive information for SCancerRNA.
We will further curate the submitted information to determine whether to add the new entries to the database or not.