Li Lab
The Li Lab develops novel computational tools to empower single-cell and single-nucleus multi-omics data analysis. We currently have three main research directions:
- Cloud-based analysis of large-scale single-cell and single-nucleus RNA-seq datasets;
- Cloud-based large-scale single-cell ATAC-seq data analysis;
- Cloud-based spatial transcriptomics data analysis.
In collaboration with biologists and clinicians, we also work on biological questions related to:
- Human Immune Cell Atlas;
- Human Tumor Cell Atlas;
- COVID-19 research.
Recent News
Welcome to co-op students Linhui Chen and Dongyu Zhou, who are joining our lab!
Cumulus featured workspace on Terra is updated
Cumulus is introduced at a Pfizer workshop
The Cumulus paper is published!
Welcome to co-op student Hui Ma, who is joining our lab!
Featured Publications
Cumulus provides cloud-based data analysis for large-scale single-cell and single-nucleus RNA-seq
Massively parallel single-cell and single-nucleus RNA-seq (sc/snRNA-seq) have opened the way to systematic tissue atlases in health and disease, but as the scale of data generation grows, so does the need for computational pipelines for scaled analysis. Here, we developed Cumulus, a cloud-based framework for analyzing large-scale sc/snRNA-seq datasets. Cumulus combines the power of cloud computing with improvements in algorithm implementations to achieve high scalability, low cost, user-friendliness, and integrated support for a comprehensive set of features. We benchmark Cumulus on the Human Cell Atlas Census of Immune Cells dataset of bone marrow cells and show that it substantially improves efficiency over conventional frameworks, while maintaining or improving the quality of results, enabling large-scale studies.
RSEM: Accurate Transcript Quantification from RNA-Seq Data with or without a Reference Genome
Background
RNA-Seq is revolutionizing the way transcript abundances are measured. A key challenge in transcript quantification from RNA-Seq data is the handling of reads that map to multiple genes or isoforms. This issue is particularly important for quantification with de novo transcriptome assemblies in the absence of sequenced genomes, as it is difficult to determine which transcripts are isoforms of the same gene. A second significant issue is the design of RNA-Seq experiments, in terms of the number of reads, read length, and whether reads come from one or both ends of cDNA fragments.
Results
We present RSEM, a user-friendly software package for quantifying gene and isoform abundances from single-end or paired-end RNA-Seq data. RSEM outputs abundance estimates, 95% credibility intervals, and visualization files and can also simulate RNA-Seq data. In contrast to other existing tools, the software does not require a reference genome. Thus, in combination with a de novo transcriptome assembler, RSEM enables accurate transcript quantification for species without sequenced genomes. On simulated and real data sets, RSEM has superior or comparable performance to quantification methods that rely on a reference genome. Taking advantage of RSEM’s ability to effectively use ambiguously mapping reads, we show that accurate gene-level abundance estimates are best obtained with large numbers of short single-end reads. On the other hand, estimates of the relative frequencies of isoforms within single genes may be improved through the use of paired-end reads, depending on the number of possible splice forms for each gene.
Conclusions
RSEM is an accurate and user-friendly software tool for quantifying transcript abundances from RNA-Seq data. As it does not rely on the existence of a reference genome, it is particularly useful for quantification with de novo transcriptome assemblies. In addition, RSEM has enabled valuable guidance for cost-efficient design of quantification experiments with RNA-Seq, which is currently relatively expensive.
Meet the Team
Principal Investigator
Bo Li
Assistant Professor of Medicine and Director of the Bioinformatics and Computational Biology Program
Members
Yiming Yang
Computational Scientist
Dongyu Zhou
Co-Op Student
Hui Ma
Co-Op Student
Kamil Slowikowski
Postdoctoral Fellow
Linhui Chen
Co-Op Student
Recent Publications
Inherited myeloproliferative neoplasm risk affects haematopoietic stem cells
Cumulus provides cloud-based data analysis for large-scale single-cell and single-nucleus RNA-seq
A Single-Cell and Single-Nucleus RNA-Seq Toolbox for Fresh and Frozen Human Tumors
Thousands of Novel Unannotated Proteins Expand the MHC I Immunopeptidome in Cancer
Linking Indirect Effects of Cytomegalovirus in Transplantation to Modulation of Monocyte Innate Immune Function
Contact
- bli28 [at] mgh [dot] harvard [dot] edu
- CNY 149-8, 149 13th Street, Charlestown, MA 02129, United States