I am a computational biologist working on genomics, epigenomics and transcriptomics. I use R primarily for data wrangling and visualization in the tidyverse ecosystem; I use Python for writing Snakemake workflows and reformatting data; I am a Unix geek learning shell tricks almost every month; I care about reproducible research and open science.
I also have a great interest in promoting open science and reforming bioinformatics education. I frequently share my thoughts on Twitter and tips on my blog. I am a certified instructor for The Carpentries.
Training in a wet lab at the University of Florida during my PhD in Dr. Jianrong Lu’s lab gave me solid knowledge and skills in experimental molecular cancer biology. Self-teaching and postdoctoral training in Dr. Roel Verhaak’s lab extended my bioinformatics skills to the integrative analysis of terabyte-scale sequencing data sets. The Verhaak lab is well known for studying genomic alterations of brain tumors by analyzing large panels of RNA-seq and DNA-seq data. There I gained extensive experience in handling large-scale genomic data and building pipelines and workflows. I also became closely familiar with public data sets such as ENCODE, TCGA and CCLE. I have put my analysis notes and Snakemake pipelines for processing whole-exome and whole-genome DNA-seq, RNA-seq, single-cell RNA-seq, ChIP-seq, ATAC-seq and RRBS data in my GitHub repos.
Roel moved to The Jackson Laboratory for Genomic Medicine in October 2016. To apply my skills to translational lung cancer research, I briefly joined Dr. Andrew Futreal and Dr. Jianjun Zhang’s lab to study tumor heterogeneity in lung cancer by analyzing multi-region whole-exome sequencing data and DNA methylation array data. I then worked as a research scientist at MD Anderson in Kunal Rai’s lab, studying cancer epigenomics.
I joined the Harvard Faculty of Arts and Sciences informatics team on October 1, 2018 as a bioinformatics scientist, working closely with Dr. Dulac’s lab to catalog and understand the diversity and function of cell types in the mouse brain using single-cell RNA-seq and other cutting-edge techniques.
I am now a senior scientist at the Dana-Farber Cancer Institute. My current focus is using genomics, (single-cell) transcriptomics and (single-cell) epigenomics to study cancer immunology. I also lead the bioinformatics effort for the Cancer Immunologic Data Commons (CIDC) project.
PhD in Genetics and Genomics, 2014
University of Florida
BS in Biotechnology, 2008
Shanghai Jiaotong University
A Snakemake pipeline to split scATAC-seq BAM files by cluster, make bigWig tracks, call peaks and recount
An R package for evaluating and visualizing scRNA-seq cluster stability
Find my CV in PDF here.