clustered dotplot for single-cell RNAseq

A dotplot is a nice way to visualize scRNAseq expression data across clusters. The color of each dot shows the average expression level across the cells in a cluster, and the size of the dot shows the percentage of cells in that cluster expressing the gene. Seurat has a nice function for that. However, it cannot cluster the rows and columns. David McGaughey has written a blog post on doing this with ggplot2 and ggtree from Guangchuang Yu.
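To make the two quantities behind a dotplot concrete, here is a minimal base R sketch on simulated counts. The matrix, gene names and cluster labels are all made up for illustration, and this is not how Seurat's function or David McGaughey's post computes them; it only shows the idea of summarizing per cluster and then ordering rows and columns by hierarchical clustering.

```r
set.seed(1)
# Simulated stand-in for scRNAseq data: 300 cells x 3 hypothetical genes
expr <- matrix(rpois(900, lambda = 0.8), nrow = 300,
               dimnames = list(NULL, c("geneA", "geneB", "geneC")))
cluster <- rep(c("c1", "c2", "c3"), each = 100)
n_cells <- as.vector(table(cluster))  # cells per cluster, same order as rowsum()

# Dot color: average expression per cluster (rows = clusters, columns = genes)
avg_exp <- rowsum(expr, cluster) / n_cells

# Dot size: percentage of cells in the cluster with non-zero expression
pct_exp <- 100 * rowsum((expr > 0) * 1, cluster) / n_cells

# Orderings for a clustered dotplot: cluster the clusters and the genes
row_order <- hclust(dist(avg_exp))$order
col_order <- hclust(dist(t(avg_exp)))$order
```

The two orderings can then be used to set the factor levels of the cluster and gene axes before plotting.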

stacked violin plot for visualizing single-cell data in Seurat

In scanpy, there is a function to create a stacked violin plot. There is no such function in Seurat, and many people have been asking for this feature. The developers have not implemented it yet. In this post, I try to make a stacked violin plot in Seurat. The idea is to create one violin plot per gene using VlnPlot in Seurat, customize the axis text and ticks, reduce the margins of each plot, and finally combine the plots with cowplot::plot_grid or patchwork::wrap_plots.
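The stacking idea can be sketched without a Seurat object. The example below is only an illustration under assumptions: it uses plain ggplot2 violins on a simulated data frame (hypothetical genes geneA to geneC, hypothetical clusters c1 to c3) in place of VlnPlot, then strips the x-axis text and ticks, shrinks the margins, and combines the panels with patchwork::wrap_plots.

```r
library(ggplot2)
library(patchwork)

set.seed(42)
# Simulated stand-in for per-cell expression: 300 cells, 3 clusters, 3 genes
cells <- data.frame(
  cluster = factor(rep(c("c1", "c2", "c3"), each = 100)),
  geneA = rexp(300, rate = 1),
  geneB = rexp(300, rate = 2),
  geneC = rexp(300, rate = 0.5)
)
genes <- c("geneA", "geneB", "geneC")

# One violin plot per gene, with x-axis text/ticks removed and margins reduced
one_violin <- function(gene) {
  ggplot(cells, aes(x = cluster, y = .data[[gene]], fill = cluster)) +
    geom_violin(scale = "width") +
    ylab(gene) +
    theme_classic() +
    theme(
      legend.position = "none",
      axis.title.x = element_blank(),
      axis.text.x = element_blank(),
      axis.ticks.x = element_blank(),
      plot.margin = margin(0, 5, 0, 5)
    )
}
plots <- lapply(genes, one_violin)

# Restore the cluster labels on the bottom panel only
n <- length(plots)
plots[[n]] <- plots[[n]] +
  theme(axis.text.x = element_text(), axis.ticks.x = element_line())

# Stack the panels in a single column
stacked <- wrap_plots(plots, ncol = 1)
```

Printing `stacked` draws the panels on top of each other; with a real Seurat object, the per-gene plot would come from VlnPlot instead of the hypothetical `one_violin` helper.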

Align multiple ggplot2 plots by axis

I used to use cowplot to align multiple ggplot2 plots, but when the x-axes have different ranges, some extra work is needed to align the axes as well. The other day I was reading a blog post by Guangchuang Yu, and he tackled exactly this problem. His packages such as ChIPseeker, clusterProfiler and ggtree are quite popular among users. A dummy example from his post:

    library(dplyr)
    library(ggplot2)
    library(ggstance)
    library(cowplot)
    # devtools::install_github("YuLab-SMU/treeio")
    # devtools::install_github("YuLab-SMU/ggtree")
    library(tidytree)
    library(ggtree)
    no_legend=theme(legend.

My opinionated selection of books/urls for bioinformatics/data science curriculum

There was a paper on this topic: A New Online Computational Biology Curriculum. I am going to provide a biased list below (I have read most of the books, if not all). I say it is biased because you will see that many of the R books are by Hadley Wickham. I now use tidyverse most of the time.

Unix

I suggest that people who want to learn bioinformatics start by learning unix commands.

plot 10x scATAC coverage by cluster/group

This post was inspired by Andrew Hill’s recent blog post.

"Inspired by some nice posts by @timoast and @tangming2005 and work from @10xGenomics. Would still definitely have to split BAM files for other tasks, so easy to use tools for that are super useful too!" (Andrew J Hill, @ahill_tweets, April 13, 2019)

Andrew wrote that blog post in light of my other recent blog post and Tim’s (developer of the almighty Seurat package) blog post.

A tale of two heatmap functions

You probably do not understand heatmaps! Please read "You probably don’t understand heatmaps" by Mick Watson. In that blog post, Mick used the heatmap function in the stats package. I will try to walk you through a comparison of heatmap and heatmap.2 from the gplots package. Before I start, I want to quote this: “The defaults of almost every heat map function in R does the hierarchical clustering first, then scales the rows then displays the image”
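To see why the order of clustering and scaling matters, here is a small base R sketch on a made-up matrix (the gene names and values are invented for illustration). It computes the row ordering twice: once on the raw values, which is what clustering before scaling amounts to, and once after scaling the rows first.

```r
set.seed(1)
# Toy matrix: 6 hypothetical genes x 4 samples, with rows on very different scales
m <- matrix(rnorm(24), nrow = 6,
            dimnames = list(paste0("gene", 1:6), paste0("s", 1:4)))
m[1:3, ] <- m[1:3, ] * 100

# Cluster first: row order from distances on the raw values
ord_raw <- hclust(dist(m))$order

# Scale first: z-score each row, then cluster on the scaled values
m_scaled <- t(scale(t(m)))
ord_scaled <- hclust(dist(m_scaled))$order

# Compare the two row orderings
rbind(raw = rownames(m)[ord_raw], scaled = rownames(m)[ord_scaled])
```

On the raw values, the distances are dominated by the rows with large magnitudes, so the dendrogram reflects absolute level rather than the pattern across samples. Passing a pre-scaled matrix together with scale = "none" is one way to make the displayed colors and the clustering agree.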