use random forest and boost trees to find marker genes in scRNAseq data

This is a blog post for a series of posts on marker gene identification using machine learning methods. Read the previous posts: logistic regression and partial least square regression. This blog post will explore the tree based method: random forest and boost trees (gradient boost tree/XGboost). I highly recommend going through for related sections by Josh Starmer. Note, all the tree based methods can be used to do both classification and regression.

customize FeaturePlot in Seurat for multi-condition comparisons using patchwork

Seurat is great for scRNAseq analysis and it provides many easy-to-use ggplot2 wrappers for visualization. However, this brings the cost of flexibility. For example, In FeaturePlot, one can specify multiple genes and also to further split to multiple the conditions in the If is not NULL, the ncol is ignored so you can not arrange the grid. This is best to understand with an example. library(dplyr) library(Seurat) library(patchwork) library(ggplot2) # Load the PBMC dataset pbmc.