    Selected Publications

    . Monkeybread: A Python toolkit for the analysis of cellular niches in single-cell resolution spatial transcriptomics data. In bioRxiv, 2023.

    Preprint

    . The histologic phenotype of lung cancers may be driven by transcriptomic features rather than genomic characteristics. In Nature Communications, 2021.

    Preprint Paper link

    . Reprogramming of bivalent chromatin states in NRAS mutant melanoma suggests PRC2 inhibition as a therapeutic strategy. In Cell Reports, 2021.

    PDF Code

    . HieRFIT: A hierarchical cell type classification tool for projections from complex single-cell atlas datasets. In Bioinformatics, 2021.

    Preprint Code Paper Link

    . CHIPS: A Snakemake pipeline for quality control and reproducible processing of chromatin profiling data. In F1000Research, 2021.

    Preprint Code link to F1000

    . Evaluating single-cell cluster stability using the Jaccard similarity index. In Bioinformatics, 2020.

    Preprint Code Dataset Paper Link

    . Enhancer Reprogramming Confers Dependence on Glycolysis and IGF Signaling in KMT2D Mutant Melanoma. In Cell Reports, 2020.

    PDF Code

    . Integrative analyses of single-cell transcriptome and regulome using MAESTRO. In Genome Biology, 2020.

    PDF Code

    . KMT2D Deficiency Impairs Super-Enhancers to Confer a Glycolytic Vulnerability in Lung Cancer. In Cancer Cell, 2020. Highlighted in Science Signaling: Tumor’s loss is clinician’s gain.

    PDF Code

    . Fast analysis of scATAC-seq data using a predefined set of genomic regions. In F1000Research, 2020.

    PDF Code Source Document

    . Integrative Analysis Identifies Four Molecular and Clinical Subsets in Uveal Melanoma. In Cancer Cell, 2017.

    PDF

    . ChIP-seq analysis Book chapter for Biostar Handbook. In Biostar Handbook, 2017.

    PDF

    . Systematic analysis of telomere length and somatic alterations in 31 cancer types. In Nature Genetics, 2017.

    PDF

    . A Molecular Take on Malignant Rhabdoid Tumors. In Trends in Cancer, 2016.

    PDF

    Recent Publications

    . A community challenge to predict clinical outcomes after immune checkpoint blockade in non-small cell lung cancer. In Journal of Translational Medicine, 2024.

    Paper Link

    . Monkeybread: A Python toolkit for the analysis of cellular niches in single-cell resolution spatial transcriptomics data. In bioRxiv, 2023.

    Preprint

    . The tidyomics ecosystem: Enhancing omic data analyses. In bioRxiv, 2023.

    Preprint

    . Comprehensive Characterizations of Immune Receptor Repertoire in Tumors and Cancer Immunotherapy Studies. In Cancer Immunology Research, 2022.

    Paper Link

    . Immunogenomic intertumor heterogeneity across primary and metastatic sites in a patient with lung adenocarcinoma. In Journal of Experimental & Clinical Cancer Research, 2022.

    Paper Link

    . The histologic phenotype of lung cancers may be driven by transcriptomic features rather than genomic characteristics. In Nature Communications, 2021.

    Preprint Paper link

    . Fast alignment and preprocessing of chromatin profiles with Chromap. In Nature Communications, 2021.

    Paper Link

    . Enhancer reprogramming in PRC2-deficient malignant peripheral nerve sheath tumors induces a targetable de-differentiated state. In Acta Neuropathologica, 2021.

    Pubmed link

    . Reprogramming of bivalent chromatin states in NRAS mutant melanoma suggests PRC2 inhibition as a therapeutic strategy. In Cell Reports, 2021.

    PDF Code

    . HieRFIT: A hierarchical cell type classification tool for projections from complex single-cell atlas datasets. In Bioinformatics, 2021.

    Preprint Code Paper Link

    . CHIPS: A Snakemake pipeline for quality control and reproducible processing of chromatin profiling data. In F1000Research, 2021.

    Preprint Code link to F1000

    . Evaluating single-cell cluster stability using the Jaccard similarity index. In Bioinformatics, 2020.

    Preprint Code Dataset Paper Link

    . Enhancer Reprogramming Confers Dependence on Glycolysis and IGF Signaling in KMT2D Mutant Melanoma. In Cell Reports, 2020.

    PDF Code

    . Integrative analyses of single-cell transcriptome and regulome using MAESTRO. In Genome Biology, 2020.

    PDF Code

    . KMT2D Deficiency Impairs Super-Enhancers to Confer a Glycolytic Vulnerability in Lung Cancer. In Cancer Cell, 2020. Highlighted in Science Signaling: Tumor’s loss is clinician’s gain.

    PDF Code

    . Fast analysis of scATAC-seq data using a predefined set of genomic regions. In F1000Research, 2020.

    PDF Code Source Document

    . LATS kinase–mediated CTCF phosphorylation and selective loss of genomic binding. In Science Advances, 2020.

    PDF

    . KRAS-IRF2 Axis Drives Immune Suppression and Immune Therapy Resistance in Colorectal Cancer. In Cancer Cell, 2019.

    PDF

    . The Tandem Duplicator Phenotype Is a Prevalent Genome-Wide Cancer Configuration Driven by Distinct Gene Mutations. In Cancer Cell, 2018.

    PDF

    . TumorFusions: an integrative resource for cancer-associated transcript fusions. In Nucleic Acids Research, 2018.

    PDF

    . Integrative Analysis Identifies Four Molecular and Clinical Subsets in Uveal Melanoma. In Cancer Cell, 2017.

    PDF

    . ChIP-seq analysis Book chapter for Biostar Handbook. In Biostar Handbook, 2017.

    PDF

    . Systematic analysis of telomere length and somatic alterations in 31 cancer types. In Nature Genetics, 2017.

    PDF

    . A Molecular Take on Malignant Rhabdoid Tumors. In Trends in Cancer, 2016.

    PDF

    . TRIM29 Suppresses TWIST1 and Invasive Breast Cancer Behavior. In Cancer Research, 2014.

    PDF

    Talks & Teachings

    Single-cell RNAseq data integration: methods and challenges
    Jan 24, 2024 11:00 AM
    Workshop: Charting human biology using AI for Precision Health & Precision Medicine
    Nov 16, 2023 8:30 AM
    Navigating Omics Data Analysis Challenges in Biotech
    Nov 9, 2023 10:00 AM
    Drug Discovery Biology Strategy Meeting
    Nov 7, 2023 3:30 PM
    Panel Discussion on GPT & Drug Discovery: Rise of Generative Models
    Oct 26, 2023 2:00 PM
    How has computational biology impacted the field of immuno-oncology
    Oct 25, 2023 12:00 PM
    Unleashing the power of computational biology
    Oct 11, 2023 12:00 PM
    Pathfinder podcast
    Oct 10, 2023 12:00 PM
    How To Switch From Wet-Lab To Dry-Lab? Tips for single cell scientists
    Sep 2, 2023 12:00 PM
    Dissecting myeloid and T cells interaction niches in the TME using spatial transcriptome
    Jul 18, 2023 9:00 AM
    Patient Participation: The Unsung Hero of Drug Development
    Jul 7, 2023 12:00 PM
    Single-cell analysis: best practices and unsolved problems
    Jun 9, 2023 3:15 AM
    Learn Computational Biology the Right Way
    Mar 18, 2023 11:15 AM
    Reproducible genomic data science
    Jul 9, 2022 2:30 PM
    Single-Cell Analysis 101: Current Approaches and Open Questions
    Mar 29, 2022 12:30 PM
    Annual postdoc alternative careers event, King's College London
    Mar 9, 2022 6:00 AM
    Career mentoring for Department of Genomic Medicine, MD Anderson Cancer Center
    Jan 3, 2022 11:00 AM
    Reproducible research in genomic data science
    Sep 12, 2021 6:00 AM
    Fast analysis of scATAC-seq data using a predefined set of genomic regions
    Apr 23, 2021 10:00 AM
    STAT115 single cell ATACseq lecture
    Mar 30, 2021 12:00 PM
    Reproducible computing for your own benefit
    Jun 2, 2020 11:00 AM
    Tools and tricks for a data scientist
    Mar 9, 2020 11:00 AM
    Harvard FAS informatics nanocourse
    Aug 19, 2019 10:40 AM
    Reproducible research in bioinformatics
    Mar 28, 2019 2:30 PM
    2018 Next-Gen Sequence Analysis Workshop
    Jul 10, 2018 10:00 AM
    GS01 1143 Introduction to Bioinformatics course
    Nov 2, 2017 9:00 AM
    University of Miami 2-day Software Carpentry workshop
    Apr 2, 2015 9:00 AM

    Projects

    CANCER IMMUNOLOGIC DATA COMMONS (CIDC)

    The Cancer Immunologic Data Commons (CIDC), hosted by Dana-Farber Cancer Institute, will serve the bioinformatics needs of the network: optimizing data collection methodologies suitable for immune-related biomarkers, integrating data, and building a biomarker database for secondary use by the broader immuno-oncology community.

    Snakemake pipeline post-processing scATAC-seq

    A Snakemake pipeline that splits scATAC-seq BAM files by cluster, makes bigWig tracks, calls peaks, and recounts.

    Evaluating single cell RNAseq cluster stability

    An R package for evaluating and visualizing scRNA-seq cluster stability

    Snakemake Pipelines

    Snakemake pipelines for processing high-throughput sequencing data

    Newsletter

    Join my FREE newsletter here to learn bioinformatics and computational biology.