ChIP-seq workflow

The ChIP-seq workflow starts with raw FASTQ files and performs various QC steps. It aligns reads, prepares BAM and bigWig files, calls peaks, and assembles the results into a track hub for visualization.

Specifically, the workflow does the following:

  • trims reads with cutadapt

  • maps reads with Bowtie2

  • runs FastQC on raw, trimmed, and aligned reads

  • removes multimappers (samtools) and duplicates (Picard MarkDuplicates)

  • runs fastq_screen against multiple configured genomes to look for evidence of cross-contamination

  • aggregates QC results using MultiQC, along with a custom table of library sizes

  • merges technical replicates and then re-deduplicates them

  • creates bigWigs from unique, no-dups BAM files

  • optionally merges bigWigs to create one signal track for all replicates

  • runs deepTools plotFingerprint on grouped IP and input for QC and evaluation of enrichment

  • calls peaks using macs2, spp, and/or sicer; multiple peak-calling runs with different parameters are supported, to help assess performance and inform decisions for downstream analysis

  • optionally runs a template RMarkdown file for differential binding analysis with DiffBind

  • converts BED files into bigBed (or bigNarrowPeak where possible)

  • builds and optionally uploads a track hub of bigWigs and bigBeds to visualize peak-calling in the UCSC Genome Browser
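
The ordering implied by the steps above can be sketched as a small dependency graph. This is only an illustration, not the workflow's actual rule definitions: the step names are hypothetical labels, and the dependencies are an assumed reading of the list (for example, that fastq_screen runs on raw reads and that peaks are called on merged, deduplicated BAMs).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical step names. Each step maps to the set of steps it is
# ASSUMED to depend on, based on the list above.
STEPS = {
    "cutadapt": set(),
    "fastqc_raw": set(),
    "fastq_screen": set(),                      # assumed to run on raw reads
    "fastqc_trimmed": {"cutadapt"},
    "bowtie2": {"cutadapt"},
    "fastqc_aligned": {"bowtie2"},
    "remove_multimappers": {"bowtie2"},
    "mark_duplicates": {"remove_multimappers"},
    "merge_techreps": {"mark_duplicates"},      # merged BAMs are re-deduplicated
    "bigwig": {"merge_techreps"},
    "fingerprint": {"merge_techreps"},
    "call_peaks": {"merge_techreps"},
    "bigbed": {"call_peaks"},
    "multiqc": {
        "fastqc_raw", "fastqc_trimmed", "fastqc_aligned",
        "fastq_screen", "mark_duplicates",
    },
    "track_hub": {"bigwig", "bigbed"},
}


def run_order(steps):
    """Return one valid execution order in which every step follows its dependencies."""
    return list(TopologicalSorter(steps).static_order())


if __name__ == "__main__":
    for name in run_order(STEPS):
        print(name)
```

Steps with no dependency between them (such as FastQC on raw reads and trimming) can appear in either order, which mirrors how a workflow manager is free to schedule independent jobs in parallel.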

To configure a ChIP-seq experiment, see Config YAML.

_images/chipseq.png