Choosing the Right Workflow#

WASP2 supports four major data types. Use this guide to find your workflow.

Data Type

Input

Goal

Start Here

Bulk RNA-seq

BAM + phased VCF

Allele-specific expression (ASE)

Bulk Workflow (RNA-seq / ATAC-seq)

Bulk ATAC-seq

BAM + phased VCF

Allele-specific chromatin accessibility

Bulk Workflow (RNA-seq / ATAC-seq)

scRNA-seq (10x)

Cell Ranger BAM + VCF + barcodes

Per-cell or per-cell-type ASE

Single-Cell Workflow (scRNA-seq / scATAC-seq)

scATAC-seq (10x)

Fragments/BAM + VCF + barcodes

Single-cell allelic imbalance in ATAC peaks

Single-Cell Workflow (scRNA-seq / scATAC-seq)

Decision Flowchart#

Step 1: What sequencing assay did you run?

  • RNA-seq → go to Step 2

  • ATAC-seq → go to Step 3

Step 2: Bulk or single-cell RNA-seq?

Step 3: Bulk or single-cell ATAC-seq?

Do I Need to Run the WASP Remapping Step?#

The remapping step (wasp2-map) corrects reference mapping bias — reads carrying the alternative allele are harder to map than reference-allele reads, causing false-positive imbalance signals.

You need remapping if:

  • Your BAM was aligned with a standard aligner (BWA-MEM, STAR, HISAT2, bowtie2)

  • You want the most rigorous allele-specific analysis

  • You are studying regions near known variants (high variant density)

You can skip remapping if:

  • Your BAM was already produced by an unbiased pipeline

  • You are doing a quick exploratory analysis

  • You are using simulated or controlled data

See Mapping Module for the full remapping workflow.

What VCF Do I Need?#

WASP2 requires a phased VCF with heterozygous variants for the sample(s) you are analyzing. Supported formats:

  • VCF/BCF (bgzip + tabix indexed)

  • PLINK2 PGEN files (with .pvar + .psam)

See Counting Module for VCF format requirements and examples using bcftools to subset and phase.

Nextflow Pipelines#

If you prefer a managed workflow with automatic parallelization, containerization, and output publishing, use the bundled Nextflow pipelines instead of the CLI:

  • nf-rnaseq — bulk RNA-seq allele-specific expression

  • nf-atacseq — bulk ATAC-seq allele-specific chromatin accessibility

  • nf-scatac — single-cell ATAC-seq allelic imbalance

  • nf-outrider — outlier expression detection with allele-aware correction

See the pipeline-specific documentation for samplesheet format and parameter reference.