Choosing the Right Workflow
WASP2 supports four major data types. Use this guide to find your workflow.
| Data Type | Input | Goal | Start Here |
|---|---|---|---|
| Bulk RNA-seq | BAM + phased VCF | Allele-specific expression (ASE) | |
| Bulk ATAC-seq | BAM + phased VCF | Allele-specific chromatin accessibility | |
| scRNA-seq (10x) | Cell Ranger BAM + VCF + barcodes | Per-cell or per-cell-type ASE | |
| scATAC-seq (10x) | Fragments/BAM + VCF + barcodes | Single-cell allelic imbalance in ATAC peaks | |
Decision Flowchart
Step 1: What sequencing assay did you run?
RNA-seq → go to Step 2
ATAC-seq → go to Step 3
Step 2: Bulk or single-cell RNA-seq?
Bulk RNA-seq → Bulk Workflow (RNA-seq / ATAC-seq)
10x Chromium scRNA-seq → Single-Cell Workflow (scRNA-seq / scATAC-seq)
Other single-cell protocol → see Single-Cell Analysis
Step 3: Bulk or single-cell ATAC-seq?
Bulk ATAC-seq → Bulk Workflow (RNA-seq / ATAC-seq) (use BED peak file as --region)
10x scATAC-seq → Single-Cell Workflow (scRNA-seq / scATAC-seq)
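The three steps above can be written as a small lookup function. This is only an illustrative sketch: `choose_workflow` and its parameters are hypothetical names and are not part of WASP2. The returned strings are the page titles used in the flowchart.

```python
def choose_workflow(assay: str, single_cell: bool, is_10x: bool = True) -> str:
    """Return the documentation page suggested by the decision flowchart.

    Hypothetical helper for illustration; not part of the WASP2 package.
    """
    assay = assay.strip().lower()
    if assay not in {"rna-seq", "atac-seq"}:
        raise ValueError(f"unsupported assay: {assay!r}")

    if not single_cell:
        # Bulk RNA-seq and bulk ATAC-seq share one workflow page.
        # For ATAC-seq, the flowchart adds: use a BED peak file as --region.
        return "Bulk Workflow (RNA-seq / ATAC-seq)"

    if is_10x:
        # 10x Chromium scRNA-seq and 10x scATAC-seq share one workflow page.
        return "Single-Cell Workflow (scRNA-seq / scATAC-seq)"

    if assay == "rna-seq":
        # Step 2 sends other single-cell RNA protocols to a separate page.
        return "Single-Cell Analysis"

    # The flowchart lists no route for non-10x single-cell ATAC-seq.
    raise ValueError("the flowchart does not cover non-10x scATAC-seq")


print(choose_workflow("ATAC-seq", single_cell=False))
```

The non-10x scATAC-seq case raises an error on purpose, because the flowchart does not name a destination for it.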
Do I Need to Run the WASP Remapping Step?
The remapping step (wasp2-map) corrects reference mapping bias: reads
carrying the alternative allele are harder to map than reference-allele reads,
which inflates reference-allele counts and can produce false-positive imbalance signals.
You need remapping if:
Your BAM was aligned with a standard aligner (BWA-MEM, STAR, HISAT2, bowtie2)
You want the most rigorous allele-specific analysis
You are studying regions near known variants (high variant density)
You can skip remapping if:
Your BAM was already produced by an unbiased pipeline
You are doing a quick exploratory analysis
You are using simulated or controlled data
See Mapping Module for the full remapping workflow.
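If you are unsure which aligner produced a BAM, the `@PG` lines in its header usually record it. The sketch below scans header text (for example, the output of `samtools view -H`) for the aligners named above. The function name and the prefix-matching rule are assumptions made for illustration, not WASP2 behavior, and a match is only a hint that remapping is worth running.

```python
from typing import List

# Aligners listed in this guide as "standard" (reference-biased by default).
STANDARD_ALIGNERS = ("bwa", "star", "hisat2", "bowtie2")


def aligners_in_header(header_text: str) -> List[str]:
    """Return the standard aligners named in the @PG lines of a SAM header.

    Hypothetical helper: matches on the PN (program name) tag, falling back
    to ID, using a simple case-insensitive prefix comparison.
    """
    found: List[str] = []
    for line in header_text.splitlines():
        if not line.startswith("@PG"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs.
        tags = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        name = (tags.get("PN") or tags.get("ID") or "").lower()
        for aligner in STANDARD_ALIGNERS:
            if name.startswith(aligner) and aligner not in found:
                found.append(aligner)
    return found


header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@PG\tID:bwa\tPN:bwa\tVN:0.7.17\tCL:bwa mem ref.fa r1.fq\n"
    "@PG\tID:samtools\tPN:samtools\n"
)
print(aligners_in_header(header))
```

An empty result does not prove the BAM is unbiased; headers can be stripped or rewritten by downstream tools.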
What VCF Do I Need?
WASP2 requires a phased VCF with heterozygous variants for the sample(s) you are analyzing. Supported formats:
VCF/BCF (bgzip + tabix indexed)
PLINK2 PGEN files (with .pvar + .psam)
See Counting Module for VCF format requirements and examples using bcftools to subset and phase.
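To make "phased and heterozygous" concrete: in the VCF format, a phased genotype separates alleles with `|` (for example `0|1`), while an unphased one uses `/`. The following sketch checks a single VCF data line for a phased heterozygous genotype. It is a plain-text illustration with a hypothetical function name, not the parser WASP2 uses.

```python
def is_phased_het(vcf_line: str, sample_index: int = 0) -> bool:
    """Return True if the chosen sample has a phased heterozygous GT.

    Hypothetical helper for illustration. Expects one tab-separated VCF
    data line with a FORMAT column followed by sample columns.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    # Columns 0-7 are fixed, column 8 is FORMAT, samples start at column 9.
    if len(fields) <= 9 + sample_index:
        return False
    format_keys = fields[8].split(":")
    if "GT" not in format_keys:
        return False
    sample_values = fields[9 + sample_index].split(":")
    gt = sample_values[format_keys.index("GT")]
    if "|" not in gt:
        return False  # "/" means the genotype is unphased
    alleles = gt.split("|")
    if "." in alleles:
        return False  # missing allele call
    return len(set(alleles)) > 1  # differing alleles means heterozygous


record = "chr1\t100\trs1\tA\tG\t50\tPASS\t.\tGT:DP\t0|1:12"
print(is_phased_het(record))
```

Records that fail this check (unphased, homozygous, or missing genotypes) would not satisfy the phased heterozygous requirement described above.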
Nextflow Pipelines
If you prefer a managed workflow with automatic parallelization, containerization, and output publishing, use the bundled Nextflow pipelines instead of the CLI:
nf-rnaseq — bulk RNA-seq allele-specific expression
nf-atacseq — bulk ATAC-seq allele-specific chromatin accessibility
nf-scatac — single-cell ATAC-seq allelic imbalance
nf-outrider — outlier expression detection with allele-aware correction
See the pipeline-specific documentation for samplesheet format and parameter reference.