sarek: Parameters

Define where the pipeline should find input data and save output data.

Starting step

required

type: string

Path to comma-separated file containing information about the samples in the experiment.

type: string

pattern: \.csv$

The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.

required

type: string
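The input and output settings above can be collected in a parameters file rather than typed on the command line. The following is a minimal sketch, not the pipeline's own tooling: the parameter names `input` and `outdir`, the helper `write_params_file`, and the file name `params.json` are assumptions for illustration, since this page lists only the descriptions. The samplesheet check reuses the `\.csv$` pattern given above.

```python
import json
import re
from pathlib import Path

CSV_PATTERN = re.compile(r"\.csv$")  # pattern listed for the samplesheet path


def write_params_file(samplesheet: str, outdir: str, dest: str = "params.json") -> Path:
    """Write a JSON parameters file after basic sanity checks (hypothetical helper)."""
    if not CSV_PATTERN.search(samplesheet):
        raise ValueError(f"samplesheet must end in .csv: {samplesheet}")
    if not outdir:
        raise ValueError("an output directory is required")
    # "input" and "outdir" are assumed parameter names, not taken from this page.
    params = {"input": samplesheet, "outdir": outdir}
    path = Path(dest)
    path.write_text(json.dumps(params, indent=2) + "\n")
    return path
```

A file written this way could then be handed to Nextflow through its `-params-file` option.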

Most common options used for the pipeline

Specify how many reads each split of a FastQ file contains. Set to 0 to turn off splitting altogether.

type: integer

default: 50000000
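To get a feel for what the split size means in practice, the sketch below estimates how many chunks a FastQ file would be divided into. It assumes that reads are distributed by simple ceiling division, which may differ from what the underlying splitting tool does; the function name `estimated_chunks` is hypothetical.

```python
def estimated_chunks(total_reads: int, reads_per_split: int = 50_000_000) -> int:
    """Rough number of FastQ chunks for a given read count (illustrative only)."""
    if total_reads < 0 or reads_per_split < 0:
        raise ValueError("read counts must be non-negative")
    if reads_per_split == 0:
        return 1  # 0 disables splitting, so the file stays in one piece
    # Ceiling division without floating point; always at least one chunk.
    return max(1, -(-total_reads // reads_per_split))
```

With the default of 50000000 reads per split, a file of 120 million reads would come out as three chunks under this assumption.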

Enable when exome or panel data is provided.

type: boolean

Path to a target BED file (for whole exome or targeted sequencing) or to an intervals file.

type: string

Estimate interval size.

type: number

default: 1000

Disable usage of intervals.

type: boolean

Tools to use for variant calling and/or for annotation.

type: string

pattern:

^((ascat|cnvkit|controlfreec|deepvariant|freebayes|haplotypecaller|manta|merge|mpileup|msisensorpro|mutect2|snpeff|strelka|tiddit|vep)?,?)*[^,]+$
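The pattern above can be tried out before launching a run. The sketch below rebuilds it from the listed tool names and also reports any names that are not in the list. This second check is an addition for illustration: as written, the trailing `[^,]+` in the pattern accepts any comma-free text as the last item, so the pattern alone does not catch a misspelled final tool. The helper name `check_tools` is hypothetical.

```python
import re

# Tool names copied from the pattern above, in the same order.
TOOLS = (
    "ascat", "cnvkit", "controlfreec", "deepvariant", "freebayes",
    "haplotypecaller", "manta", "merge", "mpileup", "msisensorpro",
    "mutect2", "snpeff", "strelka", "tiddit", "vep",
)
TOOLS_PATTERN = re.compile(r"^((" + "|".join(TOOLS) + r")?,?)*[^,]+$")


def check_tools(value: str) -> tuple[bool, list[str]]:
    """Return (matches_pattern, names_not_in_TOOLS) for a comma-separated value."""
    matches = TOOLS_PATTERN.search(value) is not None
    unknown = [name for name in value.split(",") if name not in TOOLS]
    return matches, unknown
```

For example, `check_tools("strelka,mutect2")` reports a match with no unknown names, while a value with a trailing comma such as `"strelka,"` fails the pattern.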

Disable specified tools.

type: string

pattern:

^((baserecalibrator|baserecalibrator_report|bcftools|documentation|fastqc|markduplicates|markduplicates_report|mosdepth|multiqc|samtools|vcftools|versions)?,?)*[^,]+$

Trim fastq file or handle UMIs

Run FastP for read trimming

hidden

type: boolean

Remove bp from the 5’ end of read 1

hidden

type: integer

Remove bp from the 5’ end of read 2

hidden

type: integer

Remove bp from the 3’ end of read 1

hidden

type: integer

Remove bp from the 3’ end of read 2

hidden

type: integer

Remove poly-G tails.

hidden

type: integer

Save trimmed FastQ file intermediates.

hidden

type: boolean

Specify UMI read structure

hidden

type: string

Default strategy with UMI

hidden

type: string

default: Adjacency

If set, publishes split FASTQ files. Intended for testing purposes.

hidden

type: boolean

Configure preprocessing tools

Specify aligner to be used to map reads to reference genome.

hidden

type: string

Save mapped BAMs.

type: boolean

Save output from MarkDuplicates and base recalibration as BAM files instead of CRAM.

type: boolean

Enable usage of GATK Spark implementation for duplicate marking and/or base quality score recalibration

type: string

pattern: ^((baserecalibrator|markduplicates)?,?)*[^,]+$

Configure variant calling tools

If true, skips germline variant calling for matched normal to tumor sample. Normal samples without matched tumor will still be processed through germline variant calling tools.

type: boolean

Turn on the joint germline variant calling for GATK haplotypecaller

type: boolean

Overwrite ASCAT minimum base quality required for a read to be counted.

hidden

type: number

default: 20

Overwrite ASCAT minimum depth required in the normal for a SNP to be considered.

hidden

type: number

default: 10

Overwrite ASCAT minimum mapping quality required for a read to be counted.

hidden

type: number

default: 35

Overwrite ASCAT ploidy.

hidden

type: number

Overwrite ASCAT purity.

hidden

type: number

Specify a custom chromosome length file.

hidden

type: string

Overwrite Control-FREEC coefficientOfVariation

hidden

type: number

default: 0.05

Overwrite Control-FREEC contaminationAdjustment

hidden

type: boolean

Specify the known contamination value for Control-FREEC

hidden

type: number

Minimal sequencing quality for a position to be considered in BAF analysis.

hidden

type: number

Minimal read coverage for a position to be considered in BAF analysis.

hidden

type: number

Genome ploidy used by Control-FREEC

hidden

type: string

default: 2

Overwrite Control-FREEC window size.

hidden

type: number

Panel-of-normals VCF (bgzipped) for GATK Mutect2

hidden

type: string

Index of PON panel-of-normals VCF.

hidden

type: string

Do not analyze soft clipped bases in the reads for GATK Mutect2.

hidden

type: boolean

Allow usage of fasta file for annotation with VEP

hidden

type: boolean

Enable the use of the VEP dbNSFP plugin.

hidden

type: boolean

Path to dbNSFP processed file.

hidden

type: string

Path to dbNSFP tabix indexed file.

hidden

type: string

Consequence to annotate with

hidden

type: string

Fields to annotate with

hidden

type: string

default: rs_dbSNP,HGVSc_VEP,HGVSp_VEP,1000Gp3_EAS_AF,1000Gp3_AMR_AF,LRT_score,GERP++_RS,gnomAD_exomes_AF

Enable the use of the VEP LOFTEE plugin.

hidden

type: boolean

Enable the use of the VEP SpliceAI plugin.

hidden

type: boolean

Path to spliceai raw scores snv file.

hidden

type: string

Path to spliceai raw scores snv tabix indexed file.

hidden

type: string

Path to spliceai raw scores indel file.

hidden

type: string

Path to spliceai raw scores indel tabix indexed file.

hidden

type: string

Enable the use of the VEP SpliceRegion plugin.

hidden

type: boolean

Path to snpEff cache.

hidden

type: string

Path to VEP cache.

hidden

type: string

VEP output-file format.

hidden

type: string

Reference genome related files and options required for the workflow.

Name of iGenomes reference.

type: string

default: GATK.GRCh38

ASCAT genome.

hidden

type: string

Path to ASCAT allele zip file.

hidden

type: string

Path to ASCAT loci zip file.

hidden

type: string

Path to ASCAT GC content correction file.

hidden

type: string

Path to ASCAT RT (replication timing) correction file.

hidden

type: string

Path to BWA mem indices.

hidden

type: string

Path to bwa-mem2 mem indices.

hidden

type: string

Path to chromosomes folder used with Control-FREEC.

hidden

type: string

Path to dbsnp file.

hidden

type: string

Path to dbsnp index.

hidden

type: string

label string for VariantRecalibration (haplotypecaller joint variant calling)

type: string

Path to FASTA dictionary file.

hidden

type: string

Path to dragmap indices.

hidden

type: string

Path to FASTA genome file.

type: string

pattern: \.fn?a(sta)?(\.gz)?$
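The file-name pattern above accepts several common FASTA extensions, optionally gzipped. A minimal sketch for checking a path against it follows; the helper name `is_fasta_path` and the example file names are assumptions for illustration.

```python
import re

# Pattern copied from the FASTA genome file entry above.
FASTA_PATTERN = re.compile(r"\.fn?a(sta)?(\.gz)?$")


def is_fasta_path(path: str) -> bool:
    """True if the path ends in .fa, .fna, .fasta or .fnasta, optionally with .gz."""
    return FASTA_PATTERN.search(path) is not None
```

Under this pattern `genome.fa.gz` and `genome.fna` pass, while `reads.fastq` does not.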

Path to FASTA reference index.

type: string

Path to GATK Mutect2 Germline Resource File.

hidden

type: string

Path to GATK Mutect2 Germline Resource Index.

hidden

type: string

Path to known indels file.

hidden

type: string

Path to known indels file index.

hidden

type: string

If you use AWS iGenomes, this has already been set for you appropriately.

1st label string for VariantRecalibration (haplotypecaller joint variant calling)

type: string

If you use AWS iGenomes, this has already been set for you appropriately.

Path to known snps file.

type: string

Path to known snps file index.

type: string

If you use AWS iGenomes, this has already been set for you appropriately.

label string for VariantRecalibration (haplotypecaller joint variant calling)

type: string

Path to Control-FREEC mappability file.

hidden

type: string

snpEff DB version.

hidden

type: string

snpEff genome.

hidden

type: string

snpEff version.

hidden

type: string

VEP genome.

hidden

type: string

VEP species.

hidden

type: string

VEP cache version.

hidden

type: number

VEP version.

hidden

type: string

Save built references.

type: boolean

Directory / URL base for iGenomes references.

type: string

default: s3://ngi-igenomes/igenomes/

Do not load the iGenomes reference config.

type: boolean

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden

type: string

default: master

Base directory for Institutional configs.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.

hidden

type: string

Institutional config description.

hidden

type: string

Institutional config contact information.

hidden

type: string

Institutional config URL link.

hidden

type: string

Sequencing center information to be added to read group (CN field).

hidden

type: string

Sequencing platform information to be added to read group (PL field).

hidden

type: string

default: ILLUMINA

Set the top limit for requested resources for any single job.

Maximum number of CPUs that can be requested for any single job.

hidden

type: integer

default: 16

Maximum amount of memory that can be requested for any single job.

hidden

type: string

default: 128.GB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$

Maximum amount of time that can be requested for any single job.

hidden

type: string

default: 240.h

pattern: ^(\d+\.?\s*(s|m|h|day)\s*)+$
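The memory and time limits above are strings that must match the listed patterns, which is easy to get wrong by omitting a unit. The sketch below checks candidate values against both patterns; the function names `valid_memory` and `valid_time` are hypothetical, and the check only covers the format, not whether the value suits a given cluster.

```python
import re

# Patterns copied from the maximum memory and maximum time entries above.
MEMORY_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")
TIME_PATTERN = re.compile(r"^(\d+\.?\s*(s|m|h|day)\s*)+$")


def valid_memory(value: str) -> bool:
    """True for values such as '128.GB' or '1.5 GB'; the trailing 'B' is required."""
    return MEMORY_PATTERN.search(value) is not None


def valid_time(value: str) -> bool:
    """True for values such as '240.h' or '1day 12h'; every number needs a unit."""
    return TIME_PATTERN.search(value) is not None
```

Both defaults, `128.GB` and `240.h`, pass these checks, whereas `128G` (no `B`) and `240` (no unit) do not.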

Less common options for the pipeline, typically set in a config file.

Display help text.

hidden

type: boolean

Method used to save pipeline results to output directory.

hidden

type: string

Email address for completion summary.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
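The email pattern above is fairly strict about the final domain label, which it limits to two to five letters. The sketch below, with the hypothetical helper `valid_email` and made-up addresses, shows how an address can be checked against it before it is passed to the pipeline.

```python
import re

# Pattern copied from the completion-summary email entry above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def valid_email(address: str) -> bool:
    """True if the address matches the listed pattern."""
    return EMAIL_PATTERN.search(address) is not None
```

An address like `user@example.org` passes, while one ending in a top-level domain longer than five letters, such as `user@lab.technology`, is rejected by this pattern even though it could be a deliverable address.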

Email address for completion summary, only when pipeline fails.

hidden

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden

type: boolean

File size limit when attaching MultiQC reports to summary emails.

hidden

type: string

default: 25.MB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$

Do not use coloured log outputs.

hidden

type: boolean

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

type: string

Custom config file to supply to MultiQC.

hidden

type: string

Directory to keep pipeline Nextflow logs and reports.

hidden

type: string

default: ${params.outdir}/pipeline_info

Whether to validate parameters against the schema at runtime.

hidden

type: boolean

default: true

Show all params when using --help

hidden

type: boolean

Run this workflow with Conda. You can also use ‘-profile conda’ instead of providing this parameter.

hidden

type: boolean

nf-core/sarek