nf-core/taxprofiler
Highly parallelised multi-taxonomic profiling of shotgun short- and long-read metagenomic data
Version 1.0.0. The latest stable release is 1.2.3.
Define where the pipeline should find input data and save output data.
Path to comma-separated file containing information about the samples and libraries/runs.
string
^\S+\.(csv)$
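The pattern above can be checked before launching a run. The following is a minimal Python sketch and not part of the pipeline: the helper name `is_valid_samplesheet_path` is hypothetical, and Python's `re` module may differ in small ways from the validator the pipeline itself uses.

```python
import re

# Pattern copied verbatim from the parameter description above.
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.(csv)$")


def is_valid_samplesheet_path(path: str) -> bool:
    """Return True if the whole path contains no whitespace and ends in .csv."""
    # fullmatch is used so that a trailing newline is not silently accepted.
    return SAMPLESHEET_PATTERN.fullmatch(path) is not None


if __name__ == "__main__":
    for candidate in ["samplesheet.csv", "sample sheet.csv", "samplesheet.tsv"]:
        print(candidate, is_valid_samplesheet_path(candidate))
```

Note that the pattern rejects any path containing whitespace, so a directory name with a space in it fails validation even when the file exists.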
Path to comma-separated file containing information about databases and profiling parameters for each taxonomic profiler
string
The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.
string
Email address for completion summary.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
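To see which addresses the pattern above accepts, here is a small Python sketch. The function name `is_valid_email` is hypothetical, and the check only mirrors the documented pattern; it is not a general-purpose email validator.

```python
import re

# Pattern copied verbatim from the parameter description above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_valid_email(address: str) -> bool:
    """Return True if the whole address matches the documented pattern."""
    return EMAIL_PATTERN.fullmatch(address) is not None


if __name__ == "__main__":
    for candidate in ["user@example.com", "user+tag@example.com", "user@example"]:
        print(candidate, is_valid_email(candidate))
```

As written, the pattern rejects plus-addressing (`user+tag@...`) and top-level domains longer than five letters, so such addresses would need an alternative before being passed to the pipeline.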
MultiQC report title. Printed as page header, used for filename if not otherwise specified.
string
Common options across both long and short read preprocessing QC steps
Specify the tool used for quality control of raw sequencing reads
string
Save reads from samples that went through the adapter clipping, pair-merging, and length filtering steps for both short and long reads
boolean
Options for adapter clipping, quality trimming, pair-merging, and complexity filtering
Turns on short read quality control steps (adapter clipping, complexity filtering etc.)
boolean
Specify which tool to use for short-read QC
string
Skip adapter trimming
boolean
Specify adapter 1 nucleotide sequence
string
None
Specify adapter 2 nucleotide sequence
string
None
Specify a list of all possible adapters to trim. Overrides --shortread_qc_adapter1/2. Formats: .txt (AdapterRemoval) or .fasta (fastp).
string
None
Turn on merging of read pairs for paired-end data
boolean
Include unmerged reads from paired-end merging in the downstream analysis
boolean
Specify the minimum length of reads to be retained
integer
15
Turns on nucleotide sequence complexity filtering
boolean
Specify which tool to use for complexity filtering
string
Specify the minimum sequence entropy level for complexity filtering
number
0.3
Specify the window size for BBDuk complexity filtering
integer
50
Turn on masking rather than discarding of low-complexity reads for BBDuk
boolean
Specify the minimum complexity filter threshold of fastp
integer
30
Specify the complexity filter mode for PRINSEQ++
string
Specify the minimum dust score for PRINSEQ++ complexity filtering
number
0.5
Save reads from samples that went through the complexity filtering step
boolean
Options for adapter clipping, quality trimming, and length filtering
Turns on long read quality control steps (adapter clipping, length filtering etc.)
boolean
Skip long-read trimming
boolean
Skip long-read length and quality filtering
boolean
Specify the minimum length of reads to be retained
integer
1000
Specify the percent of high-quality bases to be retained
integer
90
Specify the number of high-quality bases in the library to be retained
integer
500000000
Options for pre-profiling host read removal
Turn on short-read host removal
boolean
Turn on long-read host removal
boolean
Specify path to a single reference FASTA of the host genome(s)
string
None
Specify path to the directory containing pre-made Bowtie2 indexes of the host removal reference
string
None
Specify path to a pre-made Minimap2 index file (.mmi) of the host removal reference
string
None
Save mapping index of input reference when not already supplied by user
boolean
Save mapped and unmapped reads in BAM format from host removal
boolean
Save reads from samples that went through the host-removal step
boolean
Options for per-sample run-merging
Turn on run merging
boolean
Save reads from samples that went through the run-merging step
boolean
Turn on profiling with Centrifuge. Requires the database to be present in the CSV file passed to --databases
boolean
Turn on saving of Centrifuge-aligned reads
boolean
Turn on profiling with DIAMOND. Requires the database to be present in the CSV file passed to --databases
boolean
Specify output format from DIAMOND profiling.
string
Turn on saving of DIAMOND-aligned reads. Overrides --diamond_output_format, and no taxon tables will be generated
boolean
Turn on profiling with Kaiju. Requires the database to be present in the CSV file passed to --databases
boolean
Specify taxonomic rank to be displayed in Kaiju taxon table
string
Turn on profiling with Kraken2. Requires the database to be present in the CSV file passed to --databases
boolean
Turn on saving of Kraken2-aligned reads
boolean
Turn on saving of Kraken2 per-read taxonomic assignment file
boolean
Turn on saving of minimizer information in the Kraken2 report, which increases it to an eight-column layout.
boolean
Turn on profiling with KrakenUniq. Requires the database to be present in the CSV file passed to --databases
boolean
Turn on saving of KrakenUniq-aligned reads
boolean
Specify the chunk size to use when loading the database into memory for KrakenUniq
string
16G
Turn on saving of KrakenUniq per-read taxonomic assignment file
boolean
Turn on Bracken (and the required Kraken2 prerequisite step).
boolean
Turn on profiling with MALT. Requires the database to be present in the CSV file passed to --databases
boolean
Specify which MALT alignment mode to use
string
BlastN
Turn on saving of MALT-aligned reads
boolean
Turn on generation of MEGAN summary file from MALT results
boolean
Turn on profiling with MetaPhlAn3. Requires the database to be present in the CSV file passed to --databases
boolean
Turn on profiling with mOTUs. Requires the database to be present in the CSV file passed to --databases
boolean
Turn on printing relative abundance instead of counts.
boolean
Turn on saving of the mgc read counts.
boolean
Turn on removing NCBI taxonomic IDs.
boolean
Turn on standardisation of taxon tables across profilers
boolean
Turn on generation of BIOM output (currently only applies to mOTUs)
boolean
Turn on generation of Krona plots for supported profilers
boolean
Specify path to Krona taxonomy directories (required for MALT Krona plots)
string
None
The desired output format.
string
The path to a directory containing taxdump files.
string
Add the taxon name to the output.
boolean
Add the taxon rank to the output.
boolean
Add the taxon’s entire name lineage to the output.
boolean
Add the taxon’s entire ID lineage to the output.
boolean
Parameters used to describe centralised config profiles. These should not be edited.
Git commit id for Institutional configs.
string
master
Base directory for Institutional configs.
string
https://raw.githubusercontent.com/nf-core/configs/master
Institutional config name.
string
Institutional config description.
string
Institutional config contact information.
string
Institutional config URL link.
string
Set the top limit for requested resources for any single job.
Maximum number of CPUs that can be requested for any single job.
integer
16
Maximum amount of memory that can be requested for any single job.
string
128.GB
^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
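The memory pattern above accepts several spellings of the same value. The sketch below, with the hypothetical helper `is_valid_memory`, tries a few strings against it in Python; it only illustrates the documented pattern and assumes Python's `re` behaves like the pipeline's validator for these inputs.

```python
import re

# Pattern copied verbatim from the parameter description above.
MEMORY_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")


def is_valid_memory(value: str) -> bool:
    """Return True if the whole string matches the documented memory pattern."""
    return MEMORY_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["128.GB", "128 GB", "1.5GB", "128G"]:
        print(candidate, is_valid_memory(candidate))
```

The trailing `B` is mandatory in the pattern, so a value such as `128G` is rejected while `128.GB`, `128 GB`, and `128GB` are all accepted.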
Maximum amount of time that can be requested for any single job.
string
240.h
^(\d+\.?\s*(s|m|h|day)\s*)+$
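The time pattern above allows one or more number-plus-unit pairs. As a rough illustration, the Python sketch below checks a few candidate values; `is_valid_time` is a hypothetical name, and the check mirrors only the documented pattern.

```python
import re

# Pattern copied verbatim from the parameter description above.
TIME_PATTERN = re.compile(r"^(\d+\.?\s*(s|m|h|day)\s*)+$")


def is_valid_time(value: str) -> bool:
    """Return True if the whole string matches the documented time pattern."""
    return TIME_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["240.h", "1day 2h", "10d", "2.5h"]:
        print(candidate, is_valid_time(candidate))
```

Two consequences of the pattern are worth noting: the only accepted units are `s`, `m`, `h`, and `day` (so `10d` fails), and fractional values such as `2.5h` are rejected because a digit may not follow the optional dot.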
Less common options for the pipeline, typically set in a config file.
Display help text.
boolean
Display version and exit.
boolean
Method used to save pipeline results to output directory.
string
Email address for completion summary, only when pipeline fails.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
Send plain-text email instead of HTML.
boolean
File size limit when attaching MultiQC reports to summary emails.
string
25.MB
^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
Do not use coloured log outputs.
boolean
Incoming hook URL for messaging service
string
Custom config file to supply to MultiQC.
string
Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file
string
Custom MultiQC yaml file containing HTML including a methods description.
string
Directory to keep pipeline Nextflow logs and reports.
string
${params.outdir}/pipeline_info
Whether to validate parameters against the schema at runtime
boolean
true
Show all params when using --help
boolean
Reference genome related files and options required for the workflow.
Name of iGenomes reference.
string
Directory / URL base for iGenomes references.
string
s3://ngi-igenomes/igenomes
Do not load the iGenomes reference config.
boolean