nf-core/funcscan
(Meta-)genome screening for functional and natural product gene sequences
Version 1.1.4. The latest stable release is 3.0.0.
Define where the pipeline should find input data and save output data.
Path to comma-separated file containing sample names and paths to corresponding FASTA files.
Type: string; pattern: ^\S+\.csv$
The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.
Type: string
Email address for completion summary.
Type: string; pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
MultiQC report title. Printed as page header, used for filename if not otherwise specified.
Type: string

These parameters influence which workflow (ARG, AMP and/or BGC) to activate.
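As a quick pre-flight check for the input and email options described earlier, the two patterns from the listing can be applied to candidate values before a run is launched. The sketch below is only an illustration: the function names are hypothetical and not part of the pipeline, and it assumes Python's `re` module interprets the patterns the same way the pipeline's schema validator does.

```python
import re

# Patterns copied verbatim from the parameter listing above.
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.csv$")
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_valid_samplesheet_path(path: str) -> bool:
    """Return True if the path has no whitespace and ends in .csv."""
    return SAMPLESHEET_PATTERN.match(path) is not None


def is_valid_email(address: str) -> bool:
    """Return True if the address fits the listed email pattern."""
    return EMAIL_PATTERN.match(address) is not None
```

Note that the samplesheet pattern rejects any path containing whitespace, and the email pattern only accepts a final domain label of two to five letters.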
Activate antimicrobial peptide screening tools.
Type: boolean
Activate antimicrobial resistance gene screening tools.
Type: boolean
Activate biosynthetic gene cluster screening tools.
Type: boolean

These options influence the generation of annotation files required for downstream steps in ARG, AMP, and BGC workflows.
Specify which annotation tool to use for some downstream tools.
Type: string
Specify whether to save gene annotations in the results directory.
Type: boolean

These parameters influence the bacterial annotation algorithm used by Bakta.
Specify a path to the Bakta database.
Type: string
Download full or light version of the Bakta database if not supplying own database.
Type: string
Specify the minimum contig size.
Type: integer; default: 1
Specify the genetic code translation table.
Type: integer; default: 11
Specify the type of bacteria to be annotated to detect signal peptides.
Type: string
Specify that all contigs are complete replicons.
Type: boolean
Changes the original contig headers.
Type: boolean
Clean the result annotations to standardise them to GenBank/ENA conventions.
Type: boolean
Activate tRNA detection & annotation.
Type: boolean
Activate tmRNA detection & annotation.
Type: boolean
Activate rRNA detection & annotation.
Type: boolean
Activate ncRNA detection & annotation.
Type: boolean
Activate ncRNA region detection & annotation.
Type: boolean
Activate CRISPR array detection & annotation.
Type: boolean
Skip CDS detection & annotation.
Type: boolean
Activate pseudogene detection & annotation.
Type: boolean
Skip sORF detection & annotation.
Type: boolean
Activate gap detection & annotation.
Type: boolean
Activate oriC/oriT detection & annotation.
Type: boolean
Activate generation of circular genome plots.
Type: boolean

These parameters influence the annotation algorithm used by Prokka.
Use the default genome-length optimised mode (rather than the metagenome mode).
Type: boolean
Suppress the default clean-up of the gene annotations.
Type: boolean
Specify the kingdom that the input represents.
Type: string
Specify the translation table used to annotate the sequences.
Type: integer; default: 11
Minimum contig size required for annotation (bp).
Type: integer; default: 1
Minimum e-value cut-off.
Type: number; default: 0.000001
Set the assigned minimum coverage.
Type: integer; default: 80
Allow transfer RNA (tRNA) to overlap coding sequences (CDS).
Type: boolean
Use RNAmmer for rRNA prediction.
Type: boolean
Force contig name to GenBank/ENA/DDBJ naming rules.
Type: boolean
Add the gene features for each CDS hit.
Type: boolean
Retains contig names.
Type: boolean

These parameters influence the annotation algorithm used by Prodigal.
Specify whether to use Prodigal’s single-genome mode for long sequences.
Type: boolean
Does not allow partial genes on contig edges.
Type: boolean
Specifies the translation table used for gene annotation.
Type: integer; default: 11
Forces Prodigal to scan for motifs.
Type: boolean

These parameters influence the annotation algorithm used by Pyrodigal.
Specify whether to use Pyrodigal’s single-genome mode for long sequences.
Type: boolean
Does not allow partial genes on contig edges.
Type: boolean
Specifies the translation table used for gene annotation.
Type: integer; default: 11
Forces Pyrodigal to scan for motifs.
Type: boolean

Generic options for database downloading
Specify whether to save pipeline-downloaded databases in your results directory.
Type: boolean

Antimicrobial Peptide detection using a deep learning model.
Skip AMPlify during AMP-screening.
Type: boolean

Antimicrobial Peptide detection using machine learning
Skip AMPir during AMP-screening.
Type: boolean
Specify which machine learning classification model to use.
Type: string
Specify minimum protein length for prediction calculation.
Type: integer; default: 10

Antimicrobial Peptide detection based on predefined HMM models
Skip HMMsearch during AMP-screening.
Type: boolean
Specify path to the AMP HMM model file(s) to search against. Must be quoted if a wildcard is used.
Type: string
Saves a multiple alignment of all significant hits to a file.
Type: boolean
Save a simple tabular file summarising the per-target output.
Type: boolean
Save a simple tabular file summarising the per-domain output.
Type: boolean

Antimicrobial Peptide detection mining from metagenomes
Skip Macrel during AMP-screening.
Type: boolean

AntiMicrobial Peptides parsing and functional classification tool
Path to AMPcombi reference database directory (DRAMP).
Type: string
Specify probability cutoff to filter AMPs.
Type: number; default: 0.4

Antimicrobial resistance gene detection based on NCBI’s curated Reference Gene Database and curated collection of Hidden Markov Models
Skip AMRFinderPlus during the ARG-screening.
Type: boolean
Specify the path to a local version of the AMRFinderPlus database.
Type: string
Minimum percent identity to reference sequence.
Type: number; default: -1
Minimum coverage of the reference protein.
Type: number; default: 0.5
Specify which NCBI genetic code to use for translated BLAST.
Type: integer; default: 11
Add the plus genes to the report.
Type: boolean
Add identified column to AMRFinderPlus output.
Type: boolean

Antimicrobial resistance gene detection using a deep learning model
Skip DeepARG during the ARG-screening.
Type: boolean
Specify the path to the DeepARG database.
Type: string
Specify the numeric version number of a user-supplied DeepARG database.
Type: integer; default: 2
Specify which model to use (short or long sequences).
Type: string
Specify minimum probability cutoff under which hits are discarded.
Type: number; default: 0.8
Specify E-value cutoff above which hits are discarded.
Type: number; default: 1e-10
Specify percent identity cutoff for sequence alignment under which hits are discarded.
Type: integer; default: 50
Specify alignment read overlap.
Type: number; default: 0.8
Specify minimum number of alignments per entry for DIAMOND step of DeepARG.
Type: integer; default: 1000

Antimicrobial resistance gene detection using a deep learning model
Skip fARGene during the ARG-screening.
Type: boolean
Specify comma-separated list of which pre-defined HMM models to screen against.
Type: string; default: class_a,class_b_1_2,class_b_3,class_c,class_d_1,class_d_2,qnr,tet_efflux,tet_rpg,tet_enzyme
Specify to save intermediate temporary files to results directory.
Type: boolean
The threshold score for a sequence to be classified as a (almost) complete gene.
Type: number
The minimum length of a predicted ORF retrieved from annotating the nucleotide sequences.
Type: integer; default: 90
Defines which ORF finding algorithm to use.
Type: boolean
The translation table/format to use for sequence annotation.
Type: string; default: pearson

Antimicrobial resistance gene detection, based on alignment to the CARD database
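The fARGene model option described earlier takes a comma-separated string, so it can help to split and check such a value before passing it on. The following is a minimal sketch that treats the default value from the listing as the complete set of pre-defined model names (an assumption, since the listing only shows the default); `parse_model_list` is a hypothetical helper, not part of the pipeline or of fARGene.

```python
from typing import List

# Model names taken from the default value shown in the listing above.
# Assumption: these are all of the pre-defined models.
PREDEFINED_MODELS = (
    "class_a", "class_b_1_2", "class_b_3", "class_c", "class_d_1",
    "class_d_2", "qnr", "tet_efflux", "tet_rpg", "tet_enzyme",
)


def parse_model_list(value: str) -> List[str]:
    """Split a comma-separated model string and reject unknown names."""
    models = [name.strip() for name in value.split(",") if name.strip()]
    unknown = [name for name in models if name not in PREDEFINED_MODELS]
    if unknown:
        raise ValueError(f"Unknown model name(s): {', '.join(unknown)}")
    return models
```

For example, `parse_model_list("class_a, qnr")` returns the two names with whitespace stripped, while a misspelled name raises `ValueError`.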
Skip RGI during the ARG-screening.
Type: boolean
Save RGI output .json file.
Type: boolean
Specify to save intermediate temporary files to the results directory.
Type: boolean
Specify the alignment tool to be used.
Type: string
Include all of loose, strict and perfect hits (i.e. >=95% identity) found by RGI.
Type: boolean; default: true
Suppresses the default behaviour of RGI with --arg_rgi_includeloose.
Type: boolean; default: true
Include screening of low quality contigs for partial genes.
Type: boolean
Specify a more specific data-type of input (e.g. plasmid, chromosome).
Type: string

Antimicrobial resistance gene detection, based on alignment to NCBI, CARD, ARG-ANNOT, Resfinder, MEGARES, EcOH, PlasmidFinder, Ecoli_VF and VFDB.
Skip ABRicate during the ARG-screening.
Type: boolean
Specify which of the provided public databases to use by ABRicate.
Type: string
Minimum percent identity of alignment required for a hit to be considered.
Type: integer; default: 80
Minimum percent coverage of alignment required for a hit to be considered.
Type: integer; default: 80

Biosynthetic gene cluster detection
Skip antiSMASH during the BGC screening.
Type: boolean
Path to user-defined local antiSMASH database.
Type: string
Path to user-defined local antiSMASH directory. Only required when running with docker/singularity.
Type: string
Minimum longest-contig length a sample must have to be screened with antiSMASH.
Type: integer; default: 1000
Minimum length a contig must have to be screened with antiSMASH.
Type: integer; default: 1000
Turn on clusterblast comparison against database of antiSMASH-predicted clusters.
Type: boolean
Turn on clusterblast comparison against known gene clusters from the MIBiG database.
Type: boolean
Turn on clusterblast comparison against known subclusters responsible for synthesising precursors.
Type: boolean
Turn on ClusterCompare comparison against known gene clusters from the MIBiG database.
Type: boolean
Generate phylogenetic trees of secondary metabolite group orthologs.
Type: boolean
Defines which level of strictness to use for HMM-based cluster detection.
Type: string
Specify which taxonomic classification of input sequence to use.
Type: string

A deep learning genome-mining strategy for biosynthetic gene cluster prediction
Skip deepBGC during the BGC screening.
Type: boolean
Path to local deepBGC database folder.
Type: string
Average protein-wise DeepBGC score threshold for extracting BGC regions from Pfam sequences.
Type: number; default: 0.5
Run DeepBGC’s internal Prodigal step in single mode to restrict detecting genes to long contigs.
Type: boolean
Merge detected BGCs within given number of proteins.
Type: integer
Merge detected BGCs within given number of nucleotides.
Type: integer
Minimum BGC nucleotide length.
Type: integer; default: 1
Minimum number of proteins in a BGC.
Type: integer; default: 1
Minimum number of protein domains in a BGC.
Type: integer; default: 1
Minimum number of known biosynthetic (as defined by antiSMASH) protein domains in a BGC.
Type: integer
DeepBGC classification score threshold for assigning classes to BGCs.
Type: number; default: 0.5

Biosynthetic gene cluster detection
Skip GECCO during the BGC screening.
Type: boolean
Enable unknown region masking to prevent genes from stretching across unknown nucleotides.
Type: boolean
The minimum number of coding sequences a valid cluster must contain.
Type: integer; default: 3
The p-value cutoff for protein domains to be included.
Type: number; default: 1e-9
The probability threshold for cluster detection.
Type: number; default: 0.8
The minimum number of annotated genes that must separate a cluster from the edge.
Type: integer

Biosynthetic Gene Cluster detection based on predefined HMM models
Skip HMMsearch during BGC-screening.
Type: boolean
Specify path to the BGC HMM model file(s) to search against. Must be quoted if a wildcard is used.
Type: string
Saves a multiple alignment of all significant hits to a file.
Type: boolean
Save a simple tabular file summarising the per-target output.
Type: boolean
Save a simple tabular file summarising the per-domain output.
Type: boolean

Influences parameters required for the reporting workflow.
Specifies summary output format.
Type: string

Reference genome related files and options required for the workflow.
Name of iGenomes reference.
Type: string
Path to FASTA genome file.
Type: string; pattern: ^\S+\.fn?a(sta)?(\.gz)?$
Do not load the iGenomes reference config.
Type: boolean

Parameters used to describe centralised config profiles. These should not be edited.
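The FASTA path pattern listed for the reference genome options accepts several suffix spellings, which is easier to see by trying a few file names. This sketch assumes Python's `re` module matches the pattern the same way the pipeline's schema validator does; the helper name is hypothetical.

```python
import re

# Pattern copied verbatim from the FASTA genome file entry above.
FASTA_PATTERN = re.compile(r"^\S+\.fn?a(sta)?(\.gz)?$")


def describe_fasta_path(path: str) -> str:
    """Classify a path as 'invalid', 'plain' or 'gzipped' per the pattern."""
    match = FASTA_PATTERN.match(path)
    if match is None:
        return "invalid"
    # Group 2 captures the optional ".gz" suffix.
    return "gzipped" if match.group(2) else "plain"
```

Under this pattern `.fa`, `.fna` and `.fasta` are accepted, with or without `.gz`, whereas `.fas` and `.fastq` are not.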
Git commit id for Institutional configs.
Type: string; default: master
Base directory for Institutional configs.
Type: string; default: https://raw.githubusercontent.com/nf-core/configs/master
Institutional config name.
Type: string
Institutional config description.
Type: string
Institutional config contact information.
Type: string
Institutional config URL link.
Type: string

Set the top limit for requested resources for any single job.
Maximum number of CPUs that can be requested for any single job.
Type: integer; default: 16
Maximum amount of memory that can be requested for any single job.
Type: string; default: 128.GB; pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
Maximum amount of time that can be requested for any single job.
Type: string; default: 240.h; pattern: ^(\d+\.?\s*(s|m|h|d|day)\s*)+$

Less common options for the pipeline, typically set in a config file.
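The memory and time limits for the resource caps described earlier are strings such as 128.GB and 240.h rather than plain numbers. The sketch below checks a value against the listed patterns and converts a memory string to bytes. It assumes 1024-based units (1 KB = 1024 B), which is an assumption about how the workflow engine interprets them, and the helper names are hypothetical.

```python
import re

# Patterns copied verbatim from the resource limit entries above.
MEMORY_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")
TIME_PATTERN = re.compile(r"^(\d+\.?\s*(s|m|h|d|day)\s*)+$")

# Assumption: binary multiples, i.e. 1 KB = 1024 bytes.
UNIT_FACTORS = {None: 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def memory_to_bytes(value: str) -> int:
    """Convert a string such as '128.GB' to a number of bytes."""
    match = MEMORY_PATTERN.match(value)
    if match is None:
        raise ValueError(f"Not a valid memory string: {value!r}")
    # The leading numeric part, e.g. '128' or '1.5'.
    number = float(re.match(r"\d+(\.\d+)?", value).group(0))
    return int(number * UNIT_FACTORS[match.group(2)])


def is_valid_time(value: str) -> bool:
    """Return True if the string fits the listed time pattern."""
    return TIME_PATTERN.match(value) is not None
```

The time pattern allows several units to be chained, so a value like `1d 12h` passes as well as the default `240.h`.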
Display help text.
Type: boolean
Display version and exit.
Type: boolean
Method used to save pipeline results to output directory.
Type: string
Email address for completion summary, only when pipeline fails.
Type: string; pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
Send plain-text email instead of HTML.
Type: boolean
File size limit when attaching MultiQC reports to summary emails.
Type: string; default: 25.MB; pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
Do not use coloured log outputs.
Type: boolean
Incoming hook URL for messaging service.
Type: string
Custom config file to supply to MultiQC.
Type: string
Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file.
Type: string
Custom MultiQC yaml file containing HTML including a methods description.
Type: string
Whether to validate parameters against the schema at runtime.
Type: boolean; default: true
Show all params when using --help.
Type: boolean
Validation of parameters fails when an unrecognised parameter is found.
Type: boolean
Validation of parameters in lenient mode.
Type: boolean