Configuration files
HaploCoV uses several configuration files to set up your analyses. If any of these files is missing or incorrectly formatted, HaploCoV halts its execution and raises an error message. By default, all configuration files must be in the folder from which HaploCoV is executed (default: the HaploCoV folder). Please double-check that all configuration files are in place before running your analyses.
The main configuration files required by HaploCoV are:
1. globalAnnot.gz: a file with the functional annotation of the complete collection of SARS-CoV-2 genomic variants. This file can be generated with CorGAT (see Chiara et al. 2020). For your convenience, an up-to-date copy is included in the main HaploCoV repository. The file is updated on a bi-weekly basis (every Wednesday). The tools that use this file automatically download the most recent copy at every execution.
2. areaFile: this configuration file defines macro-geographic areas as described in Chiara et al. 2022. A copy of the file is included in the main repository.
3. linDefMut: this file contains the defining mutations for every Pango lineage. It is updated on a bi-weekly basis (every Wednesday). The tools in HaploCoV that use this file automatically download the most recent version.
4. scalingFactors.csv: this file contains the list of features used by HaploCoV to compute its "VOC-ness" score. Please refer to the HaploCoV paper for more details. As with globalAnnot.gz, all the tools in HaploCoV that use this file will try to download the most recent version for you.
5. parameters: this is a simple text file that provides the default parameters for the execution of the HaploCoV workflow. The file is used by HaploCoV.pl and is included in the main repository on GitHub. Since this file does not need to be updated or modified, HaploCoV.pl does not download a new copy of parameters at every execution.
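
The check recommended above can be scripted before launching an analysis. The following is a minimal sketch, not part of HaploCoV: the helper name missing_config_files is hypothetical, and the script only tests that the five files listed above exist in a given folder. It does not validate their format. Because some of these files are downloaded automatically by the tools that use them, a file reported as missing here may still be fetched at run time.

```python
from pathlib import Path

# File names taken from the list above. The helper below is a hypothetical
# convenience check and is not shipped with HaploCoV.
REQUIRED_FILES = [
    "globalAnnot.gz",
    "areaFile",
    "linDefMut",
    "scalingFactors.csv",
    "parameters",
]


def missing_config_files(folder="."):
    """Return the names of required configuration files not found in folder."""
    base = Path(folder)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    # Check the current working directory, i.e. the folder HaploCoV is run from.
    missing = missing_config_files(".")
    if missing:
        print("Missing configuration files: " + ", ".join(missing))
    else:
        print("All configuration files are in place.")
```

Run the script from the folder where HaploCoV will be executed. An empty result only means the files are present, not that their content is valid.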