For impatient people

The full workflow implemented by HaploCoV.pl consists of 7 separate tools. Detailed instructions for each of them are provided in the following sections; this section gives only a brief outline.

Importing GISAID data:

1. `perl addToTable.pl --metadata metadata.tsv --seq sequences.fasta --nproc 16 --outfile linearDataSorted.txt`

Importing Nextstrain data:

1. `perl NextStrainToHaploCoV.pl --metadata metadata.tsv --outfile linearDataSorted.txt`

Then compute allele frequencies (AF) from the data:

2. `perl computeAF.pl --file linearDataSorted.txt`

OR download a genomic variants file:

2. `wget https://raw.githubusercontent.com/matteo14c/HaploCoV/updates/area_list.txt`

Identify novel designations (3):

3. `perl augmentClusters.pl --outfile lvar.txt --metafile linearDataSorted.txt --posFile areas_list.txt`
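Step 3 reads two files produced by the earlier steps, so a failed import or download only shows up here. A minimal pre-flight check, as a sketch: `require_files` is a hypothetical helper (not part of HaploCoV) that stops at the first file that is missing or empty.

```shell
# Hypothetical helper, not part of HaploCoV: succeed only if every
# file given as an argument exists and is non-empty.
require_files() {
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            return 1
        fi
    done
    return 0
}

# Intended use (file names taken from the step 3 command above):
# require_files linearDataSorted.txt areas_list.txt &&
#     perl augmentClusters.pl --outfile lvar.txt \
#         --metafile linearDataSorted.txt --posFile areas_list.txt
```

Pass the check exactly the file names you give to `--metafile` and `--posFile`, so that a mismatch between a downloaded file name and the name used on the command line is caught before the tool starts.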

Compute features and VOC-ness scores (4 + 5):

4. `perl LinToFeats.pl --infile lvar.txt --outfile lvar_feats.tsv`
5. `perl report.pl --file lvar_feats.tsv --outfile lvar_prioritization.txt`

Assign the novel designations (6):

6. `perl assign.pl --dfile lvar.txt --metafile linearDataSorted.txt --out HaploCoVAssignedVariants.txt`

OR use the parallel version (faster):

6. `perl p_assign.pl --dfile lvar.txt --metafile linearDataSorted.txt --nproc 12 --out HaploCoVAssignedVariants.txt`
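The value 12 given to `--nproc` above is only an example. Assuming `--nproc` is the number of parallel processes to start, one way to match it to the machine is to query the number of online cores. In the sketch below, `detect_cores` and `NPROC` are illustrative names, not part of HaploCoV, and the final command is printed for inspection rather than executed.

```shell
# Hypothetical helper: report the number of online CPU cores,
# falling back to 1 when detection fails or returns a non-number.
detect_cores() {
    cores="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
    case "$cores" in
        ''|*[!0-9]*) cores=1 ;;
    esac
    [ "$cores" -ge 1 ] || cores=1
    echo "$cores"
}

NPROC="$(detect_cores)"

# Print the resulting command; remove "echo" to actually run it.
echo perl p_assign.pl --dfile lvar.txt --metafile linearDataSorted.txt \
    --nproc "$NPROC" --out HaploCoVAssignedVariants.txt
```

On a shared machine you may prefer a smaller value than the detected core count.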

Compute the prevalence (7):

7. `perl increase.pl --file HaploCoVAssignedVariants.txt`
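The steps above can be chained in a small wrapper script. The sketch below follows the GISAID route, computes allele frequencies locally and uses the parallel assignment step. It assumes that the HaploCoV scripts and the input files are in the current directory, and that `computeAF.pl` produces the file passed to `--posFile` in step 3. The `step` helper, the `haplocov_quickstart` function and the `DRY_RUN` and `NPROC` variables are illustrative names, not part of HaploCoV. By default the commands are only printed; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch only: chains the quick-start commands in order and stops
# at the first failing step (set -e).
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = run them
NPROC="${NPROC:-16}"      # value passed to --nproc

# Print a command, and execute it unless in dry-run mode.
step() {
    printf '+ %s\n' "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

haplocov_quickstart() {
    # 1. import GISAID data
    step perl addToTable.pl --metadata metadata.tsv --seq sequences.fasta \
        --nproc "$NPROC" --outfile linearDataSorted.txt
    # 2. compute allele frequencies
    step perl computeAF.pl --file linearDataSorted.txt
    # 3. identify novel designations
    step perl augmentClusters.pl --outfile lvar.txt \
        --metafile linearDataSorted.txt --posFile areas_list.txt
    # 4 + 5. compute features and prioritization report
    step perl LinToFeats.pl --infile lvar.txt --outfile lvar_feats.tsv
    step perl report.pl --file lvar_feats.tsv --outfile lvar_prioritization.txt
    # 6. assign the novel designations (parallel version)
    step perl p_assign.pl --dfile lvar.txt --metafile linearDataSorted.txt \
        --nproc "$NPROC" --out HaploCoVAssignedVariants.txt
    # 7. compute the prevalence
    step perl increase.pl --file HaploCoVAssignedVariants.txt
}

haplocov_quickstart
```

Running the script once in dry-run mode is a cheap way to check the file names before committing to the long-running import step.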