Output files

ganon build/build-custom/update

Every run of ganon build, ganon build-custom or ganon update generates the following database files:

  • {prefix}.ibf/.hibf: main bloom filter index file, extension based on the --filter-type option.
  • {prefix}.tax: taxonomy tree, only generated if --taxonomy is used (fields: target/node, parent, rank, name, genome size).
  • {prefix}_files/: (ganon build only) folder containing downloaded reference sequences and auxiliary files. Not necessary for classification. Keep this folder if the database will be updated later; otherwise it can be deleted.

Warning

Database files generated with version 1.2.0 or higher are not compatible with older versions.

ganon classify

  • {prefix}.tre: full report file (see below)
  • {prefix}.rep: plain report of the run with only targets that received a match. Can be used to re-generate full reports (.tre) with ganon report. The file ends with 2 extra lines containing #total_classified and #total_unclassified. Fields:
    • 1: hierarchy label
    • 2: target
    • 3: # total matches
    • 4: # unique reads
    • 5: # lca reads
    • 6: rank
    • 7: name
  • {prefix}.one: output with one match for each classified read after EM or LCA algorithm. Only generated with --output-one active. If multiple hierarchy levels are set, one file for each level will be created: {prefix}.{hierarchy}.one (fields: read identifier, target, (max) k-mer/minimizer count)
  • {prefix}.all: output with all matches for each read. Only generated with --output-all active. Warning: the file can be very large. If multiple hierarchy levels are set, one file for each level will be created: {prefix}.{hierarchy}.all (fields: read identifier, target, k-mer/minimizer count)
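
The .rep layout described above can be read with a few lines of Python. This is a minimal sketch, not part of ganon: it assumes the file is tab-separated and that the two trailing lines have the form `#total_classified<TAB>count`. The names `RepRecord` and `parse_rep` are hypothetical.

```python
from typing import NamedTuple


class RepRecord(NamedTuple):
    """One line of a .rep file, following the 7 fields listed above."""
    hierarchy: str
    target: str
    total_matches: int
    unique_reads: int
    lca_reads: int
    rank: str
    name: str


def parse_rep(lines):
    """Parse .rep lines (any iterable of strings, e.g. an open file).

    Returns (records, totals). Assumption: fields are tab-separated and the
    summary lines look like "#total_classified<TAB>count".
    """
    records, totals = [], {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if fields[0].startswith("#"):
            # Trailing summary lines: #total_classified / #total_unclassified
            totals[fields[0].lstrip("#")] = int(fields[1])
            continue
        hierarchy, target, matches, unique, lca, rank, name = fields[:7]
        records.append(
            RepRecord(hierarchy, target, int(matches), int(unique), int(lca), rank, name)
        )
    return records, totals
```

With an open file handle, `records, totals = parse_rep(handle)` would give the per-target records plus a dictionary with the `total_classified` and `total_unclassified` counts.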

ganon report

  • {prefix}.tre: tab-separated tree-like report with cumulative counts and taxonomic lineage. There are several possible values for --report-type. More information on the different types of reports can be found here:
    • abundance: will attempt to estimate taxonomic abundances by re-distributing read counts from LCA matches and correcting sequence abundance by approximate genome sizes.
    • reads: sequence abundances, reports the proportion of sequences assigned to a taxon; each classified read is counted once.
    • dist: like reads with read count re-distribution.
    • corr: like reads with correction by genome size.
    • matches: every match is reported to its original target, including multiple and shared matches.

Each line in this report is a taxonomic entry (including the root node), with the following fields:

col  field         obs                                                                           example
1    rank                                                                                        phylum
2    target        taxonomic id. or specialization (assembly id.)                                562
3    lineage                                                                                     1|131567|2|1224|28211|766|942|768|769
4    name                                                                                        Chromobacterium rhizoryzae
5    # unique      number of reads that matched exclusively to this target                       5
6    # shared      number of reads with non-unique matches directly assigned to this target.     10
                   Represents the LCA matches (--report-type reads), re-assigned matches
                   (--report-type abundance/dist) or shared matches (--report-type matches)
7    # children    number of unique and shared assignments to all children nodes of this target  20
8    # cumulative  the sum of the unique, shared and children assignments up-to this target      35
9    % cumulative  percentage of assignments or estimated relative abundance                     43.24
                   for --report-type abundance

  • The first line of the report file will show the number of unclassified reads (not for --report-type matches)

  • The CAMI challenge bioboxes profiling format is supported using --output-format bioboxes. In this format, only values for the percentage/abundance (col. 9) are reported. The root node and unclassified entries are omitted.

  • The sum of cumulative assignments for the unclassified and root lines is 100%. The final cumulative sum of reads/matches may be under 100% if any filter is successfully applied and/or hierarchical selection is used (keep/skip/split).

  • For all report types except matches, only taxa that received direct read matches, either unique or by LCA assignment, are considered. Some reads may have only shared matches and will not be reported directly but will be accounted for at some parent level. To visualize those matches, create a report with --report-type matches or use the file {prefix}.rep directly.
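
The nine-column layout described above can be loaded into typed records as follows. This is a minimal sketch under stated assumptions: the file is tab-separated, the unclassified line carries empty target and lineage fields, and the count columns are integers. The names `TreEntry`, `parse_tre` and `counts_add_up` are hypothetical and not part of ganon.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TreEntry:
    rank: str
    target: str
    lineage: List[str]   # lineage split on "|", root first
    name: str
    unique: int
    shared: int
    children: int
    cumulative: int
    cumulative_pct: float


def parse_tre(lines):
    """Parse .tre lines (any iterable of strings, e.g. an open file)."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        f = line.split("\t")
        entries.append(
            TreEntry(
                rank=f[0],
                target=f[1],
                lineage=f[2].split("|") if f[2] else [],
                name=f[3],
                unique=int(f[4]),
                shared=int(f[5]),
                children=int(f[6]),
                cumulative=int(f[7]),
                cumulative_pct=float(f[8]),
            )
        )
    return entries


def counts_add_up(entry):
    """Column 8 should equal the sum of columns 5, 6 and 7 for taxonomic entries."""
    return entry.cumulative == entry.unique + entry.shared + entry.children
```

The check in `counts_add_up` applies to taxonomic entries only. In the example report shown in this document, the unclassified line has its count in the cumulative column alone, so it should be skipped.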

ganon table

  • {output_file}: a tab-separated file with counts/percentages of taxa for multiple samples

Examples of output files

The main output file is `{prefix}.tre`, which summarizes the results:
unclassified                                                 unclassified             0   0  0   2   2.02020
root          1       1                                      root                     0   0  97  97  97.97980
superkingdom  2       1|2                                    Bacteria                 0   0  97  97  97.97980
phylum        1239    1|2|1239                               Firmicutes               0   0  57  57  57.57576
phylum        1224    1|2|1224                               Proteobacteria           0   0  40  40  40.40404
class         91061   1|2|1239|91061                         Bacilli                  0   0  57  57  57.57576
class         28211   1|2|1224|28211                         Alphaproteobacteria      0   0  28  28  28.28283
class         1236    1|2|1224|1236                          Gammaproteobacteria      0   0  12  12  12.12121
order         1385    1|2|1239|91061|1385                    Bacillales               0   0  57  57  57.57576
order         204458  1|2|1224|28211|204458                  Caulobacterales          0   0  28  28  28.28283
order         72274   1|2|1224|1236|72274                    Pseudomonadales          0   0  12  12  12.12121
family        186822  1|2|1239|91061|1385|186822             Paenibacillaceae         0   0  57  57  57.57576
family        76892   1|2|1224|28211|204458|76892            Caulobacteraceae         0   0  28  28  28.28283
family        468     1|2|1224|1236|72274|468                Moraxellaceae            0   0  12  12  12.12121
genus         44249   1|2|1239|91061|1385|186822|44249       Paenibacillus            0   0  57  57  57.57576
genus         75      1|2|1224|28211|204458|76892|75         Caulobacter              0   0  28  28  28.28283
genus         469     1|2|1224|1236|72274|468|469            Acinetobacter            0   0  12  12  12.12121
species       1406    1|2|1239|91061|1385|186822|44249|1406  Paenibacillus polymyxa   57  0  0   57  57.57576
species       366602  1|2|1224|28211|204458|76892|75|366602  Caulobacter sp. K31      28  0  0   28  28.28283
species       470     1|2|1224|1236|72274|468|469|470        Acinetobacter baumannii  12  0  0   12  12.12121
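
As a worked check of the last column: in this example, each percentage appears to be the cumulative count divided by the total number of reads, taken here as the unclassified plus root counts, which together make up 100%. The short sketch below recomputes a few values copied from the report. The helper `percentage` is hypothetical, and the 5-decimal formatting is inferred from the example rather than from ganon itself.

```python
# (cumulative count, reported percentage) pairs copied from the example report
rows = {
    "unclassified": (2, "2.02020"),
    "root": (97, "97.97980"),
    "Firmicutes": (57, "57.57576"),
    "Proteobacteria": (40, "40.40404"),
    "Caulobacter sp. K31": (28, "28.28283"),
    "Acinetobacter baumannii": (12, "12.12121"),
}

# Unclassified and root together account for all reads in the example
total = rows["unclassified"][0] + rows["root"][0]


def percentage(count, total):
    """Share of the total, formatted with 5 decimals as in the report."""
    return f"{100 * count / total:.5f}"


for name, (count, reported) in rows.items():
    print(f"{name}: computed {percentage(count, total)}, reported {reported}")
```
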
When running `ganon classify` or `ganon report` with `--ranks all`, the output shows all ranks used for classification, sorted by lineage (this sorting is also available with `ganon report --sort lineage`):
unclassified                                                                  unclassified                                   0   0  0   2   2.02020
root           1        1                                                     root                                           0   0  97  97  97.97980
no rank        131567   1|131567                                              cellular organisms                             0   0  97  97  97.97980
superkingdom   2        1|131567|2                                            Bacteria                                       0   0  97  97  97.97980
phylum         1224     1|131567|2|1224                                       Proteobacteria                                 0   0  40  40  40.40404
class          1236     1|131567|2|1224|1236                                  Gammaproteobacteria                            0   0  12  12  12.12121
order          72274    1|131567|2|1224|1236|72274                            Pseudomonadales                                0   0  12  12  12.12121
family         468      1|131567|2|1224|1236|72274|468                        Moraxellaceae                                  0   0  12  12  12.12121
genus          469      1|131567|2|1224|1236|72274|468|469                    Acinetobacter                                  0   0  12  12  12.12121
species group  909768   1|131567|2|1224|1236|72274|468|469|909768             Acinetobacter calcoaceticus/baumannii complex  0   0  12  12  12.12121
species        470      1|131567|2|1224|1236|72274|468|469|909768|470         Acinetobacter baumannii                        12  0  0   12  12.12121
class          28211    1|131567|2|1224|28211                                 Alphaproteobacteria                            0   0  28  28  28.28283
order          204458   1|131567|2|1224|28211|204458                          Caulobacterales                                0   0  28  28  28.28283
family         76892    1|131567|2|1224|28211|204458|76892                    Caulobacteraceae                               0   0  28  28  28.28283
genus          75       1|131567|2|1224|28211|204458|76892|75                 Caulobacter                                    0   0  28  28  28.28283
species        366602   1|131567|2|1224|28211|204458|76892|75|366602          Caulobacter sp. K31                            28  0  0   28  28.28283
no rank        1783272  1|131567|2|1783272                                    Terrabacteria group                            0   0  57  57  57.57576
phylum         1239     1|131567|2|1783272|1239                               Firmicutes                                     0   0  57  57  57.57576
class          91061    1|131567|2|1783272|1239|91061                         Bacilli                                        0   0  57  57  57.57576
order          1385     1|131567|2|1783272|1239|91061|1385                    Bacillales                                     0   0  57  57  57.57576
family         186822   1|131567|2|1783272|1239|91061|1385|186822             Paenibacillaceae                               0   0  57  57  57.57576
genus          44249    1|131567|2|1783272|1239|91061|1385|186822|44249       Paenibacillus                                  0   0  57  57  57.57576
species        1406     1|131567|2|1783272|1239|91061|1385|186822|44249|1406  Paenibacillus polymyxa                         57  0  0   57  57.57576
With `--output-format bioboxes`:
@Version:0.10.0
@SampleID:example.rep H1
@Ranks:superkingdom|phylum|class|order|family|genus|species|assembly
@Taxonomy:db.tax
@@TAXID  RANK          TAXPATH   TAXPATHSN                 PERCENTAGE
2        superkingdom  2         Bacteria                  100.00000
1224     phylum        2|1224    Bacteria|Proteobacteria   56.89782
201174   phylum        2|201174  Bacteria|Actinobacteria   21.84869
1239     phylum        2|1239    Bacteria|Firmicutes       9.75197
976      phylum        2|976     Bacteria|Bacteroidota     6.15297
1117     phylum        2|1117    Bacteria|Cyanobacteria    2.23146
203682   phylum        2|203682  Bacteria|Planctomycetota  1.23353
57723    phylum        2|57723   Bacteria|Acidobacteria    0.52549
200795   phylum        2|200795  Bacteria|Chloroflexi      0.31118
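
A profile in this format can be read back with a small parser. This is a minimal sketch that assumes tab-separated data lines, `@Key:value` header lines and a single `@@` line naming the columns, as in the example above. The function name `parse_bioboxes` is hypothetical.

```python
def parse_bioboxes(lines):
    """Parse a bioboxes profile (any iterable of strings, e.g. an open file).

    Returns (header, rows): header maps keys such as "Version" to their
    values, rows is a list of dicts keyed by the column names on the @@ line.
    Assumption: columns are tab-separated.
    """
    header, columns, rows = {}, [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        if line.startswith("@@"):
            # Column header line, e.g. @@TAXID RANK TAXPATH TAXPATHSN PERCENTAGE
            columns = [c.strip() for c in line[2:].split("\t")]
        elif line.startswith("@"):
            # Metadata line, e.g. @Version:0.10.0
            key, _, value = line[1:].partition(":")
            header[key.strip()] = value.strip()
        else:
            values = [v.strip() for v in line.split("\t")]
            rows.append(dict(zip(columns, values)))
    return header, rows
```

For instance, `[r for r in rows if r["RANK"] == "phylum"]` would select the phylum-level entries, and `float(r["PERCENTAGE"])` converts the abundance column to a number.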