Importing files
GRIMER is independent of any quantification method and requires a contingency table with raw counts of observations/components for each sample/composition in the study. Observations are usually, but not limited to, taxonomic entries (e.g. genus, species, strains), operational taxonomic units (OTUs), amplicon sequence variants (ASVs), metagenome-assembled genomes (MAGs) or sequence features.
grimer --input-file accepts a file with tab-separated values (.tsv) containing a table of counts (observation table, count table, contingency table, ...) or a .biom file.
The Biological Observation Matrix file (.biom)
GRIMER parses BIOM files and their associated metadata, if available. Alternatively, an external metadata file can be provided with -m/--metadata-file and will take precedence over the .biom metadata.
Example UgandaMaternalV3V4.16s_DADA2.taxon_abundance.biom file from microbiomedb.org
- Default report (no taxonomy)
grimer --input-file UgandaMaternalV3V4.16s_DADA2.taxon_abundance.biom
- Integrated NCBI taxonomy (will translate names to taxonomy ids)
grimer --input-file UgandaMaternalV3V4.16s_DADA2.taxon_abundance.biom \
--taxonomy ncbi \
--ranks superkingdom phylum class order family genus species
- Using an external metadata file (UgandaMaternalV3V4.16s_DADA2.sample_details.tsv)
grimer --input-file UgandaMaternalV3V4.16s_DADA2.taxon_abundance.biom \
--metadata-file UgandaMaternalV3V4.16s_DADA2.sample_details.tsv \
--taxonomy ncbi \
--ranks superkingdom phylum class order family genus species
Tab-separated file (.tsv)
GRIMER parses .tsv files whose observations are annotated either with a single taxonomic identifier or name, or with multi-level (e.g. lineage) taxonomic annotations.
- Rows contain observations and columns contain samples (use --transpose if your file is reversed)
- First column and first row are used as headers
- Taxonomy integration: files can have either taxonomic identifiers (NCBI, e.g.: 562) or taxonomic names (NCBI, e.g.: Escherichia coli or GTDB, e.g.: s__Escherichia coli)
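To make the layout described above concrete, here is a minimal sketch that writes a tiny count table (observations in rows, samples in columns, first row and first column as headers) and checks its shape. The file name toy_counts.tsv, the sample names and the counts are invented for illustration and do not come from any real study.

```shell
# Hypothetical toy table: file name, samples and counts are made up.
workdir=$(mktemp -d)
table="$workdir/toy_counts.tsv"

# First row: header with sample names. First column: observation names.
printf 'observation\tsampleA\tsampleB\n' >  "$table"
printf 'Escherichia coli\t10\t0\n'       >> "$table"
printf 'Neisseria animalis\t3\t7\n'      >> "$table"

# Quick sanity check: number of tab-separated columns and number of lines.
n_cols=$(head -n 1 "$table" | awk -F '\t' '{print NF}')
n_rows=$(wc -l < "$table" | tr -d ' ')
echo "columns: $n_cols, lines: $n_rows"
```

A file shaped like this could then be passed to grimer --input-file. If samples were in rows instead, --transpose would be needed.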
Multi-level annotations (e.g. Bacteria;Proteobacteria;Gammaproteobacteria...)
- Example UgandaMaternalV3V4.16s_DADA2.taxon_abundance.tsv file from microbiomedb.org
grimer --input-file UgandaMaternalV3V4.16s_DADA2.taxon_abundance.tsv \
--level-separator ";"
- With metadata (UgandaMaternalV3V4.16s_DADA2.sample_details.tsv)
grimer --input-file UgandaMaternalV3V4.16s_DADA2.taxon_abundance.tsv \
--level-separator ";" \
--metadata-file UgandaMaternalV3V4.16s_DADA2.sample_details.tsv
- With integrated NCBI taxonomy (will translate names to taxids)
grimer --input-file UgandaMaternalV3V4.16s_DADA2.taxon_abundance.tsv \
--level-separator ";" \
--metadata-file UgandaMaternalV3V4.16s_DADA2.sample_details.tsv \
--taxonomy ncbi \
--ranks superkingdom phylum class order family genus species
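As a rough illustration of what a level separator means for a multi-level annotation, the sketch below splits a shortened toy lineage on ";" with standard shell tools. It only shows the idea of one annotation string carrying several levels. It is not GRIMER's internal parsing code.

```shell
# Toy lineage, shortened from the example format above (illustrative only).
lineage='Bacteria;Proteobacteria;Gammaproteobacteria'

# Number of levels and the deepest (last) level, using ";" as separator.
n_levels=$(printf '%s\n' "$lineage" | awk -F ';' '{print NF}')
deepest=$(printf '%s\n' "$lineage" | awk -F ';' '{print $NF}')

# Print one level per line, then a short summary.
printf '%s\n' "$lineage" | tr ';' '\n'
echo "levels: $n_levels, deepest: $deepest"
```

If a table uses a different delimiter between levels, the same character would be the one to pass to --level-separator.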
Single level annotations (e.g. Neisseria animalis)
- Example ERP108433_phylum_taxonomy_abundances_SSU_v4.1.tsv from MGnify, phylum level only
# Removing first column with kingdom
cut -f 2- ERP108433_phylum_taxonomy_abundances_SSU_v4.1.tsv > ERP108433_phylum_taxonomy_abundances_SSU_v4.1_parsed.tsv
# Set identifier for unassigned observations as "Unassigned" (many occurrences, will be summed)
grimer --input-file ERP108433_phylum_taxonomy_abundances_SSU_v4.1_parsed.tsv \
--unassigned-header "Unassigned"
- Re-generating taxonomic lineage from single annotations (in this case only superkingdom)
grimer --input-file ERP108433_phylum_taxonomy_abundances_SSU_v4.1_parsed.tsv \
--unassigned-header "Unassigned" \
--taxonomy ncbi \
--ranks superkingdom phylum
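The pre-processing step above can be tried on a small stand-in file before running it on real data. The sketch below builds a toy table with a leading kingdom column, drops that column with cut -f 2-, and counts how many rows carry the "Unassigned" label. All file names, column names and counts are invented for this example and only assume a layout similar to the one described above.

```shell
# Hypothetical toy input: a kingdom column followed by phylum and two samples.
workdir=$(mktemp -d)
raw="$workdir/toy_phylum.tsv"
parsed="$workdir/toy_phylum_parsed.tsv"

printf 'kingdom\tphylum\tS1\tS2\n'       >  "$raw"
printf 'Bacteria\tFirmicutes\t5\t2\n'   >> "$raw"
printf 'Bacteria\tUnassigned\t1\t0\n'   >> "$raw"
printf 'Archaea\tUnassigned\t0\t4\n'    >> "$raw"

# Keep every column from the second one onwards (drops the kingdom column).
cut -f 2- "$raw" > "$parsed"

# After the cut, the first column is the phylum and "Unassigned" repeats.
first_header=$(head -n 1 "$parsed" | cut -f 1)
n_unassigned=$(grep -c '^Unassigned' "$parsed")
echo "first column: $first_header, Unassigned rows: $n_unassigned"
```

Repeated "Unassigned" rows like these are the ones that --unassigned-header "Unassigned" is meant to identify.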
From commonly used tools/sources
ganon
ganon table --input *.tre \
--output-file ganon_table.tsv \
--header taxid \
--rank species
grimer --input-file ganon_table.tsv \
--taxonomy ncbi \
--ranks superkingdom phylum class order family genus species
MetaPhlAn
# merge_metaphlan_tables.py is available with the metaphlan package
# tail -n +2 skips the first line of the merged output
merge_metaphlan_tables.py *.tsv | tail -n +2 > metaphlan_table.tsv
grimer --input-file metaphlan_table.tsv \
--level-separator "|" \
--obs-replace '^.+__' '' '_' ' ' \
--taxonomy ncbi \
--ranks superkingdom phylum class order family genus species
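To see what the two --obs-replace pairs above are meant to achieve, the following sketch approximates them with tr and sed on a toy MetaPhlAn-style clade string. It assumes the replacements are applied to each level after splitting on the "|" separator, which is an assumption about GRIMER's behaviour rather than something stated here. The clade string itself is invented.

```shell
# Toy clade string in MetaPhlAn style (invented for illustration).
clade='k__Bacteria|p__Proteobacteria|s__Escherichia_coli'

# Split on the level separator, then mimic the two replacement pairs:
#   '^.+__' -> ''   (drop the rank prefix such as "s__")
#   '_'     -> ' '  (turn underscores into spaces)
cleaned=$(printf '%s\n' "$clade" | tr '|' '\n' | sed -E 's/^.+__//; s/_/ /g')

# The last level is the most specific name after cleaning.
last_level=$(printf '%s\n' "$cleaned" | tail -n 1)
echo "$last_level"
```

Cleaned names of this form are what the name-to-taxid translation of --taxonomy ncbi would then need to match.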
QIIME2 feature table (.qza)
- Example feature-table.qza from QIIME2 docs
qiime tools export --input-path feature-table.qza --output-path exported-feature-table
grimer --input-file exported-feature-table/feature-table.biom
phyloseq
#source("http://bioconductor.org/biocLite.R")
#biocLite("biomformat")
#biocLite('phyloseq')
library("biomformat")
library('phyloseq')
data(soilrep)
b <- make_biom(data = otu_table(soilrep))
write_biom(b, 'out.biom')
grimer --input-file out.biom
MGnify
grimer-mgnify.py will download the data and generate a GRIMER report for any MGnify study accession (e.g. MGYS00006024)
# Install API dependency
conda install "jsonapi-client>=0.9.7"
./grimer-mgnify.py -i MGYS00006024 -o out_folder_mgnify/