Research Interests
Development of methods for bioinformatics, currently with a focus on microbial communities, environmental data and metagenomics. I am also interested in many related fields: binning, profiling and assembly of community samples, amplicon and marker gene analysis, phylogeny, taxonomic classification, species delineation, contamination, high-performance algorithms, sequence mapping/alignment, antimicrobial resistance, genome assembly and finishing methods.
Software
§ ganon2 is a k-mer based classification tool which uses Hierarchical Interleaved Bloom Filters to classify genomic sequences against large sets of references efficiently, with integrated download and update of databases (refseq/genbank), taxonomic profiling (ncbi/gtdb), binning and hierarchical classification, customized reporting and more.
§ genome_updater is a portable bash script to download and update files from NCBI genomes, keeping log and version for each update, with file check and parallel download support.
§ MultiTax is a Python package that provides a common and generalized set of functions to download, parse, filter and explore multiple biological taxonomies (GTDB, NCBI, Silva, Greengenes, Open Tree taxonomy) and custom formatted taxonomies.
§ GRIMER performs analysis of microbiome data and generates a portable and interactive dashboard integrating annotation, taxonomy and metadata, with a focus on contamination detection.
§ TaxSBP is an implementation of the approximation algorithm for the hierarchically structured bin packing problem based on the NCBI Taxonomy database.
§ MetaMeta is a pipeline to execute and integrate results from metagenome analysis tools. It provides an easy workflow to run multiple tools with multiple samples, producing a single enhanced output profile for each sample.
§ DUDes is a reference-based taxonomic profiler with a top-down approach to analyze metagenomic NGS samples. Instead of using the lowest common ancestor, we developed the deepest uncommon descendent.
§ FGAP is an automated gap closing tool. It uses BLAST to align multiple contigs against a draft genome assembly aiming to find sequences that overlap gaps. The algorithm selects the best sequence to fill and eliminate the gap.
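As a rough illustration of the k-mer idea behind Bloom-filter-based classifiers such as ganon2, the sketch below indexes the k-mers of a reference in a plain Bloom filter and scores a read by the fraction of its k-mers found there. It is a simplified toy and not ganon2's implementation: it uses a single flat filter rather than a Hierarchical Interleaved Bloom Filter, and the class name, function names, filter size and hash scheme are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter: a bit array probed by several salted hashes."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            salt = seed.to_bytes(8, "little")
            digest = hashlib.blake2b(item.encode(), salt=salt).digest()
            yield int.from_bytes(digest[:8], "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # May return a false positive, never a false negative.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def kmers(seq, k):
    """Yield all overlapping substrings of length k."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def build_filter(reference, k):
    """Index every k-mer of a reference sequence (hypothetical helper)."""
    bf = BloomFilter()
    for kmer in kmers(reference, k):
        bf.add(kmer)
    return bf


def shared_kmer_fraction(read, bf, k):
    """Fraction of the read's k-mers reported as present in the filter."""
    total = 0
    hits = 0
    for kmer in kmers(read, k):
        total += 1
        hits += kmer in bf
    return hits / total if total else 0.0


if __name__ == "__main__":
    reference = "ACGTACGTTAGCCGATAGGCTTAACG"
    read = reference[5:20]  # a read drawn from the reference
    bf = build_filter(reference, k=5)
    print(shared_kmer_fraction(read, bf, k=5))
```

Because a Bloom filter never produces false negatives, a read taken directly from the indexed reference always scores 1.0; unrelated reads usually score near zero, with occasional false-positive hits depending on filter size and the number of hashes.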