🔍 Introduction
EggNOG-mapper is a tool for fast functional annotation of novel sequences. It uses precomputed Orthologous Groups (OGs) and phylogenies from the EggNOG database (http://eggnog5.embl.de) to transfer functional information from fine-grained orthologs only.
Common uses of eggNOG-mapper include the annotation of novel genomes, transcriptomes or even metagenomic gene catalogs.
The use of orthology predictions for functional annotation permits a higher precision than traditional homology searches (i.e. BLAST searches), as it avoids transferring annotations from close paralogs (duplicate genes with a higher chance of being involved in functional divergence).
Benchmarks comparing different eggNOG-mapper options against BLAST and InterProScan are available here (https://github.com/eggnogdb/emapper-benchmark/blob/master/benchmark_analysis.ipynb).
Because of this gain in precision over plain homology searches, eggNOG-mapper is a common choice for researchers annotating novel genomes, transcriptomes, or metagenomic datasets.
EggNOG-mapper has been wrapped into a Constellab Task, enabling easy integration into bioinformatics workflows with reproducibility and scalability.
🧰 Prerequisites
- Access to Constellab and a valid Digital Lab environment
- Installed bricks: gws_microbial_genomics version ≥ 0.1.1
- Input files: a FASTA file of protein or nucleotide sequences (CDS/genome/metagenome/proteins)
🧪 Use Case Steps
I. Functional Annotation with EggNOG
- Import your FASTA file into Constellab.
- Start by calling the eggNOG DB download task to build the eggNOG database, then link your FASTA file to the "eggNOG Mapper" task.
- Configure Parameters:
  - itype: Choose your sequence type (proteins, CDS, genome, or metagenome).
  - cpus: Number of CPU threads to allocate (default: 25).
- Run the Task: The task automatically:
  - Checks and downloads the required EggNOG databases.
  - Runs emapper.py using DIAMOND for alignment.
  - Extracts and cleans the final annotation file.
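To make the run step more concrete, here is a minimal sketch of how the task parameters could translate into an emapper.py command line. It is not the code of the Constellab task itself: the helper name build_emapper_command, the file names, and the database path are hypothetical, and the mapping of itype and cpus onto the emapper.py options --itype and --cpu is an assumption about how the wrapper works. The sketch only builds and prints the command; it does not execute it.

```python
import shlex

# Sequence types listed in the task parameters above.
VALID_ITYPES = {"proteins", "CDS", "genome", "metagenome"}


def build_emapper_command(fasta_path, output_prefix, data_dir,
                          itype="proteins", cpus=25):
    """Return the argument list for an emapper.py run using DIAMOND.

    Hypothetical helper: the real Constellab task may assemble its
    command differently.
    """
    if itype not in VALID_ITYPES:
        raise ValueError(f"unsupported itype: {itype!r}")
    return [
        "emapper.py",
        "-i", str(fasta_path),        # input FASTA file
        "--itype", itype,             # type of the input sequences
        "-m", "diamond",              # search mode: DIAMOND alignment
        "--cpu", str(cpus),           # number of CPU threads
        "--data_dir", str(data_dir),  # location of the eggNOG databases
        "-o", str(output_prefix),     # prefix for the output files
    ]


# Example with placeholder paths (assumptions, not values from the task).
cmd = build_emapper_command("proteins.faa", "sample1", "/data/eggnog")
print(shlex.join(cmd))
```

Returning a list of arguments rather than a single string keeps the command safe to pass to subprocess.run without shell quoting issues.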

Output
The task produces a clean, standardized annotation.tsv file containing:
- Ortholog assignments
- Functional categories (COG, GO, KEGG, etc.)
- Predicted functions and pathway mappings for each input sequence
This table can then be used for downstream tasks such as enrichment analysis or KEGG pathway visualization.
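As an illustration of such a downstream step, the sketch below counts how often each COG functional category appears in an annotation table, which is a typical starting point for enrichment analysis. It assumes a tab-separated file with a header row and a column named COG_category, as in recent eggNOG-mapper output; the cleaned annotation.tsv produced by the task may use different column names, so adjust the column argument accordingly. The function name count_cog_categories and the sample rows are made up for the example.

```python
import csv
import io
from collections import Counter


def count_cog_categories(handle, column="COG_category"):
    """Count single-letter COG categories across all annotated sequences.

    The column name is an assumption; pass another name if the cleaned
    table uses a different header.
    """
    # Raw emapper output contains '##' comment lines; skip them if present.
    lines = (line for line in handle if not line.startswith("##"))
    reader = csv.DictReader(lines, delimiter="\t")
    counts = Counter()
    for row in reader:
        value = (row.get(column) or "").strip()
        if value in ("", "-"):  # '-' marks a missing annotation
            continue
        # A value such as "KL" assigns the sequence to both K and L.
        counts.update(value)
    return counts


# Made-up sample rows standing in for annotation.tsv.
sample = io.StringIO(
    "query\tCOG_category\tDescription\n"
    "seq1\tK\tTranscription factor\n"
    "seq2\tKL\tHelicase\n"
    "seq3\t-\t-\n"
)
print(count_cog_categories(sample))
```

To run it on a real file, open annotation.tsv with open(path, newline="") and pass the file handle in place of the sample.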