gws_microbial_genomics

Contributor(s)
Publication date

Jan 26, 2026

Confidentiality
Public

Functional annotation through orthology assignment

🔍 Introduction


EggNOG-mapper is a tool for fast functional annotation of novel sequences. It uses precomputed Orthologous Groups (OGs) and phylogenies from the EggNOG database (http://eggnog5.embl.de) to transfer functional information from fine-grained orthologs only.


Common uses of eggNOG-mapper include the annotation of novel genomes, transcriptomes or even metagenomic gene catalogs.


The use of orthology predictions for functional annotation permits a higher precision than traditional homology searches (e.g. BLAST searches), as it avoids transferring annotations from close paralogs (duplicate genes that are more likely to have diverged in function).


Benchmarks comparing different eggNOG-mapper options against BLAST and InterProScan are available here (https://github.com/eggnogdb/emapper-benchmark/blob/master/benchmark_analysis.ipynb).


Because of this higher precision, eggNOG-mapper is often preferred over plain BLAST-based annotation by researchers working on novel genomes, transcriptomes, or metagenomic datasets.


EggNOG-mapper has been wrapped into a Constellab Task, enabling easy integration into bioinformatics workflows with reproducibility and scalability.


🧰 Prerequisites


  • Access to Constellab and a valid Digital Lab environment
  • Installed bricks: gws_microbial_genomics version ≥ 0.1.1
  • Input files: a FASTA file of protein or nucleotide sequences (proteins, CDS, genome, or metagenome)
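Since the input can hold either protein or nucleotide sequences, it can help to check a FASTA file before importing it. The following is a minimal sketch, not part of the brick: the function names and the 0.9 threshold are assumptions chosen for illustration, and the heuristic only separates nucleotide from protein alphabets (it cannot tell CDS, genome, and metagenome inputs apart).

```python
from io import StringIO

# Letters treated as nucleotide symbols by this heuristic (an assumption).
NUCLEOTIDE_CHARS = set("ACGTUN")


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def guess_sequence_kind(sequence, threshold=0.9):
    """Return 'nucleotide' or 'protein' based on letter composition.

    Hypothetical helper: a sequence is called nucleotide when at least
    `threshold` of its letters are in NUCLEOTIDE_CHARS.
    """
    residues = [c for c in sequence.upper() if c.isalpha()]
    if not residues:
        raise ValueError("empty sequence")
    fraction = sum(c in NUCLEOTIDE_CHARS for c in residues) / len(residues)
    return "nucleotide" if fraction >= threshold else "protein"


if __name__ == "__main__":
    example = StringIO(">gene1\nATGCGGTA\n>prot1\nMKVLAAGIW\n")
    for name, seq in read_fasta(example):
        print(name, guess_sequence_kind(seq))
```

A "protein" result suggests itype proteins; a "nucleotide" result still leaves the choice between CDS, genome, and metagenome to you.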

🧪 Use Case Steps


I. Functional Annotation with EggNOG


1. Import your FASTA file into Constellab.
2. Run the eggNOG DB download task to build the eggNOG database, then link your FASTA file to the "eggNOG Mapper" task.
3. Configure the parameters:
  • itype: your sequence type (proteins, CDS, genome, or metagenome).
  • cpus: number of CPU threads to allocate (default: 25).
4. Run the task. It automatically:
  • checks for and downloads the required eggNOG databases;
  • runs emapper.py using DIAMOND for alignment;
  • extracts and cleans the final annotation file.
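For readers who want to see roughly what the steps above correspond to on the command line, here is a hedged sketch that only builds the command lines, without running them. It is not the brick's actual implementation. The helper names, paths, and output prefix are hypothetical; the flags are standard eggNOG-mapper v2 options, and mapping the task's cpus parameter to the --cpu flag is an assumption.

```python
import shlex

# Sequence types listed for the task's itype parameter.
VALID_ITYPES = {"proteins", "CDS", "genome", "metagenome"}


def build_download_command(data_dir):
    """Command to fetch the eggNOG databases into data_dir (hypothetical helper)."""
    return ["download_eggnog_data.py", "-y", "--data_dir", data_dir]


def build_emapper_command(fasta_path, itype, cpus, data_dir, output_dir,
                          prefix="annotation"):
    """Command for a DIAMOND-based emapper.py run (hypothetical helper)."""
    if itype not in VALID_ITYPES:
        raise ValueError(f"unknown itype: {itype!r}")
    if cpus < 1:
        raise ValueError("cpus must be a positive integer")
    return [
        "emapper.py",
        "-i", fasta_path,
        "--itype", itype,
        "-m", "diamond",          # DIAMOND search mode
        "--cpu", str(cpus),
        "--data_dir", data_dir,
        "--output_dir", output_dir,
        "-o", prefix,             # prefix of the output files (assumed name)
    ]


if __name__ == "__main__":
    # Illustrative paths only.
    print(shlex.join(build_download_command("/data/eggnog")))
    print(shlex.join(build_emapper_command(
        "proteins.faa", "proteins", 25, "/data/eggnog", "results")))
```

Keeping each command as a list of arguments makes it straightforward to pass to subprocess.run without shell quoting issues.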


Output


The task produces a clean, standardized annotation.tsv file containing:


  • Ortholog assignments
  • Functional categories (COG, GO, KEGG, etc.)
  • Predicted functions and pathway mappings for each input sequence

This table can then be used for downstream tasks such as enrichment analysis or KEGG pathway visualization.
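As an illustration of such a downstream step, the sketch below tallies COG functional categories from the annotation table. It assumes the cleaned annotation.tsv is tab-separated with a header row and keeps eggNOG-mapper's COG_category column, where multi-letter values such as "KT" mean several categories and "-" means none. The exact layout of the task's output is not specified here, so the column name may need adjusting.

```python
import csv
from collections import Counter
from io import StringIO


def count_cog_categories(handle, column="COG_category"):
    """Count COG category letters in a tab-separated annotation table.

    Assumes a header row containing `column`; lines starting with '##'
    (eggNOG-mapper metadata comments) are skipped.
    """
    rows = (line for line in handle if not line.startswith("##"))
    reader = csv.DictReader(rows, delimiter="\t")
    counts = Counter()
    for row in reader:
        value = (row.get(column) or "").strip()
        if value in ("", "-"):
            continue  # no category assigned
        counts.update(value)  # one count per category letter
    return counts


if __name__ == "__main__":
    # Tiny made-up table, for illustration only.
    sample = StringIO(
        "query\tCOG_category\tDescription\n"
        "p1\tK\tregulator\n"
        "p2\tKT\tsensor\n"
        "p3\t-\t-\n"
    )
    for letter, n in sorted(count_cog_categories(sample).items()):
        print(letter, n)
```

For a real file, open it with open("annotation.tsv", newline="") and pass the handle to count_cog_categories.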

