KEGG enrichment analysis


Purpose


This task takes a DEG table, validates/normalizes an NCBI Entrez GeneID column, then:


1. converts Entrez IDs to KEGG gene IDs (hsa:1234, dme:xxxx, …) using KEGG REST
2. builds the species KEGG universe (genes ↔ pathways)
3. computes KEGG pathway enrichment (hypergeometric) + BH/FDR correction
4. renders Pathview PNGs for enriched pathways, supporting multiple fold-change columns on the same PNG (Pathview multi-state → .multi.png).
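
The first two steps work from KEGG REST's tab-separated text responses. Here is a minimal sketch of how such responses might be parsed, assuming the `conv/<org>/ncbi-geneid` and `link/pathway/<org>` operations and their two-column layout. The function names and the sample strings are illustrative and are not taken from the task's actual code:

```python
from collections import defaultdict


def parse_conv(text):
    """Map Entrez GeneID -> KEGG gene ID from a conv/<org>/ncbi-geneid response."""
    mapping = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        source, target = line.split("\t")[:2]
        # Lines are assumed to look like "ncbi-geneid:1234\thsa:1234".
        entrez = source.split(":", 1)[1]
        mapping[entrez] = target
    return mapping


def parse_link(text):
    """Build pathway -> set of KEGG gene IDs from a link/pathway/<org> response."""
    pathways = defaultdict(set)
    for line in text.splitlines():
        if not line.strip():
            continue
        gene, pathway = line.split("\t")[:2]
        # Lines are assumed to look like "hsa:10\tpath:hsa00232"; drop "path:".
        pathways[pathway.replace("path:", "", 1)].add(gene)
    return dict(pathways)


# Hand-written sample responses for illustration (not fetched from KEGG).
conv_text = "ncbi-geneid:10\thsa:10\nncbi-geneid:100\thsa:100\n"
link_text = "hsa:10\tpath:hsa00232\nhsa:10\tpath:hsa00983\nhsa:100\tpath:hsa00230\n"

entrez_to_kegg = parse_conv(conv_text)
pathway_genes = parse_link(link_text)
# The universe is every gene that appears in at least one pathway.
universe = set().union(*pathway_genes.values())
```

Keeping the parsing separate from the HTTP call makes it possible to check the mapping logic offline on a saved response.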

Requirements


• The analysis column must contain NCBI Entrez GeneIDs:
  • digits only (1234, 318213, …)
  • no symbols, no Ensembl IDs, no “GeneID:…” prefixes
  • values like 1234.0 should be normalized to 1234 (the pipeline typically cleans this, but clean input is safer).
• If you start from other identifiers (Ensembl, symbols…), first run OmiX “Gene ID conversion” to obtain ENTREZGENE.
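
To pre-clean a column before running the task, a check along these lines may help. This is a sketch of the rules listed above, not the pipeline's own cleaning code, and `normalize_entrez` is a hypothetical helper name:

```python
import re


def normalize_entrez(value):
    """Return a digits-only Entrez GeneID string, or None if the value is not one."""
    if value is None:
        return None
    text = str(value).strip()
    # Spreadsheet exports often turn 1234 into "1234.0"; strip a zero-only decimal part.
    text = re.sub(r"\.0+$", "", text)
    # Anything that is not purely digits (symbols, Ensembl IDs, prefixes) is rejected.
    return text if re.fullmatch(r"\d+", text) else None


raw = ["1234", "1234.0", " 318213 ", "GeneID:1234", "ENSG00000141510", "TP53", "", None]
cleaned = [normalize_entrez(v) for v in raw]
valid = [v for v in cleaned if v is not None]
```

The length of `valid` is the kind of count that a threshold such as min_genes_required would be compared against.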

Input


A DEG table (CSV/TSV) containing:


• one Entrez GeneID column
• optionally, one or more fold-change columns

Parameters


• Organism (autocomplete): scientific name (e.g., Homo sapiens, Drosophila melanogaster, Saccharomyces cerevisiae). The task automatically maps it to the KEGG code (hsa, dme, sce, …).
• col_entrez: column name containing Entrez GeneIDs.
• foldchange_cols (optional): comma-separated list of fold-change columns to display in Pathview. Example: FoldChange,FoldChange 2
  • 1 column → *.pathview.png
  • ≥2 columns → *.pathview.multi.png
• min_genes_required (internal/script): minimum number of valid Entrez genes required to proceed.
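
As an illustration of how the foldchange_cols value relates to the output file name, here is a small sketch. Both helper names are hypothetical, and the behavior for an empty list is an assumption, since only the 1-column and ≥2-column cases are specified:

```python
def parse_foldchange_cols(value):
    """Split a comma-separated parameter into trimmed, non-empty column names."""
    if not value:
        return []
    return [name.strip() for name in value.split(",") if name.strip()]


def pathview_suffix(columns):
    """Pick the expected PNG suffix from the number of fold-change columns.

    Assumption: an empty list is treated like a single column here; the
    documentation only describes 1 column and >=2 columns.
    """
    return ".pathview.multi.png" if len(columns) >= 2 else ".pathview.png"


cols = parse_foldchange_cols("FoldChange, FoldChange 2")
expected_name = "hsa00480" + pathview_suffix(cols)
```

Column names are trimmed but otherwise kept verbatim, so a name containing a space such as "FoldChange 2" is preserved.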



Outputs


1. pathways (ResourceSet): Pathview PNG images. Typical names:
   • hsa00480.pathview.png (single FC)
   • hsa00480.pathview.multi.png (multiple FC)
2. kegg_enrichment (Table): enrichment results with at least: pathway, pathway_name, overlap, pathway_size, query_size, universe_size, pvalue, padj

How overlap / pvalue / padj are computed


• Universe: all KEGG genes for the species (from link/pathway/<species>).
• Query: your mapped genes intersecting the universe.
• For each pathway:
  • K = pathway gene set size in universe
  • n = query size
  • k = overlap = number of query genes in the pathway
• pvalue: hypergeometric right-tail (probability of ≥k overlaps by chance).
• padj: Benjamini–Hochberg (FDR) correction across all tested pathways.
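
The two statistics above can be sketched in plain Python. This is an illustration under stated assumptions rather than the task's actual implementation: `N` is introduced here as the universe size (the universe_size column), and both function names are hypothetical:

```python
from math import comb


def hypergeom_right_tail(k, K, n, N):
    """P(X >= k) when drawing n genes from a universe of N containing K pathway genes."""
    total = comb(N, n)
    upper = min(K, n)
    # Sum the probability mass for every overlap from k up to the largest possible one.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy numbers: universe of 10 genes, pathway of 4, query of 3, overlap of 2.
p = hypergeom_right_tail(k=2, K=4, n=3, N=10)
padj = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

Because padj depends on the number of tested pathways, the same pvalue can yield a different padj when the set of tested pathways changes.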

Why can a PNG be produced with no colored genes?


Even if a pathway is enriched, Pathview may show no coloring when:


• gene IDs don’t match what Pathview expects (wrong ID type / wrong formatting)
• fold-change values become NA / non-numeric after parsing
• the map contains mostly compound/KO nodes rather than gene nodes relevant to your IDs

The enrichment table remains valid; coloring depends on ID ↔ node compatibility in Pathview.
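
When a PNG comes out uncolored, a quick check of the input can narrow down which cause applies. The sketch below is not part of the task. It uses hypothetical helper names and toy data to count, for one pathway, how many query genes carry a fold-change value that parses as a number:

```python
import math


def to_float_or_nan(value):
    """Parse a fold-change cell; return NaN when it is empty or non-numeric."""
    try:
        return float(str(value).strip())
    except ValueError:
        return math.nan


def coloring_report(rows, pathway_gene_ids):
    """Count genes of one pathway with and without a usable fold-change value."""
    usable = 0
    missing_value = 0
    for gene_id, fold_change in rows:
        if gene_id not in pathway_gene_ids:
            continue  # gene is not on this map, so it cannot be colored
        if math.isnan(to_float_or_nan(fold_change)):
            missing_value += 1
        else:
            usable += 1
    return {"usable": usable, "missing_value": missing_value}


# Toy table of (Entrez ID, fold-change cell) pairs and a toy pathway gene set.
rows = [("10", "1.5"), ("100", "NA"), ("9", "-0.7"), ("7", "2,1")]
report = coloring_report(rows, pathway_gene_ids={"10", "100", "7"})
```

A low `usable` count points to ID or parsing problems in the input; a normal count with no coloring suggests the map itself has few gene nodes.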

