Kegg enrichment analysis - gws_gena - Constellab Community

Menu

Getting Started

Use Cases

Cell metabolic network reconstruction

Contextualise a metabolic model

Contextualise with genomic data

Cell growth prediction

Gene Knockout simulations

Kegg enrichment analysis

Technical documentations

Table of contents

Purpose Requirements Input Parameters Outputs How overlap / pvalue / padj are computed Why can a PNG be produced with no colored genes?

Contributor(s)

Publication date

Mar 9, 2022

Confidentiality

Public

Reactions

1

Share

Kegg enrichment analysis

Purpose

This task takes a DEG table, validates/normalizes an NCBI Entrez GeneID column, then:

converts Entrez IDs to KEGG gene IDs (hsa:1234, dme:xxxx, …) using KEGG REST

builds the species KEGG universe (genes ↔ pathways)

computes KEGG pathway enrichment (hypergeometric) + BH/FDR correction

renders Pathview PNGs for enriched pathways, supporting multiple fold-change columns on the same PNG (Pathview multi-state → .multi.png).

Requirements

The analysis column must contain NCBI Entrez GeneIDs: digits only (1234, 318213, …) no symbols, no Ensembl IDs, no “GeneID:…” values like 1234.0 should be normalized to 1234 (the pipeline typically cleans this, but clean input is safer).

digits only (1234, 318213, …)

no symbols, no Ensembl IDs, no “GeneID:…”

values like 1234.0 should be normalized to 1234 (the pipeline typically cleans this, but clean input is safer).

If you start from other identifiers (Ensembl, symbols…), first run OmiX “Gene ID conversion” to obtain ENTREZGENE.

Input

A DEG table (CSV/TSV) containing:

one Entrez GeneID column

optional one or more fold-change columns

Parameters

Organism (autocomplete): scientific name (e.g., Homo sapiens, Drosophila melanogaster, Saccharomyces cerevisiae). The task automatically maps it to the KEGG code (hsa, dme, sce, …).

col_entrez: column name containing Entrez GeneIDs.

foldchange_cols (optional): comma-separated list of fold-change columns to display in Pathview. Example: FoldChange,FoldChange 21 column → *.pathview.png≥2 columns → *.pathview.multi.png

min_genes_required (internal/script): minimum number of valid Entrez genes required to proceed.

Text editor image

Outputs

pathways (ResourceSet): Pathview PNG images typical names: hsa00480.pathview.png (single FC) hsa00480.pathview.multi.png (multiple FC)

kegg_enrichment (Table): enrichment results with at least: pathway, pathway_name, overlap, pathway_size, query_size, universe_size, pvalue, padj

How overlap / pvalue / padj are computed

Universe: all KEGG genes for the species (from link/pathway/<specie>).

Query: your mapped genes intersecting the universe.

For each pathway: K = pathway gene set size in universe n = query size k = overlap = number of query genes in the pathway

pvalue: hypergeometric right-tail (probability of ≥k overlaps by chance).

padj: Benjamini–Hochberg (FDR) correction across all tested pathways.

Why can a PNG be produced with no colored genes?

Even if a pathway is enriched, Pathview may show no coloring when:

gene IDs don’t match what Pathview expects (wrong id type / wrong formatting)

fold-change values become NA / non-numeric after parsing

the map contains mostly compound/KO nodes rather than gene nodes relevant to your IDs

The enrichment table remains valid; coloring depends on ID ↔ node compatibility in Pathview.

Technical bricks to reuse or customize

Have you developed a brick?

Share it to accelerate projects for the entire community.