How to visualise a KEGG pathway using Constellab?

Introduction

KEGG (Kyoto Encyclopedia of Genes and Genomes) is a valuable resource that provides detailed information on metabolic pathways, genes, proteins, diseases and many other aspects of molecular biology. KEGG data visualisation allows researchers to visually understand the complex interactions between different biological elements. Using interactive diagrams and maps, KEGG visualization provides a clear and informative perspective for exploring and interpreting biological data, making it a valuable tool for life science research.

Let show how to view genes mapped on KEGG pathways using Constellab.

In the second part of this story we will see some usefull live task to help you to manipulate the visualisation of KEGG pathway.

Task KEGG Visualisation

Prerequesites

You will need a File containing the genes to study with a header. If you have the fold change for the gene expression between two conditions for these genes, then add them in the following columns.

Steps to follow

Upload a File with the genes and the fold change, if applicable.

Link it to the task available in the Task

Fill parameters

Run your experiment

Get your coloured KEGG pathways!

Be aware that this task can take some time, especially the first time, as a virtual environment has to be installed, and also depending on the length of the genes provided, it can take more time.

Results

In output you will get a set of pathways with the genes mapped. If you have provided the fold change, each box of the pathway will be separated depending on the number of fold changes.

If you want to have other visualisation, we provide below some useful live task.

Visualise easily a KEGG pathway in Constellab using R Live Task

The following figure shows a live task executed in a Conda environment

Conda environment

name: .venv
channels:
  - conda-forge
dependencies:
  - python=3.8 
  - requests==2.28.2

Live code

# This is a snippet template for a Python live task.
import requests
import os
def get_kegg_pathway_id_by_name(pathway_name):
    try:
        # Make the request to search for pathway entries by name
        response = requests.get(f"http://rest.kegg.jp/find/pathway/{pathway_name}")
        if response.status_code == 200:
            # Parse the response to extract the pathway ID
            lines = response.text.strip().split("\n")
            if lines:
                pathway_entry = lines[0].split("\t")
                if len(pathway_entry) == 2:
                    pathway_id = pathway_entry[0].split(":")[1]
                    return pathway_id
        print(f"Pathway with name '{pathway_name}' not found.")
        return None
    except requests.exceptions.RequestException as e:
        print(f"Error fetching pathway data: {e}")
        return None
      
def download_kegg_pathway_image(pathway_id, save_path):
    try:
        # Make the request to fetch the image data
        response = requests.get(f"http://rest.kegg.jp/get/{pathway_id}/image")
        if response.status_code == 200:
            # Save the image to the specified path
            with open(save_path, "wb") as f:
                f.write(response.content)
            print(f"Pathway image saved to {save_path}")
        else:
            print(f"Error fetching pathway image. Status code: {response.status_code}")
    except requests.exceptions.RequestException as e:
        print(f"Error fetching pathway image: {e}")
if(pathway_id is None):
  pathway_id = get_kegg_pathway_id_by_name(pathway_name)
if pathway_id:
    print(f"The KEGG ID for '{pathway_name}' is: {pathway_id}")
else:
    print("Pathway not found or error occurred.")
save_path = pathway_id + "_pathway_image.png"
download_kegg_pathway_image(pathway_id, save_path)
target_paths = [save_path]

Parameters

pathway_id = None  # Replace with the pathway ID you want to explore (map00010 for example), None if not know
pathway_name ="Glycolysis"#if you don't know pathway ID, fill this with the name

Enter the id of the pathway you want to see or the name of the pathway and run the experiment: you'll get an image with the pathway entered!

Results

If you follow the previous steps you will obtain this image:

Nice, isn't it? Now you've got a dataset and you want to visualise the genes that are differentially expressed on a KEGG pathway? Let's get started!

Visualise a KEGG pathway coloured in Constellab using R Live Task

We're going to create a live task using R's pathview package.

If you want more information about the package pathview, here is the documentation.

Input

Place your dataset as input, with the columns :

- the identifier of the genes studied

- the expression value of the gene in the control condition

- the expression value of the gene in the wild type condition

Here, we will use data from this dataset. It's rat gene expression data are available for 2 conditions. If you want to reproduce this tutorial, download the "expression values across all genes" file and delete the text header and the "Gene ID" column.

Conda environment

We will use the package pathview so we need to add it. We also add 'org.Rn.eg.db' which is a genome-wide annotation for rat, so you will need to update this if you do not have rat data.

name: .venv
channels:
  - conda-forge
  - bioconda
dependencies:
  - r-base
  - bioconductor-pathview
  - bioconductor-org.Rn.eg.db

Live code

The code is as follows. It consists of calculating the log2 of the fold change of our dataset and giving the function pathview a dataframe with only these values and with gene id in the index.

# Load required packages
library(pathview)
# Read the source csv file with header, row names and comma separator
data <- read.csv(source_paths[1], header = TRUE, sep = ",")
#select data
data_pathview = data
# Compute fold_change
fold_change <- data_pathview[column_wild]/data_pathview[column_control]
# Compute log2 of the fold change
log2_fold_change <- log2(fold_change)
# Add a column to the dataframe with the value of log2(FC)
data_pathview <- cbind(data_pathview, log2_fold_change)
colnames(data_pathview)[ncol(data_pathview)] <- "log2_fold_change"
# Only keep the necessary columns 
data_pathview = data_pathview[,c(column_gene_id,"log2_fold_change")]
data_pathview <- na.omit(data_pathview)
# Set the gene id as index
row.names(data_pathview) <- data_pathview[,column_gene_id]
data_pathview[,column_gene_id] <- NULL
# Use pathview function 
pv.out <- pathview(gene.data = data_pathview, cpd.data = NULL, gene.idtype = type_gene_id, pathway.id = pathway_id, species = specie)
# Write the csv file and the image into the result folder
result_path <- "result.csv"
write.csv(data_pathview, file = result_path, col.names = TRUE,row.names = TRUE)
image_path <- paste0(specie,pathway_id,".pathview.png")
target_paths <- c(result_path, image_path)

Parameters

specie = "rno"
pathway_id = "00010"
type_gene_id = "symbol"
column_gene_id = "Gene_Name"
column_control = "native"
column_wild = "post_ex_vivo_lung_perfusion"

Indicate the species studied, then the identifier of the pathway you wish to see. Also indicate the identifier type of your genes (by default it's "entrez" for entrez gene id) and finally the names of the columns of your dataframe.

In this case, there are rat data so we put "rno", we want to see the glycolysis pathway ("00010"). Finally, genes are encoded by their symbol.

Results

With this udpated code, you can obtain a image like this:

The EC numbers colored in red indicates that the log2 of the fold change between the two conditions is near 1 so the gene is over-expressed in the wild-type condition compared with the control condition.

Conversely, if it is coloured green, this means that the gene is underexpressed in the wild-type condition compared with the control condition.

Conclusion

So this tutorial has allowed you to see how to visualise the KEGG pathway in Constellab, either by using a Task either by using the Live Rask to make your own pipeline. These visualisations will certainly help you to analyse your results after a differential gene expression analysis, for example.

How to visualise a KEGG pathway using Constellab?

Introduction

Task KEGG Visualisation

Prerequesites

Steps to follow

Results

Visualise easily a KEGG pathway in Constellab using R Live Task

Conda environment

Live code

Parameters

Results

Visualise a KEGG pathway coloured in Constellab using R Live Task

Input

Conda environment

Live code

Parameters

Results

Conclusion

Comments - 0