Version
Publication date : Jul 10, 2025

Confidentiality : Public

UMAP Dimensionality Reduction

TASK
Typing name : TASK.gws_design_of_experiments.UMAPTask
Brick : gws_design_of_experiments

UMAP for dimensionality reduction and visualization

Performs UMAP (Uniform Manifold Approximation and Projection) dimensionality reduction.

This task reduces high-dimensional data to 2D or 3D for visualization and optionally performs clustering to identify groups in the data.

The task performs the following steps:

  1. Optionally scales the data using StandardScaler
  2. Applies UMAP dimensionality reduction
  3. Optionally performs K-Means clustering on the UMAP embedding
  4. Generates interactive visualizations
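The first three steps above can be sketched outside the platform roughly as follows. This is a minimal sketch, not the task's actual implementation: it assumes the `umap-learn` and `scikit-learn` packages, and the function name `run_umap_pipeline` and the `n_components` and `random_state` arguments are hypothetical additions for illustration. Plot generation (step 4) is omitted.

```python
def run_umap_pipeline(features, n_neighbors=15, min_dist=0.1, metric="euclidean",
                      scale_data=True, n_clusters=None, n_components=2,
                      random_state=42):
    """Hypothetical helper mirroring steps 1-3; returns (embedding, labels)."""
    # Heavy dependencies are imported here so the sketch can be loaded
    # even where they are not installed.
    import numpy as np
    import umap  # provided by the umap-learn package
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    X = np.asarray(features, dtype=float)

    # Step 1: optional standardization (zero mean, unit variance per column).
    if scale_data:
        X = StandardScaler().fit_transform(X)

    # Step 2: UMAP dimensionality reduction.
    reducer = umap.UMAP(
        n_neighbors=n_neighbors,
        min_dist=min_dist,
        metric=metric,
        n_components=n_components,
        random_state=random_state,  # assumption: fixed seed for repeatability
    )
    embedding = reducer.fit_transform(X)

    # Step 3: optional K-Means clustering on the embedding, not on the raw data.
    labels = None
    if n_clusters is not None:
        kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
        labels = kmeans.fit_predict(embedding)

    return embedding, labels
```

Calling the helper once with `n_components=2` and once with `n_components=3` would give coordinates comparable to the 2D and 3D tables the task outputs, under the assumptions stated above.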

Inputs:

  - data: Table containing the features to reduce

Outputs:

  - umap_plot: Interactive plot of UMAP embedding with optional clusters
  - umap_table: Table containing UMAP coordinates and cluster assignments

Configuration:

  - n_neighbors: Number of neighbors for UMAP (controls local vs global structure)
  - min_dist: Minimum distance between points in low-dimensional space
  - metric: Distance metric to use
  - scale_data: Whether to standardize features before UMAP
  - n_clusters: Number of clusters for K-Means (optional)
  - color_by: Column name to color points by (optional)
  - columns_to_exclude: Comma-separated list of column names to exclude from UMAP analysis

Input

Data
Input data for UMAP

Output

UMAP 2D Plot
Interactive UMAP 2D embedding visualization
UMAP 3D Plot
Interactive UMAP 3D embedding visualization
UMAP 2D Table
Table with UMAP 2D coordinates and cluster assignments
UMAP 3D Table
Table with UMAP 3D coordinates and cluster assignments

Configuration

n_neighbors

Optional

Controls how UMAP balances local vs global structure

Type : int
Default value : 15

min_dist

Optional

Minimum distance between points in the embedding

Type : float
Default value : 0.1

metric

Optional

Distance metric to use (euclidean, manhattan, cosine, etc.)

Type : string
Allowed values : euclidean, manhattan, chebyshev, minkowski, canberra, braycurtis, mahalanobis, wminkowski, seuclidean, cosine, correlation, haversine, hamming, jaccard, dice, russelrao, kulsinski, ll_dirichlet, hellinger, rogerstanimoto, sokalmichener, sokalsneath, yule
Default value : euclidean
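To illustrate how the choice of metric changes which points count as "close", the sketch below computes three of the allowed metrics by hand using their standard definitions. The function names and sample vectors are made up for illustration; the task itself presumably delegates these computations to its UMAP backend.

```python
import math

def euclidean(a, b):
    """Straight-line distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine(a, b):
    """Cosine distance: 1 minus the cosine of the angle between the vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

a = [1.0, 0.0]
b = [2.0, 0.0]  # same direction as a, twice the magnitude
c = [0.0, 1.0]  # same magnitude as a, orthogonal direction

for name, fn in [("euclidean", euclidean), ("manhattan", manhattan), ("cosine", cosine)]:
    print(name, "a-b:", fn(a, b), "a-c:", fn(a, c))
```

Under cosine distance `a` and `b` coincide because only direction matters, whereas euclidean and manhattan both treat them as distinct points. That is why cosine is often preferred when magnitude is not meaningful for the features.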

scale_data

Optional

Whether to scale the data before applying UMAP

Type : bool
Default value : true
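As a rough illustration of what standardization does, the sketch below rescales each column to zero mean and unit variance using the population standard deviation, which is the convention StandardScaler follows. The helper name and sample values are hypothetical, and it uses only the Python standard library.

```python
from statistics import fmean, pstdev

def standardize_columns(rows):
    """Return rows with each column rescaled to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # Constant columns have zero spread; divide by 1.0 instead to avoid
    # a division by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - mean) / std for value, mean, std in zip(row, means, stds)]
        for row in rows
    ]

# Two features on very different scales.
rows = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
scaled = standardize_columns(rows)
for row in scaled:
    print([round(value, 4) for value in row])
```

After scaling, both columns carry the same values, so neither dominates a distance computation merely because of its units. Without this step, the second feature would contribute far more to a euclidean distance than the first.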

n_clusters

Optional

Number of clusters for K-Means clustering (optional)

Type : int

color_by

Optional

Column name to color points by (optional)

Type : string

columns_to_exclude

Optional

List of column names to exclude from UMAP analysis

Type : list
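A minimal sketch of how such an exclusion list might be applied before the reduction step. Both helper names and the sample column names are hypothetical; the parser accepts either a list or a comma-separated string, since the task description mentions both forms.

```python
def parse_column_list(value):
    """Accept a list of names or a comma-separated string; return cleaned names."""
    if value is None:
        return []
    names = value.split(",") if isinstance(value, str) else list(value)
    return [name.strip() for name in names if name.strip()]

def select_feature_columns(all_columns, columns_to_exclude=None):
    """Keep the original column order, dropping any excluded names."""
    excluded = set(parse_column_list(columns_to_exclude))
    return [col for col in all_columns if col not in excluded]

columns = ["sample_id", "temperature", "ph", "yield"]
features = select_feature_columns(columns, "sample_id, yield")
print(features)
```

Identifier or target columns such as `sample_id` are typical candidates for exclusion, since they would otherwise be treated as features by the reduction.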

hover_data_columns

Optional

List of column names to display as metadata on hover

Type : list