Publication date : Jul 10, 2025
Confidentiality : Public
UMAP Dimensionality Reduction
TASK
Typing name : TASK.gws_design_of_experiments.UMAPTask
Brick : gws_design_of_experiments
UMAP for dimensionality reduction and visualization
Performs UMAP (Uniform Manifold Approximation and Projection) dimensionality reduction.
This task reduces high-dimensional data to 2D or 3D for visualization and
optionally performs clustering to identify groups in the data.
The task performs the following steps:
- Optionally scales the data using StandardScaler
- Applies UMAP dimensionality reduction
- Optionally performs K-Means clustering on the UMAP embedding
- Generates interactive visualizations
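The steps above can be sketched roughly as follows. This is a minimal illustration, not the task's actual implementation: it assumes the umap-learn and scikit-learn packages are installed, and the function name `run_umap_sketch` as well as the `n_components` and `random_state` arguments are choices made for this example only.

```python
def run_umap_sketch(X, n_neighbors=15, min_dist=0.1, metric="euclidean",
                    scale_data=True, n_clusters=None, n_components=2,
                    random_state=42):
    """Hypothetical sketch: optional scaling -> UMAP -> optional K-Means."""
    # Imports are deferred so the heavy dependencies load only when called.
    import umap
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    # Step 1: optionally standardize each feature to zero mean, unit variance.
    if scale_data:
        X = StandardScaler().fit_transform(X)

    # Step 2: project the data to n_components dimensions (2 or 3).
    reducer = umap.UMAP(
        n_neighbors=n_neighbors,
        min_dist=min_dist,
        metric=metric,
        n_components=n_components,
        random_state=random_state,
    )
    embedding = reducer.fit_transform(X)

    # Step 3: optionally cluster the embedded points with K-Means.
    labels = None
    if n_clusters is not None:
        kmeans = KMeans(n_clusters=n_clusters, n_init=10,
                        random_state=random_state)
        labels = kmeans.fit_predict(embedding)

    return embedding, labels
```

Called on a numeric array of shape (n_samples, n_features), this would return the embedding and, when `n_clusters` is set, one cluster label per row. The plotting step is left out of the sketch.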
Inputs:
- data: Table containing the features to reduce
Outputs:
- umap_plot: Interactive plot of UMAP embedding with optional clusters
- umap_table: Table containing UMAP coordinates and cluster assignments
Configuration:
- n_neighbors: Number of neighbors for UMAP (controls local vs global structure)
- min_dist: Minimum distance between points in low-dimensional space
- metric: Distance metric to use
- scale_data: Whether to standardize features before UMAP
- n_clusters: Number of clusters for K-Means (optional)
- color_by: Column name to color points by (optional)
- columns_to_exclude: Comma-separated list of column names to exclude from UMAP analysis
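As an illustration of how `columns_to_exclude` might be applied before the reduction, here is a small sketch. The helper names `parse_excluded_columns` and `select_feature_columns` and the sample column names are hypothetical; the sketch accepts either a comma-separated string or a list, since the page describes the parameter both ways.

```python
def parse_excluded_columns(value):
    """Normalize a comma-separated string or a list into a list of names."""
    if value is None:
        return []
    if isinstance(value, str):
        value = value.split(",")
    # Strip whitespace and drop empty entries such as trailing commas.
    return [name.strip() for name in value if name and name.strip()]


def select_feature_columns(all_columns, excluded):
    """Keep the original column order, minus the excluded names."""
    excluded_set = set(excluded)
    return [col for col in all_columns if col not in excluded_set]


# Hypothetical table header: two metadata columns and two features.
columns = ["sample_id", "batch", "feature_a", "feature_b"]
excluded = parse_excluded_columns("sample_id, batch")
features = select_feature_columns(columns, excluded)
print(features)  # ['feature_a', 'feature_b']
```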
Output
UMAP 2D Plot
Interactive UMAP 2D embedding visualization
UMAP 3D Plot
Interactive UMAP 3D embedding visualization
UMAP 2D Table
Table with UMAP 2D coordinates and cluster assignments
UMAP 3D Table
Table with UMAP 3D coordinates and cluster assignments
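To make the shape of these tables concrete, the sketch below assembles rows from an embedding and optional cluster labels. The column names `UMAP1`, `UMAP2`, `UMAP3` and `cluster`, and the helper `build_umap_rows`, are assumptions for illustration; the actual column names produced by the task are not stated here.

```python
def build_umap_rows(embedding, labels=None):
    """Turn embedding coordinates into table rows (a list of dicts)."""
    rows = []
    for i, point in enumerate(embedding):
        # Column names UMAP1..UMAPn are an assumption for this example.
        row = {f"UMAP{d + 1}": float(coord) for d, coord in enumerate(point)}
        if labels is not None:
            row["cluster"] = int(labels[i])
        rows.append(row)
    return rows


# Two points of a 2D embedding with cluster labels, one 3D point without.
rows_2d = build_umap_rows([(0.5, -1.2), (3.1, 0.4)], labels=[0, 1])
rows_3d = build_umap_rows([(0.5, -1.2, 2.0)])
```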
Configuration
n_neighbors
Controls how UMAP balances local vs global structure
Type : int
Default value : 15
min_dist
Minimum distance between points in the embedding
Type : float
Default value : 0.1
metric
Distance metric to use (euclidean, manhattan, cosine, etc.)
Type : string
Allowed values : euclidean, manhattan, chebyshev, minkowski, canberra, braycurtis, mahalanobis, wminkowski, seuclidean, cosine, correlation, haversine, hamming, jaccard, dice, russelrao, kulsinski, ll_dirichlet, hellinger, rogerstanimoto, sokalmichener, sokalsneath, yule
Default value : euclidean
scale_data
Whether to scale the data before applying UMAP
Type : bool
Default value : true
n_clusters
Number of clusters for K-Means clustering (optional)
Type : int
color_by
Column name to color points by (optional)
Type : string
columns_to_exclude
Optional
List of column names to exclude from UMAP analysis
Type : list
hover_data_columns
Optional
List of column names to display as metadata on hover
Type : list
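The configuration can be summarized as a small validated structure. The sketch below uses the defaults and allowed metric values listed on this page; the class name `UMAPConfigSketch` and the numeric bounds checked in `__post_init__` are assumptions for illustration, not the task's documented validation rules.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Allowed metric names as listed in the configuration.
ALLOWED_METRICS = {
    "euclidean", "manhattan", "chebyshev", "minkowski", "canberra",
    "braycurtis", "mahalanobis", "wminkowski", "seuclidean", "cosine",
    "correlation", "haversine", "hamming", "jaccard", "dice", "russelrao",
    "kulsinski", "ll_dirichlet", "hellinger", "rogerstanimoto",
    "sokalmichener", "sokalsneath", "yule",
}


@dataclass
class UMAPConfigSketch:
    """Hypothetical container mirroring the documented parameters."""
    n_neighbors: int = 15
    min_dist: float = 0.1
    metric: str = "euclidean"
    scale_data: bool = True
    n_clusters: Optional[int] = None
    color_by: Optional[str] = None
    columns_to_exclude: List[str] = field(default_factory=list)
    hover_data_columns: List[str] = field(default_factory=list)

    def __post_init__(self):
        if self.metric not in ALLOWED_METRICS:
            raise ValueError(f"Unsupported metric: {self.metric!r}")
        # The bounds below are assumptions, not documented constraints.
        if self.n_neighbors < 2:
            raise ValueError("n_neighbors should be at least 2")
        if self.min_dist < 0:
            raise ValueError("min_dist should not be negative")
        if self.n_clusters is not None and self.n_clusters < 1:
            raise ValueError("n_clusters should be a positive integer")


default_config = UMAPConfigSketch()
cosine_config = UMAPConfigSketch(metric="cosine", n_clusters=4)
```

Validating the metric name up front keeps a misspelled value from surfacing later as an error inside the reduction step.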