Optimization

Typing name : TASK.gws_design_of_experiments.Optimization

Brick : gws_design_of_experiments

Optimization with virtual environment

Optimization task using machine learning models in an isolated virtual environment.

This task performs optimization on experimental data by:

Training multiple machine learning models (Random Forest, XGBoost, CatBoost)
Selecting the best performing model based on cross-validation R² scores
Using evolutionary algorithms (NSGA-II or a genetic algorithm, GA) to search for optimal solutions
Generating comprehensive optimization results and analysis files
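The model-selection step above can be illustrated with a minimal sketch. It is not the task's actual code: the per-fold scores are made up for illustration, and the helper names `r2_score` and `select_best_model` are hypothetical. It only shows how R² is computed and how the model with the highest mean cross-validation R² would be picked.

```python
from statistics import mean


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    avg = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - avg) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def select_best_model(cv_scores):
    """Return the name of the model with the highest mean cross-validation R²."""
    return max(cv_scores, key=lambda name: mean(cv_scores[name]))


# Hypothetical per-fold R² scores, invented for illustration only.
cv_scores = {
    "RandomForest": [0.81, 0.78, 0.84],
    "XGBoost": [0.86, 0.83, 0.88],
    "CatBoost": [0.85, 0.80, 0.82],
}

best = select_best_model(cv_scores)
print(best)
```

With these invented scores, XGBoost has the highest mean and is selected; real scores depend entirely on the data.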

The optimization process considers:

Target variables: Variables to maximize during optimization
Constraints: Manual bounds on input features
Thresholds: Minimum acceptable values for target variables
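To make the interplay of targets and thresholds concrete, here is a small sketch of the idea behind multi-objective selection. It assumes thresholds act as hard minimums and that all targets are maximized; whether the task applies thresholds exactly this way is an assumption. The helpers `meets_thresholds`, `dominates`, and `pareto_front` are hypothetical names, and the candidate values are invented. Non-dominated filtering of this kind is the core concept behind NSGA-II, although the real algorithm adds ranking, crowding distance, and evolutionary operators.

```python
def meets_thresholds(solution, thresholds):
    """True if every target reaches its minimum acceptable value."""
    return all(solution[t] >= minimum for t, minimum in thresholds.items())


def dominates(a, b, targets):
    """a dominates b if it is at least as good on all targets and better on one."""
    return (all(a[t] >= b[t] for t in targets)
            and any(a[t] > b[t] for t in targets))


def pareto_front(solutions, targets):
    """Keep only the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(other, s, targets) for other in solutions)]


# Hypothetical thresholds and predicted target values, for illustration only.
thresholds = {"yield": 80, "purity": 95}
candidates = [
    {"yield": 85, "purity": 96},
    {"yield": 90, "purity": 95},
    {"yield": 84, "purity": 95},  # dominated by the first candidate
    {"yield": 92, "purity": 90},  # fails the purity threshold
]

feasible = [c for c in candidates if meets_thresholds(c, thresholds)]
front = pareto_front(feasible, list(thresholds))
print(front)
```

The first two candidates survive because neither is better than the other on both targets at once, which is why a multi-objective run typically returns several solutions rather than one.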

Generated Output Files:

generalized_solutions.csv: All optimization solutions found
best_generalized_solution.csv: Best solution based on CV and target values
actual_vs_predicted.csv: Model validation data (observed vs predicted)
feature_importance_matrix.csv: Feature importance for each target variable
constraints_used_in_optimization.csv: Bounds applied to each feature
optimization_progress.csv: Convergence history during optimization
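After a run, it can be useful to confirm that the results folder contains every file listed above. The sketch below uses only the file names from that list; the helper `missing_outputs` is a hypothetical name, and the temporary folder merely stands in for a real results folder.

```python
from pathlib import Path
import tempfile

EXPECTED_FILES = [
    "generalized_solutions.csv",
    "best_generalized_solution.csv",
    "actual_vs_predicted.csv",
    "feature_importance_matrix.csv",
    "constraints_used_in_optimization.csv",
    "optimization_progress.csv",
]


def missing_outputs(results_folder):
    """Return the expected file names that are absent from the folder."""
    folder = Path(results_folder)
    return [name for name in EXPECTED_FILES if not (folder / name).is_file()]


# Demonstration on a temporary folder that holds only the first four files.
with tempfile.TemporaryDirectory() as tmp:
    for name in EXPECTED_FILES[:4]:
        (Path(tmp) / name).write_text("")
    missing = missing_outputs(tmp)

print(missing)
```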

Inputs:

data (Table): Experimental data containing features and target variables
targets_thresholds (JSONDict): Minimum threshold values for each target variable
manual_constraints (JSONDict): Custom bounds for input features, in the format {"feature_name": {"lower_bound": value, "upper_bound": value}}

Outputs:

results_folder (Folder): Directory containing all optimization results and analysis files

Example: For a chemical process optimization, you might want to maximize yield and purity while keeping temperature below 100°C and pressure above 2 bar, with minimum yield of 80% and minimum purity of 95%.
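For the chemical process example, the manual constraints could be written as follows. This is a sketch under stated assumptions: the feature names `temperature` and `pressure`, the lower temperature bound of 20, and the upper pressure bound of 10 are all invented to complete the documented `lower_bound`/`upper_bound` format. The helpers `validate_constraints` and `within_bounds` are hypothetical and treat bounds as inclusive, which may differ from the task's behavior.

```python
import json

# Bounds in the documented format; values other than 100 and 2 are assumptions.
manual_constraints = {
    "temperature": {"lower_bound": 20, "upper_bound": 100},
    "pressure": {"lower_bound": 2, "upper_bound": 10},
}


def validate_constraints(constraints):
    """Check that each feature has both bounds and that they are ordered."""
    for feature, bounds in constraints.items():
        if set(bounds) != {"lower_bound", "upper_bound"}:
            raise ValueError(f"{feature}: expected lower_bound and upper_bound")
        if bounds["lower_bound"] > bounds["upper_bound"]:
            raise ValueError(f"{feature}: lower_bound exceeds upper_bound")
    return True


def within_bounds(point, constraints):
    """True if every constrained feature of the point lies inside its bounds."""
    return all(b["lower_bound"] <= point[f] <= b["upper_bound"]
               for f, b in constraints.items())


validate_constraints(manual_constraints)
payload = json.dumps(manual_constraints, indent=2)  # JSON text for the input
print(payload)
```

A candidate such as `{"temperature": 80, "pressure": 3}` passes `within_bounds`, while one with a temperature of 120 does not.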

Input

Data

Table

Manual constraints

Manual constraints for optimization

JSON Dict

Output

Results folder

The folder containing the results

Folder

Configuration

population_size

Optional

Population size for the optimization algorithm

Type : int

Default value : 500

iterations

Optional

Type : int

Default value : 100

columns_to_exclude

Optional

List of column names to exclude from optimization analysis

Type : list

targets_thresholds

Targets to optimize and their objective values

Type : List

Maximum occurrences number : -1

targets

Target to optimize

Type : string

thresholds

Objective value for the target

Type : int
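Put together, the configuration parameters above might be represented as the following dictionary. This is only an illustrative sketch: the exact serialization expected by the platform is an assumption, the excluded column `batch_id` and the target names are invented, and `thresholds_by_target` is a hypothetical helper. The parameter names, types, and default values come from the listing above.

```python
config = {
    "population_size": 500,              # default value
    "iterations": 100,                   # default value
    "columns_to_exclude": ["batch_id"],  # hypothetical column name
    "targets_thresholds": [              # one entry per target to optimize
        {"targets": "yield", "thresholds": 80},
        {"targets": "purity", "thresholds": 95},
    ],
}


def thresholds_by_target(cfg):
    """Flatten the targets_thresholds list into a {target: threshold} mapping."""
    return {entry["targets"]: entry["thresholds"]
            for entry in cfg["targets_thresholds"]}


print(thresholds_by_target(config))
```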
