gws_design_of_experiments

Getting Started

Optimal Design of Experiments for Bioprocessing and Beyond


1. Overview


The gws_design_of_experiments brick supports optimal design of experiments (DoE), with a focus on bioprocessing but applicable to other domains. It combines machine learning (ML) tools with optimization algorithms to help researchers and engineers optimize processes and uncover causal relationships in complex datasets.


2. Key Features


2.1 Machine Learning Tools


The brick provides a suite of ML tools for dimensionality reduction, feature extraction, and predictive modeling, including the PCA and UMAP projections used in the example workflow.



2.2 Advanced Tools


A. Causal Effect Analysis


  • Purpose: Uncover causal relationships between variables, going beyond correlation to infer cause and effect.
  • Methodologies:
    • Double Machine Learning (DML): Implemented using the EconML package.
      • LinearDML: For discrete treatments.
      • CausalForestDML: For continuous treatments.
  • Output: Estimates the Average Treatment Effect (ATE) for all specified treatment-target pairs.
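
The idea behind DML can be illustrated without EconML. The sketch below is a minimal, standard-library-only illustration of the cross-fitted "partialling-out" estimator, not the brick's implementation: it assumes a single confounder, a continuous treatment, and linear nuisance models, and all names (`ols_fit`, `dml_ate`, the synthetic data) are hypothetical. The true effect in the synthetic data is set to 2.0.

```python
import random


def ols_fit(x, y):
    """Fit y = a + b * x by ordinary least squares and return (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def dml_ate(x, t, y, n_folds=2):
    """Cross-fitted partialling-out estimate of the effect of t on y,
    adjusting for a single confounder x (assumption: linear nuisances)."""
    n = len(x)
    res_t, res_y = [0.0] * n, [0.0] * n
    for k in range(n_folds):
        test = [i for i in range(n) if i % n_folds == k]
        train = [i for i in range(n) if i % n_folds != k]
        # Nuisance models are fitted on the other folds only (cross-fitting).
        a_t, b_t = ols_fit([x[i] for i in train], [t[i] for i in train])
        a_y, b_y = ols_fit([x[i] for i in train], [y[i] for i in train])
        for i in test:
            res_t[i] = t[i] - (a_t + b_t * x[i])
            res_y[i] = y[i] - (a_y + b_y * x[i])
    # Final stage: regress outcome residuals on treatment residuals.
    num = sum(rt * ry for rt, ry in zip(res_t, res_y))
    den = sum(rt * rt for rt in res_t)
    return num / den


# Synthetic data (hypothetical): x confounds both treatment and outcome.
rng = random.Random(0)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
t = [0.8 * xi + rng.gauss(0, 1) for xi in x]
y = [2.0 * ti + 3.0 * xi + rng.gauss(0, 1) for ti, xi in zip(t, x)]

naive_slope = ols_fit(t, y)[1]  # ignores the confounder, so it is biased
ate = dml_ate(x, t, y)          # should land close to the true effect
print(f"naive slope: {naive_slope:.2f}, DML estimate: {ate:.2f}")
```

The naive regression of the outcome on the treatment absorbs the confounder's influence, whereas the residual-on-residual step removes it. In practice, EconML replaces the linear nuisance fits with arbitrary ML models.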

B. Genetic Algorithms


  • Purpose: Optimize complex, multi-objective problems (e.g., optimal medium computation).
  • Methodologies:
    • NSGA-II: Non-dominated Sorting Genetic Algorithm II for multi-objective optimization.
    • GA: Classic Genetic Algorithm for single-objective optimization.
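
The ranking step that gives NSGA-II its name can be sketched in a few lines. The example below is a simplified illustration under stated assumptions, not the brick's code: it peels off Pareto fronts one at a time rather than using the faster bookkeeping of the published algorithm, it omits crowding distance, and the candidate points and their two minimized objectives are hypothetical.

```python
def dominates(a, b):
    """Return True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimized)."""
    no_worse = all(ai <= bi for ai, bi in zip(a, b))
    strictly_better = any(ai < bi for ai, bi in zip(a, b))
    return no_worse and strictly_better


def non_dominated_sort(points):
    """Group point indices into successive Pareto fronts (best front first)."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        # A point belongs to the current front if nothing left dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i])
                       for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical candidates: (medium cost, negated yield), both minimized.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (4.0, 4.5)]
fronts = non_dominated_sort(candidates)
print(fronts)  # [[0, 1, 3], [2], [4]]
```

Candidates in the first front are the trade-off solutions that no other candidate beats on both objectives. A full NSGA-II run would use this ranking, together with a diversity measure, to select parents for the next generation.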

3. Complementarity: Causality vs. Multivariate Correlation Analysis


Synergy: Use correlation analysis to explore relationships, and causality analysis to validate hypotheses and guide interventions.


4. Practical Applications


  • Bioprocessing: Optimize medium composition, fermentation conditions, and yield.
  • Manufacturing: Improve process parameters for quality and efficiency.
  • Healthcare: Identify causal factors in clinical outcomes.

5. Getting Started


Prerequisites


  • Python 3.8+
  • Required packages: scikit-learn, umap-learn, econml, deap (for genetic algorithms).
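
A quick way to confirm these prerequisites is to check the interpreter version and look up each package by its import name. The snippet below is a small sketch using only the standard library; the helper name `check_prerequisites` is hypothetical, and the mapping assumes the usual import names (`sklearn` for scikit-learn, `umap` for umap-learn).

```python
import importlib.util
import sys

# pip package name -> import name (assumed standard import names).
REQUIRED = {
    "scikit-learn": "sklearn",
    "umap-learn": "umap",
    "econml": "econml",
    "deap": "deap",
}


def check_prerequisites():
    """Return (python_ok, {package: installed}) without importing anything."""
    python_ok = sys.version_info >= (3, 8)
    installed = {
        package: importlib.util.find_spec(module) is not None
        for package, module in REQUIRED.items()
    }
    return python_ok, installed


python_ok, installed = check_prerequisites()
print("Python 3.8+:", python_ok)
for package, present in installed.items():
    print(f"{package}: {'found' if present else 'missing'}")
```

Any package reported as missing can then be installed with the package manager used in your environment.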

Example Workflow


  1. Data Preparation: Load and preprocess your dataset.
  2. Exploratory Analysis: Use PCA/UMAP to visualize data structure.
  3. Causal Analysis: Apply DML to estimate treatment effects.
  4. Optimization: Use NSGA-II to find optimal process parameters.
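
To make the exploratory step concrete, the sketch below finds the first principal component of a tiny dataset by power iteration on its covariance matrix. It is a standard-library stand-in for illustration only: in practice a library PCA (for example from scikit-learn) would be used, and the function name and the five data rows are hypothetical.

```python
import math


def first_principal_component(rows, n_iter=100):
    """Return the unit-length direction of greatest variance in rows,
    found by power iteration on the sample covariance matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    v = [1.0] * d  # arbitrary starting direction
    for _ in range(n_iter):
        # Multiply by the covariance matrix, then renormalize.
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    return v


# Hypothetical measurements in which the second variable is roughly
# twice the first, so the main axis of variation is close to (1, 2).
rows = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.0)]
pc1 = first_principal_component(rows)
print([round(value, 3) for value in pc1])
```

Projecting each centered row onto this direction gives the one-dimensional coordinates that a PCA plot would display along its first axis.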
