Need to reduce your Table's dimension? Here is the solution!

Chloé Ladreyt
Feb 12, 2024

Co-authors : 
Maëva Beugin

Introduction


If you need to reduce the number of columns in your Table, we have the solution!


Melt is a function that reshapes a DataFrame from a wide format to a long format. Here is an example.




As you can see, after melting your DataFrame you obtain 3 columns:

  • the unique identifier (ID) column
  • the "variable" column, whose values are the former column names
  • the "value" column, which holds all the values from the former columns
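
To make the three resulting columns concrete, here is a minimal sketch using pandas. It assumes that the Melt task behaves like the pandas `melt` function, which the shared parameter names suggest; the table and its column names ("id", "height", "width") are made up for illustration.

```python
import pandas as pd

# Hypothetical wide table: one row per sample, one column per measurement.
wide = pd.DataFrame({
    "id": ["a", "b"],
    "height": [1.0, 2.0],
    "width": [3.0, 4.0],
})

# Assumption: the Melt task works like pandas.melt.
# "id" is kept as the identifier; every other column is stacked.
long = pd.melt(wide, id_vars="id")

# The result has the columns "id", "variable" and "value",
# with one row per (sample, former column) pair.
print(long)
```

The two measurement columns of the wide table become two rows per sample in the long table, so the table gets narrower and longer.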

This task is useful for analyses or visualisations that require the long format.


For example, melt can be used for Principal Component Analysis (PCA). A PCA aims to reduce the number of variables in a dataset while preserving as much information as possible. It may be easier to plot the PCA results if the data are in long format rather than wide format.



Parameters


The Melt task available on Constellab is quite simple. It has the following optional parameters:


• id_vars: column(s) to use as identifiers; these columns are not melted
• value_vars: column(s) to melt; you can melt as many as you want. If empty, all the other columns are melted
• var_name: name to use for the "variable" column
• value_name: name to use for the "value" column
• col_level: if the columns are a MultiIndex, use this level to melt
• ignore_index: if True, the original index is ignored. If False, the original index is retained and index labels are repeated as necessary

For more information, click here
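
The parameters above can be tried out in a short pandas sketch. It assumes the task passes them to `pandas.melt` unchanged, which is not confirmed here; the table, the column names and the index labels are hypothetical.

```python
import pandas as pd

# Hypothetical wide table with a custom index.
wide = pd.DataFrame(
    {"sample": ["s1", "s2"], "a": [1, 2], "b": [3, 4], "c": [5, 6]},
    index=["row1", "row2"],
)

# Assumption: the Melt task forwards these parameters to pandas.melt.
melted = pd.melt(
    wide,
    id_vars=["sample"],        # kept as identifier, not melted
    value_vars=["a", "b"],     # only these columns are melted; "c" is dropped
    var_name="measurement",    # replaces the default name "variable"
    value_name="reading",      # replaces the default name "value"
    ignore_index=False,        # keep the original index, repeated as needed
)

print(melted)
```

With `ignore_index=False`, the labels "row1" and "row2" each appear once per melted column. With the default `ignore_index=True`, the result gets a fresh integer index instead.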


Prerequisites


To perform this tutorial, you need:


• Access to Constellab and a digital lab
• The brick "gws_core" (version > 0.5.16)
• A dataset in wide format. Here we will use the Iris dataset (available in the resources of this story).


Steps to follow


1. Upload your dataset and convert it to a Table
2. Link it to the Melt task
3. Check the parameters; in this tutorial, the only parameter set was id_vars, with the value "variety"
4. Save your parameters
5. Run your experiment
6. Get your transformed Table!



Results


As an example, we will use the Iris dataset. It contains 50 samples from each of 3 species of iris, with the sepal length and width and the petal length and width of each sample.



After running the experiment you will obtain this Table:



You can see that the dataset is now in long format, ready for further analysis!
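
For readers who want to check the shape of the result outside Constellab, the same transformation can be sketched with pandas. This assumes the Melt task is equivalent to `pandas.melt` with `id_vars="variety"`. The three rows below are illustrative stand-ins for the Iris file, and the measurement column names are an assumption about how that file is labelled.

```python
import pandas as pd

# Illustrative stand-in for the Iris table: one row per species.
# The measurement column names are assumed, not taken from the tutorial file.
iris_sample = pd.DataFrame({
    "sepal.length": [5.1, 7.0, 6.3],
    "sepal.width": [3.5, 3.2, 3.3],
    "petal.length": [1.4, 4.7, 6.0],
    "petal.width": [0.2, 1.4, 2.5],
    "variety": ["Setosa", "Versicolor", "Virginica"],
})

# Same setting as in the tutorial: "variety" is the only identifier column.
iris_long = pd.melt(iris_sample, id_vars="variety")

# Each input row contributes one output row per melted column.
n_measurements = iris_sample.shape[1] - 1
assert len(iris_long) == len(iris_sample) * n_measurements

print(iris_long)
```

By the same arithmetic, melting the full dataset of 150 samples with 4 measurement columns gives 600 rows and 3 columns.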