
# Spearman correlation

Typing name :  TASK.gws_stats.SpearmanCorrelation Brick :  gws_stats

Compute Spearman correlation coefficients between two groups with p-value

Compute the Spearman correlation coefficient for pairwise samples, with its p-value.

The Spearman rank-order correlation coefficient is a nonparametric measure of the monotonicity of the relationship between two datasets. Unlike the Pearson correlation, the Spearman correlation does not assume that both datasets are normally distributed. The p-value returned is two-sided. Like other correlation coefficients, this one varies between -1 and +1, with 0 implying no correlation. Correlations of -1 or +1 imply an exact monotonic relationship (not necessarily a linear one). Positive correlations imply that as x increases, so does y. Negative correlations imply that as x increases, y decreases.
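To make the rank-based definition concrete, here is a minimal pure-Python sketch that computes the coefficient as the Pearson correlation of the ranks. The helper names (`average_ranks`, `pearson`, `spearman_rho`) are illustrative and not part of the brick; the p-value is not computed, and constant columns (zero variance) are not handled.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# A monotonic but non-linear relationship: y = x ** 3.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho_monotonic = spearman_rho(x, y)  # ranks agree perfectly
r_linear = pearson(x, y)            # raw values are not on a straight line
```

On this data the Spearman coefficient reaches its maximum while the Pearson coefficient stays below 1, which is why ±1 indicates an exact monotonic relationship rather than a linear one.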

• Input: a table containing the sample measurements, with the names of the samples.
• Output: a table listing the correlation coefficient and its associated p-value for each pairwise comparison.
• Config Parameters:
  • `preselected_column_names`: List of columns to pre-select for pairwise comparisons. By default, the first 500 columns are selected (see Configuration).
  • `reference_column`: If given, this reference column is compared against all the other columns.
  • `row_tag_key`: If given, this parameter is used for group-wise comparisons along row tags (see Example 2 below). This parameter is ignored if a `reference_column` is given.
  • `adjust_pvalue`:
    • `method`: The correction method for p-value adjustment in multiple testing.
    • `alpha`: The family-wise error rate (FWER). Default is 0.05.
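As an illustration of what p-value adjustment does, the following sketch implements two of the standard corrections, Bonferroni and Holm, in plain Python. The function names and the example p-values are hypothetical; whether the task computes the adjustment this way internally is not stated here.

```python
def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def holm(pvalues):
    """Holm step-down adjustment, returned in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is scaled by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * pvalues[idx])
        # Enforce monotonicity: adjusted values never decrease along the order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from three pairwise comparisons.
pvalues = [0.01, 0.04, 0.03]
alpha = 0.05

bonferroni_adjusted = bonferroni(pvalues)
holm_adjusted = holm(pvalues)

# A comparison is declared significant when its adjusted p-value is <= alpha.
reject_bonferroni = [p <= alpha for p in bonferroni_adjusted]
reject_holm = [p <= alpha for p in holm_adjusted]
```

Holm is never more conservative than Bonferroni: each Holm-adjusted value is less than or equal to the corresponding Bonferroni-adjusted value.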

# Example 1: Direct column comparisons

Let's say you have the following table.

| A | B | C |
|---|---|---|
| 1 | 5 | 3 |
| 2 | 6 | 8 |
| 3 | 7 | 5 |
| 4 | 8 | 4 |

This task performs pairwise comparisons between the columns of the table. By default, only the first `500` columns are pre-selected, so larger tables are not compared in full.

• `A` will be compared with `B` and with `C`, respectively
• `B` will be compared with `C`
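The pairing above can be sketched with `itertools.combinations`. Since the example columns contain no tied values, the sketch uses the classic shortcut formula 1 - 6·Σd² / (n(n² - 1)), which is valid only when there are no ties. The function names are illustrative, and the p-values are omitted.

```python
from itertools import combinations

# The example table, stored column by column.
table = {
    "A": [1, 2, 3, 4],
    "B": [5, 6, 7, 8],
    "C": [3, 8, 5, 4],
}


def ranks_no_ties(values):
    """Return 1-based ranks, assuming all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def spearman_no_ties(x, y):
    """Spearman coefficient via the rank-difference formula (no ties)."""
    n = len(x)
    d_squared = sum(
        (rx - ry) ** 2 for rx, ry in zip(ranks_no_ties(x), ranks_no_ties(y))
    )
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Every unordered pair of columns is compared exactly once.
pairs = list(combinations(table, 2))
results = {(a, b): spearman_no_ties(table[a], table[b]) for a, b in pairs}
```

Here `A` and `B` increase together on every row, so their coefficient is 1, while `C` is only weakly related to either (about 0.2 in both cases).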

To compare a single column against all the others, set the `reference_column` parameter (a.k.a. Reference column). Suppose that `B` is used as the reference column; only the following comparisons will be done:

• `B` versus `A`
• `B` versus `C`
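A minimal sketch of how the set of comparisons shrinks when a reference column is given. The helper `reference_pairs` is hypothetical and only enumerates the pairs; it does not compute any statistic.

```python
def reference_pairs(columns, reference_column):
    """Pair the reference column with every other column, in table order."""
    if reference_column not in columns:
        raise ValueError(f"Unknown reference column: {reference_column!r}")
    return [(reference_column, col) for col in columns if col != reference_column]


pairs = reference_pairs(["A", "B", "C"], "B")
```

With three columns this gives two comparisons instead of three; with many columns the saving is much larger, since the number of comparisons grows linearly rather than quadratically.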

It is also possible to perform comparison on a well-defined subset of the table by pre-selecting the columns of interest. Parameter `preselected_column_names` (a.k.a. Selected columns names) allows pre-selecting a subset of columns for analysis.
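The pre-selection step can be pictured as follows. This is a sketch under assumptions: each selector is a dictionary with a `name` and an optional `is_regex` flag (mirroring the configuration fields), a regular expression must match the whole column name, and the 500-column default applies only when no selector is given. The helper `preselect_columns` is hypothetical and the task's actual matching rules may differ.

```python
import re


def preselect_columns(columns, selectors=None, max_columns=500):
    """Return the columns kept for pairwise comparison, in table order."""
    if not selectors:
        # Assumed default: keep the first `max_columns` columns.
        return list(columns[:max_columns])
    selected = []
    for col in columns:
        for selector in selectors:
            if selector.get("is_regex", False):
                # Assumption: the pattern must match the full column name.
                matched = re.fullmatch(selector["name"], col) is not None
            else:
                matched = col == selector["name"]
            if matched:
                selected.append(col)
                break
    return selected


columns = ["A", "B", "C"]
by_name = preselect_columns(columns, [{"name": "C"}])
by_pattern = preselect_columns(columns, [{"name": "[AB]", "is_regex": True}])
default = preselect_columns(columns)
```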

# Example 2: Advanced comparisons along row tags using `row_tag_key` parameter

In general, the table rows represent real-world observations (e.g. measured samples) and the columns correspond to descriptors (a.k.a. features or variables). These rows (samples) may therefore carry metadata given by row tags, as follows:

| row_tags | A | B | C |
|---|---|---|---|
| Gender : M, Age : 10 | 1 | 5 | 3 |
| Gender : F, Age : 10 | 2 | 6 | 8 |
| Gender : F, Age : 10 | 3 | 7 | 5 |
| Gender : M, Age : 20 | 4 | 8 | 4 |

Note that the column `row_tags` does not actually exist in the table; it is shown here only to display the tags of each row. Here, the first row corresponds to a 10-year-old male individual. In this case, we may be interested in comparing each column along the row metadata tags only. For instance, to compare `Males (M)` versus `Females (F)` within each column separately, you can use the advanced parameter `row_tag_key`=`Gender`.
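The grouping implied by `row_tag_key` can be sketched as below. The data structure (a list of `(tags, values)` tuples) and the helper `split_by_tag` are assumptions made for illustration; the sketch only shows how each column is split into one series per tag value, not how the task then pairs or tests those series.

```python
# Each row is a (tags, values) pair mirroring the example table.
rows = [
    ({"Gender": "M", "Age": "10"}, {"A": 1, "B": 5, "C": 3}),
    ({"Gender": "F", "Age": "10"}, {"A": 2, "B": 6, "C": 8}),
    ({"Gender": "F", "Age": "10"}, {"A": 3, "B": 7, "C": 5}),
    ({"Gender": "M", "Age": "20"}, {"A": 4, "B": 8, "C": 4}),
]


def split_by_tag(rows, tag_key):
    """Group each column's values by the value of the given row tag."""
    groups = {}
    for tags, values in rows:
        tag_value = tags[tag_key]
        for column, value in values.items():
            groups.setdefault(column, {}).setdefault(tag_value, []).append(value)
    return groups


by_gender = split_by_tag(rows, "Gender")
# by_gender["A"] holds one list of values for "M" and one for "F".
```

With `row_tag_key`=`Gender`, column `A` yields the series `[1, 4]` for `M` and `[2, 3]` for `F`, and the same split is applied to `B` and `C`.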

For more details on the Spearman correlation coefficient, see https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.spearmanr.html

### Input

Table
The input table

### Output

Result
The output result

### Configuration

preselected_column_names

Optional

The names of the columns to pre-select for comparison. By default, the first 500 columns are used.

Type : `List` Maximum occurrences number : `-1`

name

Optional

The name of the column(s) to pre-select

Type : `string`

is_regex

Optional

Set True if it is a text pattern (regular expression), False otherwise

Type : `bool`

reference_column

Optional

The column used as reference for pairwise comparison. Only this column is compared with the others.

Type : `string`

row_tag_key

The key of the row tag (representing the group axis) along which to compare each column. This parameter is not used if a `reference_column` is given.

Type : `string`

adjust_pvalue

Type : `List` Maximum occurrences number : `1`

method

Type : `string` Allowed values : `bonferroni`  `fdr_bh`  `fdr_by`  `fdr_tsbh`  `fdr_tsbky`  `sidak`  `holm-sidak`  `holm`  `simes-hochberg`  `hommel`  Default value : `bonferroni`

alpha

Type : `float` Default value : `0.05`