B_HIT.sVDJ.tl.compute_correlation

B_HIT.sVDJ.tl.compute_correlation(cloneRich, groupby_cols, corr1, corr2, save=False, path=None, compute_corr_matrix=False)

Compute the Pearson correlation between two variables within each group defined by the given columns.

Parameters:

cloneRich (pd.DataFrame) – The input DataFrame containing the data.
groupby_cols (list of str) – List of columns to group by (e.g., ['Cregion_simple', 'tissue']).
corr1 (str) – The name of the first column to compute correlation for (e.g., 'BaggArea').
corr2 (str) – The name of the second column to compute correlation for (e.g., 'gini_index').
save (bool, optional) – Whether to save the output to CSV. Default is False.
path (str, optional) – The file path to save the output CSV. Default is None.
compute_corr_matrix (bool, optional) – Whether to compute and return the correlation matrix and p-value matrix. Default is False.

Returns:

pd.DataFrame: A DataFrame containing the correlation and p-value for each group.
pd.DataFrame, optional: A correlation matrix (if compute_corr_matrix is True).
pd.DataFrame, optional: A p-value matrix (if compute_corr_matrix is True).
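
Example:

The sketch below illustrates the kind of per-group result described above. It is not the B_HIT implementation: grouped_pearson is a hypothetical stand-in built on pandas and scipy.stats.pearsonr, the output column names 'correlation' and 'p_value' are assumed, the three-observation minimum per group is an arbitrary choice, and the input values are toy data.

```python
import pandas as pd
from scipy.stats import pearsonr


def grouped_pearson(df, groupby_cols, corr1, corr2):
    """Hypothetical stand-in: Pearson r and p-value of corr1 vs corr2 per group."""
    rows = []
    for keys, group in df.groupby(groupby_cols):
        # Older pandas versions yield a scalar key for a single grouping column.
        if not isinstance(keys, tuple):
            keys = (keys,)
        # Keep only rows where both variables are present.
        pair = group[[corr1, corr2]].dropna()
        # Assumption: skip groups too small for a meaningful test.
        if len(pair) < 3:
            continue
        r, p = pearsonr(pair[corr1], pair[corr2])
        row = dict(zip(groupby_cols, keys))
        row["correlation"] = r
        row["p_value"] = p
        rows.append(row)
    return pd.DataFrame(rows, columns=list(groupby_cols) + ["correlation", "p_value"])


# Toy data only; the column names follow the examples in the parameter list.
cloneRich = pd.DataFrame({
    "Cregion_simple": ["IGHG"] * 5 + ["IGHM"] * 5,
    "tissue": ["spleen"] * 5 + ["tumor"] * 5,
    "BaggArea": [1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    "gini_index": [0.10, 0.22, 0.29, 0.41, 0.50, 0.50, 0.38, 0.31, 0.20, 0.12],
})

result = grouped_pearson(cloneRich, ["Cregion_simple", "tissue"], "BaggArea", "gini_index")
print(result)
```

Each row of result corresponds to one combination of the grouping columns, which mirrors the first return value described under Returns.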
