B_HIT.sVDJ.tl.compute_correlation
- B_HIT.sVDJ.tl.compute_correlation(cloneRich, groupby_cols, corr1, corr2, save=False, path=None, compute_corr_matrix=False)
Compute Pearson correlation between two variables, grouped by specific columns.
- Parameters:
cloneRich (pd.DataFrame) – The input DataFrame containing the data.
groupby_cols (list of str) – List of columns to group by (e.g., ['Cregion_simple', 'tissue']).
corr1 (str) – The name of the first column to compute correlation for (e.g., 'BaggArea').
corr2 (str) – The name of the second column to compute correlation for (e.g., 'gini_index').
save (bool, optional) – Whether to save the output to CSV. Default is False.
path (str, optional) – The file path to save the output CSV. Default is None.
compute_corr_matrix (bool, optional) – Whether to compute and return the correlation matrix and p-value matrix. Default is False.
- Returns:
- pd.DataFrame
A DataFrame containing the correlation and p-value for each group.
- pd.DataFrame, optional
A correlation matrix (if compute_corr_matrix is True).
- pd.DataFrame, optional
A p-value matrix (if compute_corr_matrix is True).
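As an illustration of what a per-group Pearson correlation involves, the sketch below reproduces the idea with pandas and SciPy only. It is not the B_HIT implementation: the helper name grouped_pearson, the output column names correlation and p_value, the rule of skipping groups with fewer than three complete observations, and the toy data values are all assumptions made for this example. Only the column names Cregion_simple, tissue, BaggArea, and gini_index come from the parameter descriptions above.

```python
import pandas as pd
from scipy.stats import pearsonr


def grouped_pearson(df, groupby_cols, corr1, corr2):
    """Hypothetical stand-in: Pearson r and p-value of corr1 vs corr2 per group."""
    rows = []
    for keys, group in df.groupby(groupby_cols):
        # Group keys may be a scalar or a tuple depending on how groupby was called.
        if not isinstance(keys, tuple):
            keys = (keys,)
        # Keep only rows where both variables are present.
        pair = group[[corr1, corr2]].dropna()
        # Assumption: skip groups too small for a meaningful p-value.
        if len(pair) < 3:
            continue
        r, p = pearsonr(pair[corr1], pair[corr2])
        row = dict(zip(groupby_cols, keys))
        row["correlation"] = r
        row["p_value"] = p
        rows.append(row)
    return pd.DataFrame(rows)


# Toy data (invented values) using the column names from the examples above.
data = pd.DataFrame({
    "Cregion_simple": ["IGHM"] * 4 + ["IGHG"] * 4,
    "tissue": ["spleen"] * 8,
    "BaggArea": [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0],
    "gini_index": [0.1, 0.2, 0.3, 0.4, 0.4, 0.3, 0.2, 0.1],
})

result = grouped_pearson(data, ["Cregion_simple", "tissue"], "BaggArea", "gini_index")
print(result)
```

The result has one row per group, with the grouping columns first, followed by the correlation and its p-value. Writing such a table to disk, as the save and path parameters suggest, could be done with result.to_csv(path, index=False).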