Consensus
Acquire consensus signatures for input samples
- pycytominer.consensus.consensus(profiles: str | DataFrame, replicate_columns: list[str] = ['Metadata_Plate', 'Metadata_Well'], operation: str = 'median', features: str | list[str] = 'infer', output_file: str | None = None, output_type: Literal['csv', 'parquet', 'anndata_h5ad', 'anndata_zarr'] | None = 'csv', compression_options: str | dict[str, Any] | None = None, float_format: str | None = None, modz_args: dict[str, int | float | str] | None = {'method': 'spearman'}) -> DataFrame
Form level 5 consensus profile data.
- Parameters:
profiles (pd.DataFrame or str) – DataFrame of profiles, or the path to a file containing them.
replicate_columns (list, defaults to ["Metadata_Plate", "Metadata_Well"]) – Metadata columns indicating which replicates to collapse
operation (str, defaults to "median") – The method used to form consensus profiles.
features (str or list of str, defaults to "infer") – A list of strings corresponding to feature measurement column names in the profiles DataFrame. All features listed must be found in profiles. If "infer", features are assumed to come from CellProfiler output and to be prefixed with "Cells", "Nuclei", or "Cytoplasm".
output_file (str, optional) – If provided, the consensus profiles are written to this file. If not specified, the consensus profiles are returned as output.
output_type (str, optional) – File type used when writing the consensus profiles: one of "csv", "parquet", "anndata_h5ad", or "anndata_zarr". If not specified and output_file is provided, the file is written as CSV by default.
compression_options (str or dict, optional) – Compression options passed to pd.DataFrame.to_csv(compression=compression_options). Requires pandas >= 1.2.
float_format (str, optional) – Numeric format to use when writing the output file, passed to pd.DataFrame.to_csv(float_format=float_format). For example, use "%.3g" for three significant digits.
modz_args (dict, optional, defaults to {"method": "spearman"}) – Additional custom arguments passed as kwargs if operation="modz". See pycytominer.cyto_utils.modz for more details.
- Returns:
DataFrame of consensus features. If output_file=None, the DataFrame is returned. If output_file is specified, the profiles are written to disk at the provided path.
- Return type:
pd.DataFrame
Notes
The output_file, output_type, compression_options, and float_format parameters are passed as kwargs to the write_to_file_if_user_specifies_output_details decorator, which writes the output DataFrame to file when the user specifies output details. If output_file is not specified, the function returns the consensus DataFrame instead of writing to file.
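As a rough illustration of what float_format and compression_options control, the sketch below calls pd.DataFrame.to_csv directly instead of going through consensus. It assumes only that these options reach to_csv as described above; the temporary path, the table contents, and the variable names are made up for the example.

```python
import os
import tempfile

import pandas as pd

# Hypothetical consensus-like table; values chosen only to show the rounding.
consensus_like_df = pd.DataFrame(
    {
        "Metadata_Plate": ["X", "X"],
        "Metadata_Well": ["a", "b"],
        "Cells_x": [0.123456, 12.3456],
    }
)

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "consensus.csv.gz")

    # "%.3g" keeps three significant digits; "gzip" compresses the CSV.
    consensus_like_df.to_csv(
        out_path,
        index=False,
        float_format="%.3g",
        compression="gzip",
    )

    # Read the file back to see what was actually stored on disk.
    round_trip_df = pd.read_csv(out_path, compression="gzip")

print(round_trip_df)
```

Reading the file back shows that the stored values carry three significant digits (0.123 and 12.3 here), not three decimal places.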
Examples
import pandas as pd
from pycytominer import consensus

data_df = pd.concat(
    [
        pd.DataFrame(
            {
                "Metadata_Plate": "X",
                "Metadata_Well": "a",
                "Cells_x": [0.1, 0.3, 0.8],
                "Nuclei_y": [0.5, 0.3, 0.1],
            }
        ),
        pd.DataFrame(
            {
                "Metadata_Plate": "X",
                "Metadata_Well": "b",
                "Cells_x": [0.4, 0.2, -0.5],
                "Nuclei_y": [-0.8, 1.2, -0.5],
            }
        ),
    ]
).reset_index(drop=True)

consensus_df = consensus(
    profiles=data_df,
    replicate_columns=["Metadata_Plate", "Metadata_Well"],
    operation="median",
    features="infer",
    output_file=None,
)
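To make the idea of a median consensus concrete, here is a pandas-only sketch that groups rows by the replicate columns and takes the per-feature median. It is not pycytominer's implementation. The prefix-based feature selection only mimics the features="infer" rule described in the parameter list, and names such as compartment_prefixes, inferred_features, and approx_consensus_df are illustrative.

```python
import pandas as pd

# Same sample values as the example above, built in one step.
data_df = pd.DataFrame(
    {
        "Metadata_Plate": ["X"] * 6,
        "Metadata_Well": ["a"] * 3 + ["b"] * 3,
        "Cells_x": [0.1, 0.3, 0.8, 0.4, 0.2, -0.5],
        "Nuclei_y": [0.5, 0.3, 0.1, -0.8, 1.2, -0.5],
    }
)

replicate_columns = ["Metadata_Plate", "Metadata_Well"]

# Assumption: treat columns with a CellProfiler compartment prefix as features.
compartment_prefixes = ("Cells", "Nuclei", "Cytoplasm")
inferred_features = [
    column for column in data_df.columns if column.startswith(compartment_prefixes)
]

# Collapse replicates: one row per plate/well, median of each feature.
approx_consensus_df = (
    data_df.groupby(replicate_columns)[inferred_features].median().reset_index()
)

print(approx_consensus_df)
```

For the sample data this yields one row per well: Cells_x medians of 0.3 and 0.2, and Nuclei_y medians of 0.3 and -0.5, for wells a and b respectively.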