Output
Utility function to compress output data
- pycytominer.cyto_utils.output.check_compression_method(compression: str)
Ensure compression options are set properly
- Parameters:
compression (str) – The category of compression options available
- Returns:
Nothing; asserts that the given compression method is one of the available options
- Return type:
None
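A minimal sketch of how an assertion-based check like this could look. The names SUPPORTED_COMPRESSION and check_compression_sketch, and the assumed list of supported methods, are hypothetical and chosen for illustration; they are not taken from the pycytominer source.

```python
# Hypothetical stand-in for the library's list of supported methods (an assumption).
SUPPORTED_COMPRESSION = ("gzip", None)


def check_compression_sketch(compression):
    """Raise AssertionError if the method is not in the assumed list."""
    assert compression in SUPPORTED_COMPRESSION, (
        f"{compression!r} is not supported; choose one of {SUPPORTED_COMPRESSION}"
    )


# A supported method passes silently and the function returns None.
check_compression_sketch("gzip")

# An unsupported method raises AssertionError with a descriptive message.
try:
    check_compression_sketch("zip")
except AssertionError as error:
    print(error)
```

Because the check asserts rather than returns a boolean, callers would typically invoke it for its side effect and let the error propagate.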
- pycytominer.cyto_utils.output.output(df: DataFrame, output_filename: str, output_type: Literal['csv', 'parquet', 'anndata_h5ad', 'anndata_zarr', None] = 'csv', sep: str = ',', float_format: str | None = None, compression_options: str | dict[str, Any] | None = {'method': 'gzip', 'mtime': 1}, **kwargs) → str
Given an output file and compression options, write file to disk
- Parameters:
df (pandas.core.frame.DataFrame) – a pandas dataframe that will be written to file
output_filename (str) – location of file to write
output_type (str, default "csv") – type of output file to create; one of "csv", "parquet", "anndata_h5ad", or "anndata_zarr"
sep (str, default ",") – file delimiter
float_format (str, default None) – Format string for floating-point numbers, passed to pd.DataFrame.to_csv(float_format=float_format). For example, "%.3g" writes values with 3 significant digits.
compression_options (str or dict, default {"method": "gzip", "mtime": 1}) – Compression options passed to pd.DataFrame.to_csv(compression=compression_options). Requires pandas >= 1.2.
- Returns:
The output_filename that was written
- Return type:
str
Examples
import pandas as pd
from pycytominer.cyto_utils import output

data_df = pd.concat(
    [
        pd.DataFrame(
            {
                "Metadata_Plate": "X",
                "Metadata_Well": "a",
                "Cells_x": [0.1, 0.3, 0.8],
                "Nuclei_y": [0.5, 0.3, 0.1],
            }
        ),
        pd.DataFrame(
            {
                "Metadata_Plate": "X",
                "Metadata_Well": "b",
                "Cells_x": [0.4, 0.2, -0.5],
                "Nuclei_y": [-0.8, 1.2, -0.5],
            }
        ),
    ]
).reset_index(drop=True)

output_file = "test.csv.gz"

output(
    df=data_df,
    output_filename=output_file,
    sep=",",
    compression_options={"method": "gzip", "mtime": 1},
    float_format=None,
)
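The default compression options pin "mtime" to 1. A gzip header stores a modification timestamp, so fixing it is a common way to make compressed output byte-for-byte reproducible. The sketch below illustrates that effect using only Python's standard gzip module; the helper gzip_bytes and the sample CSV text are hypothetical, and the sketch does not call pycytominer or pandas.

```python
import gzip
import io


def gzip_bytes(payload: bytes, mtime: int) -> bytes:
    """Compress payload in memory, writing the given mtime into the gzip header."""
    buffer = io.BytesIO()
    with gzip.GzipFile(fileobj=buffer, mode="wb", mtime=mtime) as handle:
        handle.write(payload)
    return buffer.getvalue()


# Illustrative CSV content (not produced by pycytominer).
csv_text = b"Metadata_Plate,Metadata_Well,Cells_x\nX,a,0.1\n"

first = gzip_bytes(csv_text, mtime=1)
second = gzip_bytes(csv_text, mtime=1)
later = gzip_bytes(csv_text, mtime=2)

print(first == second)  # True: same data and same mtime give identical bytes
print(first == later)   # False: only the header timestamp differs
print(gzip.decompress(first) == csv_text)  # True: the payload round-trips
```

With a fixed timestamp, checksums of the compressed file stay stable across repeated runs on the same data, which helps when output files are tracked or compared.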
- pycytominer.cyto_utils.output.set_compression_method(compression: str | dict | None) → dict[str, Any]
Set the compression options
- Parameters:
compression (str or dict) – Compression options passed to pd.DataFrame.to_csv(compression=compression_options). Requires pandas >= 1.2.
- Returns:
A formatted dictionary expected by output()
- Return type:
dict
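One plausible way to turn a string, dictionary, or None into a single dictionary form is sketched below. The function normalize_compression_sketch is hypothetical, and its handling of each input type is an assumption for illustration; the library's actual behaviour may differ.

```python
from typing import Any, Dict, Union


def normalize_compression_sketch(
    compression: Union[str, Dict[str, Any], None],
) -> Dict[str, Any]:
    """Return compression options as a dictionary with a "method" key."""
    if compression is None:
        # Assumed convention: no compression is expressed as method=None.
        return {"method": None}
    if isinstance(compression, str):
        # A bare method name becomes a one-key dictionary.
        return {"method": compression}
    # Copy the dictionary so the caller's object is not mutated later.
    return dict(compression)


print(normalize_compression_sketch("gzip"))
print(normalize_compression_sketch(None))
print(normalize_compression_sketch({"method": "gzip", "mtime": 1}))
```

Normalizing to one shape up front means downstream code only needs to handle the dictionary case.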