Output

Utility functions for writing and compressing output data

pycytominer.cyto_utils.output.check_compression_method(compression: str)

Ensure compression options are set properly

Parameters:

compression (str) – The compression method to check against the supported options

Returns:

Nothing; asserts that the requested compression method is one of the available options

Return type:

None

pycytominer.cyto_utils.output.output(df: DataFrame, output_filename: str, output_type: Literal['csv', 'parquet', 'anndata_h5ad', 'anndata_zarr', None] = 'csv', sep: str = ',', float_format: str | None = None, compression_options: str | dict[str, Any] | None = {'method': 'gzip', 'mtime': 1}, **kwargs) → str

Write a dataframe to disk at the given output location, using the given compression options

Parameters:
  • df (pandas.core.frame.DataFrame) – a pandas dataframe that will be written to file

  • output_filename (str) – location of file to write

  • output_type (str, default "csv") – type of output file to create; one of "csv", "parquet", "anndata_h5ad", or "anndata_zarr"

  • sep (str, default ",") – file delimiter

  • float_format (str, default None) – Precision to use when writing the output file, passed to pd.DataFrame.to_csv(float_format=float_format). For example, use "%.3g" for three significant digits.

  • compression_options (str or dict, default {"method": "gzip", "mtime": 1}) – Compression options passed to pd.DataFrame.to_csv(compression=compression_options). Requires pandas >= 1.2.

Returns:

The output_filename that was written

Return type:

str
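The float_format parameter takes a printf-style format string, so its effect can be previewed with Python's own % operator before writing any file. The snippet below is a minimal illustration using only the standard library; the sample values are made up for the demonstration. It shows that "%.3g" keeps three significant digits (and may switch to scientific notation), whereas "%.3f" keeps three digits after the decimal point.

```python
# Illustrative values only; they are not taken from any pycytominer dataset.
values = [0.123456, 1234.5678]

# "%.3g": three significant digits, scientific notation for large exponents.
significant = ["%.3g" % v for v in values]

# "%.3f": three digits after the decimal point.
fixed = ["%.3f" % v for v in values]

print(significant)  # ['0.123', '1.23e+03']
print(fixed)        # ['0.123', '1234.568']
```

If a fixed number of decimal places is what you want in the written file, a format such as "%.3f" is likely the better choice.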

Examples

import pandas as pd
from pycytominer.cyto_utils import output

data_df = pd.concat(
    [
        pd.DataFrame(
            {
                "Metadata_Plate": "X",
                "Metadata_Well": "a",
                "Cells_x": [0.1, 0.3, 0.8],
                "Nuclei_y": [0.5, 0.3, 0.1],
            }
        ),
        pd.DataFrame(
            {
                "Metadata_Plate": "X",
                "Metadata_Well": "b",
                "Cells_x": [0.4, 0.2, -0.5],
                "Nuclei_y": [-0.8, 1.2, -0.5],
            }
        ),
    ]
).reset_index(drop=True)

output_file = "test.csv.gz"
output(
    df=data_df,
    output_filename=output_file,
    sep=",",
    compression_options={"method": "gzip", "mtime": 1},
    float_format=None,
)
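The default compression_options above set "mtime" to 1. A gzip stream stores a modification timestamp in its header, so pinning that value is a common way to make compressed output byte-for-byte reproducible; that this is the motivation for the default is an assumption, not something stated in this reference. The sketch below uses only the standard library gzip module (not pycytominer or pandas), and the payload is made up for illustration.

```python
import gzip
import struct

# Made-up CSV payload standing in for a written profile file.
payload = b"Metadata_Plate,Metadata_Well,Cells_x\nX,a,0.1\n"

# Compress the same bytes twice with a pinned header timestamp.
first = gzip.compress(payload, mtime=1)
second = gzip.compress(payload, mtime=1)

# Bytes 4-7 of a gzip header hold the timestamp as a little-endian uint32.
header_mtime = struct.unpack("<I", first[4:8])[0]

print(first == second)  # identical archives
print(header_mtime)     # the pinned timestamp
```

With the timestamp left unset, the header would carry the current time, so two otherwise identical writes could produce different checksums.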

pycytominer.cyto_utils.output.set_compression_method(compression: str | dict | None) → dict[str, Any]

Set the compression options

Parameters:

compression (str or dict) – Compression options passed to pd.DataFrame.to_csv(compression=compression_options). Requires pandas >= 1.2.

Returns:

A formatted dictionary of compression options, as expected by output()

Return type:

dict
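To make the str / dict / None handling concrete, here is a rough, self-contained sketch of how such a normalization step could work. It is not the pycytominer implementation: normalize_compression and SUPPORTED_METHODS are hypothetical names, the set of supported methods is an assumption, and the sketch raises ValueError where the library function is documented to assert.

```python
from typing import Any, Optional, Union

# Assumption: the supported methods; the real list is defined inside pycytominer.
SUPPORTED_METHODS = ("gzip", None)


def normalize_compression(
    compression: Union[str, dict, None],
) -> dict[str, Optional[Any]]:
    """Hypothetical helper: coerce the input into a {"method": ...} dictionary."""
    if compression is None:
        options = {"method": None}
    elif isinstance(compression, str):
        options = {"method": compression}
    else:
        # Copy so the caller's dictionary is not modified.
        options = dict(compression)

    # Validate the method, mirroring the role of check_compression_method().
    if options.get("method") not in SUPPORTED_METHODS:
        raise ValueError(f"Unsupported compression method: {options.get('method')!r}")
    return options


print(normalize_compression("gzip"))
print(normalize_compression(None))
print(normalize_compression({"method": "gzip", "mtime": 1}))
```

Normalizing to a dictionary up front means the writing code only has to handle one input shape, whichever form the caller passed.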