Collate

Collate CellProfiler CSVs into SQLite files using cytominer-database

pycytominer.cyto_utils.collate.collate(batch: str, config: str, plate: str, base_directory: str = '../..', column: str | None = None, munge: bool = False, csv_dir: str = 'analysis', aws_remote: str | None = None, aggregate_only: bool = False, tmp_dir: str | None = None, overwrite: bool = False, add_image_features: bool = True, image_feature_categories: list[str] | None = ['Granularity', 'Texture', 'ImageQuality', 'Threshold'], printtoscreen: bool = True)

Collate the CellProfiler-created CSVs into a single SQLite file by calling cytominer-database

Parameters:
  • batch (str) – Batch name to process

  • config (str) – Config file to pass to cytominer-database

  • plate (str) – Plate name to process

  • base_directory (str, default "../..") – Base directory for the subdirectories containing CSVs, backends, etc.; in our preferred structure, this is the “workspace” directory

  • column (str, optional, default None) – An existing column to copy into a new column called Metadata_Plate if no Metadata_Plate column already exists

  • munge (bool, default False) – Whether munge should be passed to cytominer-database. If True, cytominer-database will expect a single all-object CSV and will split each object into its own table

  • csv_dir (str, default 'analysis') – The directory under the base directory where the analysis CSVs will be found. If running the analysis pipeline, this should nearly always be “analysis”

  • aws_remote (str, optional, default None) – A remote AWS prefix. If set, CSV files will be synced down from it at the beginning of the run, and SQLite files will be synced up to it at the end

  • aggregate_only (bool, default False) – Whether to perform only the aggregation of existing SQLite files and skip the preceding collation steps

  • tmp_dir (str, optional, default None) – The temporary directory to be used by cytominer-database for output. If not provided, the system temporary directory is used.

  • overwrite (bool, optional, default False) – Whether or not to overwrite a SQLite file that already exists in the temporary directory

  • add_image_features (bool, optional, default True) – Whether or not to add the image features to the profiles

  • image_feature_categories (list, optional, default ['Granularity', 'Texture', 'ImageQuality', 'Threshold']) – The list of image feature groups to be used by add_image_features during aggregation

  • printtoscreen (bool, optional, default True) – Whether or not to print output to the terminal
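
As a minimal sketch of how the parameters above fit together, the following collects them into a keyword dictionary and passes it to collate. The batch name, plate name, and config filename are hypothetical placeholders, not values taken from pycytominer. The helper run_collate is also an illustrative wrapper, not part of the library. The call itself is left commented out because it requires pycytominer, cytominer-database, and a real workspace on disk.

```python
# Hypothetical argument values for illustration; substitute your own
# batch, plate, and cytominer-database config file.
collate_kwargs = dict(
    batch="2021_01_01_Batch1",   # assumed batch name
    config="ingest_config.ini",  # assumed config filename
    plate="Plate1",              # assumed plate name
    base_directory="../..",      # documented default ("workspace" directory)
    csv_dir="analysis",          # documented default
    munge=False,
    aggregate_only=False,
    overwrite=False,
    add_image_features=True,
    image_feature_categories=["Granularity", "Texture", "ImageQuality", "Threshold"],
    printtoscreen=True,
)


def run_collate(kwargs):
    """Illustrative wrapper: import lazily, then forward the keyword arguments."""
    # Imported inside the function so the arguments above can be inspected
    # even when pycytominer is not installed.
    from pycytominer.cyto_utils.collate import collate

    return collate(**kwargs)


# Uncomment to run against a real workspace:
# run_collate(collate_kwargs)
```

Keeping the arguments in one dictionary makes it straightforward to loop over several plates by changing only the plate entry.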

pycytominer.cyto_utils.collate.run_check_errors(cmd: list[str]) → None

Run a system command and exit if an error occurs; otherwise, continue
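
The run-then-exit-on-error pattern described here can be sketched with the standard library alone. The function below, run_or_exit, is a hypothetical stand-in written for illustration and is not pycytominer's implementation. It assumes that a nonzero exit code is the failure signal, which may differ from how run_check_errors detects errors.

```python
import subprocess
import sys


def run_or_exit(cmd):
    """Hypothetical stand-in: run cmd and exit the interpreter if it fails."""
    # Capture output so the error text can be reported on failure.
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        printable = " ".join(map(str, cmd))
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit(
            f"Command '{printable}' failed with exit code "
            f"{result.returncode}: {result.stderr.strip()}"
        )
    # On success, fall through and return None so the caller continues.


# A command that succeeds lets execution continue.
run_or_exit([sys.executable, "-c", "print('ok')"])
```

Passing the command as a list of strings, as in the documented signature, avoids shell quoting problems.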