output module

Write the dataset, subsets and related statistics to files and to the command line.

output.output_stats(df: DataFrame, output_file: str, out: OutputArgs)

Summarise and output the number of unique values in the following columns:

  • parent_molregno (compound ID)

  • tid (target ID)

  • tid_mutation (target ID + mutation annotations)

  • cpd_target_pair (compound-target pairs)

  • cpd_target_pair_mutation (compound-target pairs including mutation annotations)

Parameters:

  • df (pd.DataFrame) – pandas DataFrame for which the stats should be calculated

  • output_file (str) – Path and filename to write the dataset stats to

  • out (OutputArgs) – Arguments related to how to output the dataset
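The summary described above can be approximated with pandas' `nunique`. This is a minimal sketch, not the module's implementation: the helper name `count_unique_values` and the sample values (including the ID formats) are made up for illustration, and it assumes the five columns are already present in `df`.

```python
import pandas as pd

# Columns named in the description of output_stats.
STAT_COLUMNS = [
    "parent_molregno",
    "tid",
    "tid_mutation",
    "cpd_target_pair",
    "cpd_target_pair_mutation",
]


def count_unique_values(df: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: number of unique values per summary column."""
    return df[STAT_COLUMNS].nunique()


# Made-up example data: two compounds, two targets, one mutated target variant.
example = pd.DataFrame(
    {
        "parent_molregno": [1, 1, 2],
        "tid": [10, 11, 10],
        "tid_mutation": ["10_", "11_", "10_V600E"],
        "cpd_target_pair": ["1_10", "1_11", "2_10"],
        "cpd_target_pair_mutation": ["1_10_", "1_11_", "2_10_V600E"],
    }
)
stats = count_unique_values(example)
print(stats)
```

In this toy data the mutation-aware columns have more unique values than `tid` alone, which is the distinction the separate counts are meant to expose.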

output.write_and_check_output(df: DataFrame, filename: str, assay_type: str, args: CalculationArgs, out: OutputArgs)

Write df to file and check that writing was successful.

Parameters:

  • df (pd.DataFrame) – pandas DataFrame to write to the output file

  • filename (str) – Filename to write the output to (should not include the file extension)

  • assay_type (str) – Types of assays df contains information about. Options: “BF” (binding+functional), “B” (binding), “all” (contains both BF and B information)

  • args (CalculationArgs) – Arguments related to how to calculate the dataset

  • out (OutputArgs) – Arguments related to how to output the dataset
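This section does not specify which check `write_and_check_output` performs. As one possible reading, the sketch below writes a DataFrame to CSV, reads it back, and compares shape and column names. The helper `write_csv_and_check`, the `delimiter` default, and the sample data are assumptions for illustration only.

```python
import tempfile
from pathlib import Path

import pandas as pd


def write_csv_and_check(df: pd.DataFrame, filename: str, delimiter: str = ";") -> bool:
    """Hypothetical helper: write df to <filename>.csv and verify the round trip."""
    # The filename is passed without extension, so the extension is added here.
    path = Path(f"{filename}.csv")
    df.to_csv(path, sep=delimiter, index=False)
    # Read the file back and compare dimensions and column names with df.
    reread = pd.read_csv(path, sep=delimiter)
    return reread.shape == df.shape and list(reread.columns) == list(df.columns)


example = pd.DataFrame({"parent_molregno": [1, 2], "tid": [10, 11]})
with tempfile.TemporaryDirectory() as tmp_dir:
    ok = write_csv_and_check(example, str(Path(tmp_dir) / "example_BF"))
print(ok)
```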

output.write_debug_sizes(dataset: Dataset, out: OutputArgs)

Output counts at various points during calculating the final dataset for debugging.

Parameters:

  • dataset (Dataset) – Dataset with compound-target pairs and debugging sizes.

  • out (OutputArgs) – Arguments related to how to output the dataset

output.write_full_dataset_to_file(dataset: Dataset, args: CalculationArgs, out: OutputArgs)

If write_full_dataset is set, write df_combined, including the filtering columns, to output_path.

Parameters:

  • dataset (Dataset) – Dataset with compound-target pairs.

  • args (CalculationArgs) – Arguments related to how to calculate the dataset

  • out (OutputArgs) – Arguments related to how to output the dataset

output.write_output(df: DataFrame, filename: str, out: OutputArgs) → list[str]

Write DataFrame df to output file named <filename>.

Parameters:

  • df (pd.DataFrame) – pandas DataFrame to write to the output file

  • filename (str) – Filename to write the output to

  • out (OutputArgs) – Arguments related to how to output the dataset


Returns:

List of the file types that were written (csv and/or xlsx)

Return type:

list[str]
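The return value of `write_output` can be pictured with the following sketch. It is not the module's code: `ExampleOutputArgs` is a hypothetical stand-in for `OutputArgs`, its field names (`write_to_csv`, `write_to_excel`, `delimiter`) are assumptions, and the sketch assumes `filename` is given without a file extension.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path

import pandas as pd


@dataclass
class ExampleOutputArgs:
    """Hypothetical stand-in for OutputArgs; the real field names may differ."""

    write_to_csv: bool
    write_to_excel: bool
    delimiter: str = ";"


def write_output_sketch(
    df: pd.DataFrame, filename: str, out: ExampleOutputArgs
) -> list[str]:
    """Write df to the requested formats and return the file types written."""
    file_types = []
    if out.write_to_csv:
        df.to_csv(f"{filename}.csv", sep=out.delimiter, index=False)
        file_types.append("csv")
    if out.write_to_excel:
        # Requires an Excel engine such as openpyxl to be installed.
        df.to_excel(f"{filename}.xlsx", index=False)
        file_types.append("xlsx")
    return file_types


example = pd.DataFrame({"parent_molregno": [1, 2], "tid": [10, 11]})
with tempfile.TemporaryDirectory() as tmp_dir:
    written = write_output_sketch(
        example,
        str(Path(tmp_dir) / "example"),
        ExampleOutputArgs(write_to_csv=True, write_to_excel=False),
    )
print(written)
```

Returning the list of written file types lets a caller decide which files to re-read and verify afterwards.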
