output module
Write the dataset, subsets and related statistics to files and to the command line.
- output.output_stats(df: DataFrame, output_file: str, out: OutputArgs)
Summarise and output the number of unique values in the following columns:
parent_molregno (compound ID)
tid (target ID)
tid_mutation (target ID + mutation annotations)
cpd_target_pair (compound-target pairs)
cpd_target_pair_mutation (compound-target pairs including mutation annotations)
- Parameters:
df (pd.DataFrame) – Pandas Dataframe for which the stats should be calculated
output_file (str) – Path and filename to write the dataset stats to
out (OutputArgs) – Arguments related to how to output the dataset
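As a rough illustration of the summary described above, the unique-value counts can be obtained with pandas nunique(). This is a minimal sketch, not the module's implementation: the helper name count_unique_values and the toy data are hypothetical, and only the column names are taken from the list above. Writing the counts to output_file is left out.

```python
import pandas as pd

# Columns whose unique values are counted (names taken from the list above).
STATS_COLUMNS = [
    "parent_molregno",
    "tid",
    "tid_mutation",
    "cpd_target_pair",
    "cpd_target_pair_mutation",
]


def count_unique_values(df: pd.DataFrame) -> dict[str, int]:
    """Hypothetical helper: number of unique values per summarised column."""
    return {col: int(df[col].nunique()) for col in STATS_COLUMNS}


# Toy data, invented for illustration only.
toy_df = pd.DataFrame(
    {
        "parent_molregno": [1, 1, 2],
        "tid": [10, 11, 10],
        "tid_mutation": ["10", "11", "10_V600E"],
        "cpd_target_pair": ["1_10", "1_11", "2_10"],
        "cpd_target_pair_mutation": ["1_10", "1_11", "2_10_V600E"],
    }
)
unique_counts = count_unique_values(toy_df)
print(unique_counts)
```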
- output.write_and_check_output(df: DataFrame, filename: str, assay_type: str, args: CalculationArgs, out: OutputArgs)
Write df to file and check that writing was successful.
- Parameters:
df (pd.DataFrame) – Pandas Dataframe to write to output file.
filename (str) – Filename to write the output to (should not include the file extension)
assay_type (str) – Type of assays that df contains information about. Options: “BF” (binding+functional), “B” (binding), “all” (contains both BF and B information)
args (CalculationArgs) – Arguments related to how to calculate the dataset
out (OutputArgs) – Arguments related to how to output the dataset
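The write-then-check idea can be sketched as a simple round trip: write the DataFrame, read the file back, and compare the two. How the real function performs its check, and how it uses assay_type, args and out, is not stated here; the helper write_csv_and_check, the shape comparison and the toy data below are assumptions made for illustration.

```python
import os
import tempfile

import pandas as pd


def write_csv_and_check(df: pd.DataFrame, filename: str) -> bool:
    """Hypothetical helper: write df to <filename>.csv and verify the round trip."""
    path = f"{filename}.csv"
    df.to_csv(path, index=False)
    # Read the file back and compare row and column counts with the original.
    df_read = pd.read_csv(path)
    return df_read.shape == df.shape


# Toy data, invented for illustration only.
toy_df = pd.DataFrame({"parent_molregno": [1, 2], "tid": [10, 11]})
with tempfile.TemporaryDirectory() as tmp_dir:
    # The filename is passed without its extension, as in the parameter description.
    check_passed = write_csv_and_check(toy_df, os.path.join(tmp_dir, "toy_pairs"))
print(check_passed)
```

Comparing shapes only catches missing rows or columns; a stricter check could compare the values as well.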
- output.write_debug_sizes(dataset: Dataset, out: OutputArgs)
Output counts at various points during calculating the final dataset for debugging.
- Parameters:
dataset (Dataset) – Dataset with compound-target pairs and debugging sizes.
out (OutputArgs) – Arguments related to how to output the dataset
- output.write_full_dataset_to_file(dataset: Dataset, args: CalculationArgs, out: OutputArgs)
If write_full_dataset is set, write df_combined, including the filtering columns, to output_path.
- Parameters:
dataset (Dataset) – Dataset with compound-target pairs.
args (CalculationArgs) – Arguments related to how to calculate the dataset
out (OutputArgs) – Arguments related to how to output the dataset
- output.write_output(df: DataFrame, filename: str, out: OutputArgs) → list[str]
Write DataFrame df to output file named <filename>.
- Parameters:
df (pd.DataFrame) – Pandas Dataframe to write to output file.
filename (str) – Filename to write the output to
out (OutputArgs) – Arguments related to how to output the dataset
- Returns:
List of the file types that were written (csv and/or xlsx)
- Return type:
list[str]
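The return value can be pictured with a small sketch that writes one file per enabled file type and collects the types it wrote. This is not the module's code: SketchOutputArgs is a hypothetical stand-in for OutputArgs, its field names write_to_csv and write_to_excel are assumptions, and the sketch assumes the filename is given without an extension.

```python
import os
import tempfile
from dataclasses import dataclass

import pandas as pd


@dataclass
class SketchOutputArgs:
    """Hypothetical stand-in for OutputArgs; the field names are assumptions."""

    write_to_csv: bool = True
    write_to_excel: bool = False


def write_output_sketch(
    df: pd.DataFrame, filename: str, out: SketchOutputArgs
) -> list[str]:
    """Write df once per enabled file type and return the types written."""
    file_type_list = []
    if out.write_to_csv:
        df.to_csv(f"{filename}.csv", index=False)
        file_type_list.append("csv")
    if out.write_to_excel:
        # Needs an Excel writer engine such as openpyxl to be installed.
        df.to_excel(f"{filename}.xlsx", index=False)
        file_type_list.append("xlsx")
    return file_type_list


# Toy data, invented for illustration only.
toy_df = pd.DataFrame({"parent_molregno": [1, 2], "tid": [10, 11]})
with tempfile.TemporaryDirectory() as tmp_dir:
    written_types = write_output_sketch(
        toy_df, os.path.join(tmp_dir, "toy_pairs"), SketchOutputArgs()
    )
print(written_types)
```

Returning the list of written types lets a caller decide which files to read back when checking the output.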