add_chembl_target_class_annotations module

Add target class annotations based on ChEMBL data to the dataset.

add_chembl_target_class_annotations.add_chembl_target_class_annotations(dataset: Dataset, chembl_con: Connection, args: CalculationArgs, out: OutputArgs)

Add level 1 and 2 target class annotations. Assignments for target IDs with more than one target class assignment per level are summarised into one string with ‘|’ as a separator between the different target class annotations.

Targets with more than one level 1 / level 2 target class assignment are written to a file. These could be reassigned by hand if a single target class is preferable.

Parameters:
  • dataset (Dataset) – Dataset with compound-target pairs. Will be updated to only include target class annotations. dataset.target_classes_level1 will be set to a pandas DataFrame mapping target ID to level 1 target class; dataset.target_classes_level2 will be set to a pandas DataFrame mapping target ID to level 2 target class.

  • chembl_con (sqlite3.Connection) – Sqlite3 connection to ChEMBL database.

  • args (CalculationArgs) – Arguments related to how to calculate the dataset

  • out (OutputArgs) – Arguments related to how to output the dataset

add_chembl_target_class_annotations.get_aggregated_target_classes(dataset: Dataset, chembl_con: Connection) → tuple[DataFrame, DataFrame]

Get mappings for target id to aggregated level 1 / level 2 target class.

Parameters:
  • dataset (Dataset) – Dataset with compound-target pairs.

  • chembl_con (sqlite3.Connection) – Sqlite3 connection to ChEMBL database.

Returns:

[pandas DataFrame with mapping from target id to level 1 target class, pandas DataFrame with mapping from target id to level 2 target class]

Return type:

tuple[pd.DataFrame, pd.DataFrame]
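
The aggregation into ‘|’-separated strings might look roughly like the sketch below. It is not the module's actual implementation: the toy table, the column names (tid, l1, l2) and the helper aggregate_level are assumptions made for illustration, and sorting the classes before joining is a choice made here so that the result is deterministic.

```python
import pandas as pd

# Toy target class table. The column names (tid, l1, l2) are assumptions
# for illustration and need not match the module's internal names.
target_classes = pd.DataFrame(
    {
        "tid": [1, 1, 2, 3, 3],
        "l1": ["Enzyme", "Membrane receptor", "Enzyme", "Enzyme", "Enzyme"],
        "l2": ["Kinase", "Family A GPCR", "Protease", "Kinase", "Protease"],
    }
)


def aggregate_level(df: pd.DataFrame, level: str) -> pd.DataFrame:
    """Hypothetical helper: map each tid to a '|'-joined string of its
    distinct classes at the given level."""
    distinct = df[["tid", level]].dropna().drop_duplicates()
    return (
        distinct.sort_values(["tid", level])
        .groupby("tid")[level]
        .agg("|".join)
        .reset_index()
    )


level1 = aggregate_level(target_classes, "l1")
level2 = aggregate_level(target_classes, "l2")
print(level1)
print(level2)
```

In this toy data, target 3 has a single level 1 class but two level 2 classes, so only its level 2 entry contains the separator.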

add_chembl_target_class_annotations.get_target_class_table(chembl_con: Connection, current_tids: set[int]) → DataFrame

Get level 1 and level 2 target class annotations in ChEMBL.

Parameters:
  • chembl_con (sqlite3.Connection) – Sqlite3 connection to ChEMBL database.

  • current_tids (set[int]) – Set of target ids to take into account

Returns:

Pandas DataFrame with target class information

Return type:

pd.DataFrame
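
As a rough illustration of how level 1 and level 2 classes could be retrieved for a set of target ids, the sketch below builds a tiny in-memory SQLite database and queries it. The table and column names (target_components, component_class, protein_classification, parent_id, class_level) are assumed to mirror the ChEMBL schema but are heavily simplified here, and walking up the hierarchy with a recursive query is a technique chosen for this example, not necessarily what the module does. The function name get_target_class_table_sketch is hypothetical.

```python
import sqlite3

import pandas as pd

# Simplified stand-in for a ChEMBL database (assumed schema, toy rows).
con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE target_components (tid INTEGER, component_id INTEGER);
    CREATE TABLE component_class (component_id INTEGER, protein_class_id INTEGER);
    CREATE TABLE protein_classification (
        protein_class_id INTEGER, parent_id INTEGER,
        pref_name TEXT, class_level INTEGER);
    INSERT INTO protein_classification VALUES
        (0, NULL, 'Protein class', 0),
        (1, 0, 'Enzyme', 1),
        (2, 1, 'Kinase', 2),
        (3, 2, 'Protein Kinase', 3);
    INSERT INTO target_components VALUES (10, 100), (11, 101), (12, 102);
    INSERT INTO component_class VALUES (100, 3), (101, 1), (102, 2);
    """
)


def get_target_class_table_sketch(
    chembl_con: sqlite3.Connection, current_tids: set[int]
) -> pd.DataFrame:
    """Hypothetical sketch: one row per (target, assigned class) with the
    level 1 and level 2 ancestors of that class."""
    tids = sorted(current_tids)
    placeholders = ", ".join("?" for _ in tids)
    sql = f"""
        WITH RECURSIVE ancestors(tid, origin_id, protein_class_id) AS (
            -- start from the classes assigned directly to each target
            SELECT tc.tid, cc.protein_class_id, cc.protein_class_id
            FROM target_components tc
            JOIN component_class cc ON tc.component_id = cc.component_id
            WHERE tc.tid IN ({placeholders})
            UNION
            -- climb one step towards the root of the hierarchy
            SELECT a.tid, a.origin_id, pc.parent_id
            FROM ancestors a
            JOIN protein_classification pc
                ON a.protein_class_id = pc.protein_class_id
            WHERE pc.parent_id IS NOT NULL
        )
        SELECT a.tid,
               MAX(CASE WHEN pc.class_level = 1 THEN pc.pref_name END) AS l1,
               MAX(CASE WHEN pc.class_level = 2 THEN pc.pref_name END) AS l2
        FROM ancestors a
        JOIN protein_classification pc
            ON a.protein_class_id = pc.protein_class_id
        GROUP BY a.tid, a.origin_id
        ORDER BY a.tid
    """
    return pd.read_sql_query(sql, chembl_con, params=tids)


table = get_target_class_table_sketch(con, {10, 11})
print(table)
```

Target 12 is left out because it is not in the set of target ids passed in, and target 11 has no level 2 class because it is annotated only at level 1.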

add_chembl_target_class_annotations.output_ambiguous_target_classes(dataset: Dataset, args: CalculationArgs, out: OutputArgs)

Output targets that have more than one target class assignment.

Parameters:
  • dataset (Dataset) – Dataset with compound-target pairs.

  • args (CalculationArgs) – Arguments related to how to calculate the dataset

  • out (OutputArgs) – Arguments related to how to output the dataset
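
One way to find the targets that would end up in such a file is to look for the ‘|’ separator in the aggregated mapping. The sketch below is an assumption-laden illustration, not the module's code: the column name target_class_l1 and the toy rows are made up, and the result is written to an in-memory buffer as CSV instead of a file whose location and format depend on args and out.

```python
import io

import pandas as pd

# Toy aggregated mapping; the column name target_class_l1 is an assumption.
level1 = pd.DataFrame(
    {
        "tid": [1, 2, 3],
        "target_class_l1": ["Enzyme|Membrane receptor", "Enzyme", "Ion channel"],
    }
)

# A target is ambiguous at this level if its aggregated string contains the
# separator. regex=False matters because '|' is a regex metacharacter.
ambiguous = level1[level1["target_class_l1"].str.contains("|", regex=False)]

# Written to a buffer here; a real run would write to a file instead.
buffer = io.StringIO()
ambiguous.to_csv(buffer, index=False)
print(buffer.getvalue())
```

The rows written this way are the ones that could be reviewed and reassigned by hand if a single target class per target is preferred.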