add_chembl_target_class_annotations module
Add target class annotations based on ChEMBL data to the dataset.
- add_chembl_target_class_annotations.add_chembl_target_class_annotations(dataset: Dataset, chembl_con: Connection, args: CalculationArgs, out: OutputArgs)
Add level 1 and level 2 target class annotations. If a target ID has more than one target class assignment at a given level, the assignments are combined into a single string, with ‘|’ separating the individual target class annotations.
Targets with more than one level 1 / level 2 target class assignment are written to a file. These could be reassigned by hand if a single target class is preferable.
- Parameters:
dataset (Dataset) – Dataset with compound-target pairs. Will be updated to only include target class annotations. dataset.target_classes_level1 will be set to a pandas DataFrame mapping target id to level 1 target class; dataset.target_classes_level2 will be set to a pandas DataFrame mapping target id to level 2 target class.
chembl_con (sqlite3.Connection) – Sqlite3 connection to ChEMBL database.
args (CalculationArgs) – Arguments controlling how the dataset is calculated.
out (OutputArgs) – Arguments controlling how the dataset is output.
- add_chembl_target_class_annotations.get_aggregated_target_classes(dataset: Dataset, chembl_con: Connection) → tuple[DataFrame, DataFrame]
Get mappings for target id to aggregated level 1 / level 2 target class.
- Parameters:
dataset (Dataset) – Dataset with compound-target pairs.
chembl_con (sqlite3.Connection) – Sqlite3 connection to ChEMBL database.
- Returns:
Tuple of two pandas DataFrames: the mapping from target id to level 1 target class, and the mapping from target id to level 2 target class.
- Return type:
tuple[pd.DataFrame, pd.DataFrame]
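The ‘|’-separated aggregation can be illustrated with a minimal pandas sketch. The helper name `aggregate_target_classes`, the column names `tid`, `target_class_l1` and `target_class_l2`, and the toy data are all assumptions made for illustration; the module's actual implementation may differ.

```python
import pandas as pd


def aggregate_target_classes(class_table: pd.DataFrame, level_col: str) -> pd.DataFrame:
    """Collapse multiple class assignments per target id into one '|'-separated string.

    Hypothetical helper: the column names are assumptions, not the module's API.
    """
    return (
        class_table[["tid", level_col]]
        .dropna()
        .drop_duplicates()
        # Sorting makes the joined string deterministic.
        .sort_values(["tid", level_col])
        .groupby("tid")[level_col]
        .agg("|".join)
        .reset_index()
    )


# Toy input: target 2 has two different classes at each level, target 1 has one.
toy_classes = pd.DataFrame(
    {
        "tid": [1, 2, 2, 2],
        "target_class_l1": ["Enzyme", "Enzyme", "Membrane receptor", "Enzyme"],
        "target_class_l2": [
            "Kinase",
            "Kinase",
            "Family A G protein-coupled receptor",
            "Kinase",
        ],
    }
)

level1 = aggregate_target_classes(toy_classes, "target_class_l1")
level2 = aggregate_target_classes(toy_classes, "target_class_l2")
```

In this sketch, duplicates are dropped before joining, so a class that appears several times for one target is listed only once in the aggregated string.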
- add_chembl_target_class_annotations.get_target_class_table(chembl_con: Connection, current_tids: set[int]) → DataFrame
Get level 1 and level 2 target class annotations in ChEMBL.
- Parameters:
chembl_con (sqlite3.Connection) – Sqlite3 connection to ChEMBL database.
current_tids (set[int]) – Set of target ids to take into account.
- Returns:
Pandas DataFrame with target class information
- Return type:
pd.DataFrame
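As a rough illustration of restricting a query to a set of target ids, the sketch below runs against an in-memory SQLite database. The table `toy_target_classes`, its columns, and the function name `get_class_table_sketch` are hypothetical stand-ins; they are not the real ChEMBL schema and not the query this module uses.

```python
import sqlite3

import pandas as pd


def get_class_table_sketch(con: sqlite3.Connection, current_tids: set[int]) -> pd.DataFrame:
    """Fetch class rows for the given target ids from a toy table (not the ChEMBL schema)."""
    columns = ["tid", "target_class_l1", "target_class_l2"]
    if not current_tids:
        # An empty IN () clause is invalid SQL, so return an empty frame instead.
        return pd.DataFrame(columns=columns)
    tids = sorted(current_tids)
    # One '?' placeholder per id keeps the query parameterised.
    placeholders = ",".join("?" for _ in tids)
    sql = (
        "SELECT DISTINCT tid, target_class_l1, target_class_l2 "
        "FROM toy_target_classes "
        f"WHERE tid IN ({placeholders}) "
        "ORDER BY tid, target_class_l1, target_class_l2"
    )
    return pd.read_sql_query(sql, con, params=tids)


# Build a small in-memory database to query.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE toy_target_classes (tid INTEGER, target_class_l1 TEXT, target_class_l2 TEXT)"
)
con.executemany(
    "INSERT INTO toy_target_classes VALUES (?, ?, ?)",
    [
        (1, "Enzyme", "Kinase"),
        (2, "Enzyme", "Protease"),
        (2, "Membrane receptor", "Family A G protein-coupled receptor"),
        (3, "Ion channel", "Voltage-gated ion channel"),
    ],
)
con.commit()

class_table = get_class_table_sketch(con, {1, 2})
```

SQLite limits the number of bound parameters per statement, so for a very large set of target ids it may be simpler to query all rows and filter the resulting DataFrame with `isin`.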
- add_chembl_target_class_annotations.output_ambiguous_target_classes(dataset: Dataset, args: CalculationArgs, out: OutputArgs)
Output targets that have more than one target class assignment.
- Parameters:
dataset (Dataset) – Dataset with compound-target pairs.
args (CalculationArgs) – Arguments controlling how the dataset is calculated.
out (OutputArgs) – Arguments controlling how the dataset is output.
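One possible way to find and write out ambiguous targets is to select the rows whose aggregated class string contains the ‘|’ separator. The sketch below assumes a mapping DataFrame with columns `tid` and `target_class_l1`; the function name `write_ambiguous_targets`, the file name, and the `;` delimiter are illustrative choices, and the sketch does not use the module's `CalculationArgs` or `OutputArgs`.

```python
import tempfile
from pathlib import Path

import pandas as pd


def write_ambiguous_targets(mapping: pd.DataFrame, level_col: str, out_file: Path) -> pd.DataFrame:
    """Write targets whose aggregated class string contains '|' to a CSV file.

    Hypothetical helper, not the module's own output routine.
    """
    # regex=False treats '|' as a literal character rather than regex alternation.
    is_ambiguous = mapping[level_col].str.contains("|", regex=False, na=False)
    ambiguous = mapping[is_ambiguous]
    ambiguous.to_csv(out_file, sep=";", index=False)
    return ambiguous


# Toy mapping: only target 2 has more than one level 1 class.
toy_level1 = pd.DataFrame(
    {
        "tid": [1, 2, 3],
        "target_class_l1": ["Enzyme", "Enzyme|Membrane receptor", "Ion channel"],
    }
)

out_dir = Path(tempfile.mkdtemp())
out_file = out_dir / "ambiguous_target_classes_l1.csv"
ambiguous_l1 = write_ambiguous_targets(toy_level1, "target_class_l1", out_file)
```

The written file can then be reviewed by hand, and individual targets reassigned to a single class where that is preferable.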