get_drug_mechanism_ct_pairs module

Get and add compound-target pairs based on information in the drug_mechanism table.

get_drug_mechanism_ct_pairs.add_annotations_to_drug_mechanisms_cti(chembl_con: Connection, cpd_target_pairs: DataFrame) → DataFrame

Add annotations to the compound-target pairs from the drug_mechanism table so that they carry the same information as the compound-target pairs table based on activities.

Parameters:
  • chembl_con (sqlite3.Connection) – Sqlite3 connection to ChEMBL database.

  • cpd_target_pairs (pd.DataFrame) – Pandas DataFrame with compound-target pairs from the drug_mechanism table.

Returns:

Updated pandas DataFrame with the additional annotations.

Return type:

pd.DataFrame

get_drug_mechanism_ct_pairs.add_dm_filtering_columns(dataset: Dataset)

Add filtering columns related to the drug_mechanism table:
  • pair_mutation_in_dm_table: pair is in the drug_mechanism (dm) table (incl. mutations)

  • pair_in_dm_table: pair is in dm table (excl. mutations)

  • keep_for_binding: use to limit to binding assays

Parameters:

dataset (Dataset) – Pandas DataFrame with compound-target pairs based on ChEMBL activity data.

get_drug_mechanism_ct_pairs.add_drug_mechanism_ct_pairs(dataset: Dataset, chembl_con: Connection)

Add compound-target pairs from the drug_mechanism table that are not in the dataset based on the initial ChEMBL query. These are compound-target pairs for which there is no associated pchembl value data. Since the pairs are known interactions, they are added to the dataset despite not having a pchembl value. Add the set of compound-target pairs in the drug_mechanism table and the set of targets in the drug_mechanism table to the dataset.
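The idea of appending known pairs that lack activity data can be sketched with a left anti-join in pandas. This is only an illustration under stated assumptions, not the module's implementation: the column names (compound_id, target_id, pchembl_value_mean) and both toy DataFrames are hypothetical.

```python
import pandas as pd

# Hypothetical activity-based pairs (column names are assumptions).
activity_pairs = pd.DataFrame(
    {
        "compound_id": [1, 2],
        "target_id": ["10", "20"],
        "pchembl_value_mean": [6.5, 7.2],
    }
)

# Hypothetical pairs taken from the drug_mechanism table.
dm_pairs = pd.DataFrame(
    {
        "compound_id": [2, 3],
        "target_id": ["20", "30"],
    }
)

key = ["compound_id", "target_id"]

# Left anti-join: keep only drug_mechanism pairs absent from the activity data.
merged = dm_pairs.merge(activity_pairs[key], on=key, how="left", indicator=True)
new_pairs = merged.loc[merged["_merge"] == "left_only", key]

# Append them; the pchembl column stays NaN because no activity data exists.
combined = pd.concat([activity_pairs, new_pairs], ignore_index=True)

# Sets of known pairs and known targets, kept for later filtering.
dm_pair_set = set(dm_pairs[key].itertuples(index=False, name=None))
dm_target_set = set(dm_pairs["target_id"])
```

In this sketch the pair (3, "30") is the only one added, and it is the only row without a pchembl value.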

Parameters:
  • dataset (Dataset) – Pandas DataFrame with compound-target pairs based on ChEMBL activity data.

  • chembl_con (sqlite3.Connection) – Sqlite3 connection to ChEMBL database.

get_drug_mechanism_ct_pairs.get_drug_mechanism_ct_pairs(chembl_con: Connection) → DataFrame

Get compound-target pairs from the drug_mechanism table with all the columns that are present in the compound-target pairs based on activities. Relevant mappings of target ids to related target ids are taken into account.

Parameters:

chembl_con (sqlite3.Connection) – Sqlite3 connection to ChEMBL database.

Returns:

Pandas DataFrame with compound-target interactions from the drug_mechanism table.

Return type:

pd.DataFrame

get_drug_mechanism_ct_pairs.get_drug_mechanisms_interactions(chembl_con: Connection) → DataFrame

Extract the known compound-target interactions from the ChEMBL drug_mechanism table. Note: While the interactions are mostly between drugs and targets, the table also includes some known interactions between compounds with a max_phase < 4 and their targets.

Only entries with a disease_efficacy of 1 are taken into account, i.e., the target is believed to play a role in the efficacy of the drug.

disease_efficacy: Flag to show whether the target assigned is believed to play a role in the efficacy of the drug in the indication(s) for which it is approved (1 = yes, 0 = no).
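The disease_efficacy filter can be illustrated against a toy in-memory SQLite database. This is a minimal sketch, assuming a heavily simplified drug_mechanism table with only three columns; the real ChEMBL table has more columns, and the query used by this function may differ.

```python
import sqlite3

import pandas as pd

# Toy stand-in for the ChEMBL database (simplified schema, made-up rows).
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE drug_mechanism "
    "(molregno INTEGER, tid INTEGER, disease_efficacy INTEGER)"
)
con.executemany(
    "INSERT INTO drug_mechanism VALUES (?, ?, ?)",
    [(1, 10, 1), (1, 11, 0), (2, 20, 1), (2, 20, 1)],
)

# Keep only rows where the target is believed to contribute to efficacy.
sql = """
    SELECT DISTINCT molregno, tid
    FROM drug_mechanism
    WHERE disease_efficacy = 1
    ORDER BY molregno, tid
"""
dm_interactions = pd.read_sql_query(sql, con)
con.close()
```

Here the row with disease_efficacy = 0 is dropped and the duplicated pair is returned once.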

Parameters:

chembl_con (sqlite3.Connection) – Sqlite3 connection to ChEMBL database.

Returns:

Pandas DataFrame with compound-target pairs from the drug_mechanism table, restricted to entries with a disease_efficacy of 1.

Return type:

pd.DataFrame

get_drug_mechanism_ct_pairs.get_relevant_tid_mappings(chembl_con: Connection) → DataFrame

Get DataFrame with mappings from target id to their related target ids based on the target_relations table. The following mappings are considered:

  • protein family -[superset of]-> single protein

  • protein complex -[superset of]-> single protein

  • protein complex group -[superset of]-> single protein

  • single protein -[equivalent to]-> single protein

  • chimeric protein -[superset of]-> single protein

  • protein-protein interaction -[superset of]-> single protein

These mappings can be used to increase the number of target ids for which there is data in the drug_mechanism table. For example, for protein family -[superset of]-> single protein this means: If there is a known relevant interaction between a compound and a protein family, interactions between the compound and the single proteins of that protein family are considered to be known interactions as well.
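How such a mapping extends the set of known pairs can be sketched with a pandas merge. The snippet below is an illustration only: the column names (molregno, tid, related_tid) and the ids are hypothetical, and whether the original family-level pair is kept alongside the derived pairs is an assumption of this sketch.

```python
import pandas as pd

# Hypothetical known pairs: compound 1 acts on target 100 (a protein family).
dm_pairs = pd.DataFrame({"molregno": [1, 2], "tid": [100, 200]})

# Hypothetical mappings: family 100 is a superset of single proteins 101 and 102.
tid_mappings = pd.DataFrame({"tid": [100, 100], "related_tid": [101, 102]})

# Join each known pair to the related target ids of its target.
mapped = dm_pairs.merge(tid_mappings, on="tid", how="inner")
mapped = mapped[["molregno", "related_tid"]].rename(columns={"related_tid": "tid"})

# Combine original and derived pairs, dropping any duplicates.
expanded = (
    pd.concat([dm_pairs, mapped], ignore_index=True)
    .drop_duplicates()
    .sort_values(["molregno", "tid"])
    .reset_index(drop=True)
)
```

Compound 1 ends up paired with targets 100, 101 and 102, while compound 2 keeps its single pair because target 200 has no mapping.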

Parameters:

chembl_con (sqlite3.Connection) – Sqlite3 connection to ChEMBL database.

Returns:

Pandas DataFrame with mappings from tid to related tid for the defined subset of target relations.

Return type:

pd.DataFrame