Paper: PPDMs – A resource for mapping small molecule bioactivities from ChEMBL to Pfam-A protein domains


We've just published an Open Access paper in Bioinformatics on an approach to annotating the region of ligand binding within a target protein. This has many applications in the use of ChEMBL, in particular providing greater accuracy in mapping functional effects, improving ligand-based target prediction approaches, and reducing false positives in sequence/target searching of ChEMBL. Where next for this work? Annotating to a site-specific level would be a good thing to implement (think of HIV-1 RT, with its distinct nucleoside and non-nucleoside sites).

Here's the abstract...

Summary: PPDMs is a resource that maps small molecule bioactivities to protein domains from the Pfam-A collection of protein families. Small molecule bioactivities mapped to protein domains add important precision to approaches that use protein sequence searches and alignments to assist applications in computational drug discovery and systems and chemical biology. We have previously proposed a mapping heuristic for a subset of bioactivities stored in ChEMBL with the Pfam-A domain most likely to mediate small molecule binding. We have since refined this mapping using a manual procedure. Here, we present a resource that provides up-to-date mappings and the possibility to review assigned mappings as well as to participate in their assignment and curation. We also describe how mappings provided through the PPDMs resource are made accessible through the main schema of the ChEMBL database.

Availability: The PPDMs resource and curation interface are available at https://www.ebi.ac.uk/chembl/research/ppdms/pfam_maps

The source code for PPDMs is available under the Apache license at https://github.com/chembl/pfam_maps

Source code demonstrating the integration process with the main schema of ChEMBL is available at https://github.com/chembl/pfam_map_loader
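
To give a feel for what a domain-level mapping heuristic of the kind mentioned in the abstract might look like, here is a minimal Python sketch. It is not the PPDMs code and not necessarily the published procedure: it simply assumes we hold a set of Pfam-A domains already believed to bind small molecules, and assigns a target's bioactivities to a domain only when exactly one of its domains is in that set. The function name, the target identifiers and the contents of the "validated" set are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Optional, Set


def assign_binding_domain(domains: List[str], validated: Set[str]) -> Optional[str]:
    """Return the one domain of a target found in the validated set.

    Returns None when no domain, or more than one distinct domain, is
    validated; such cases would be left for manual curation.
    """
    # dict.fromkeys drops repeated domains while keeping their order
    candidates = [d for d in dict.fromkeys(domains) if d in validated]
    return candidates[0] if len(candidates) == 1 else None


# Illustrative (assumed) set of domains taken to mediate ligand binding
validated_domains: Set[str] = {"Pkinase", "7tm_1"}

# Hypothetical targets with their Pfam-A domain architecture
targets: Dict[str, List[str]] = {
    "TARGET_A": ["SH2", "SH3_1", "Pkinase"],  # one validated domain
    "TARGET_B": ["Pkinase", "7tm_1"],         # two validated domains: ambiguous
    "TARGET_C": ["SH2"],                      # no validated domain
}

mappings = {
    target: assign_binding_domain(doms, validated_domains)
    for target, doms in targets.items()
}

for target, domain in mappings.items():
    print(target, "->", domain if domain else "needs manual curation")
```

In this sketch the ambiguous and unmapped cases are deliberately left unassigned, which is where a review and curation interface such as the one described above would come in.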