Removal of Metal-Containing Compounds



Following up on my post from a few months ago (To Remove or Not to Remove) about removing certain problematic metal-containing compounds, we have now settled on a plan of what to do.
Instead of labeling this curation as ‘removal of inorganics’, or ‘removal of organometallics’, we simply want this to be known as ‘removal of some metal-containing compounds’.

The criterion we used was to exclude a large proportion of compounds that contain a metal, except where the metal is commonly found as part of a pharmaceutical preparation (e.g. Ranitidine Bismuth Citrate CHEMBL2111286, Silver Sulfadiazine CHEMBL1382627, Bacitracin Zinc CHEMBL2096639). The reasoning behind the removal is that, in most of these compounds, the metal is bonded to the rest of the compound via coordinate bonds. However, due to InChI limitations, there is no way of creating a Standard InChI that retains coordinate bond information. As we use Standard InChI as the main identifier of compound uniqueness in ChEMBL, it was decided to exclude the structures altogether.

This change will come into effect with the release of ChEMBL_17 and affects only ~3,200 compounds. The compound image on the interface will be replaced with an icon indicating that it is a metal-containing compound (see picture, above). The structures will not be part of the download set on the FTP site, but we will retain the molecular formula in both the downloads and on the ChEMBL interface, so that you can still see the elemental make-up of the compound. We will, of course, retain all of the bioactivity data on these compounds.
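Since the molecular formula is retained, one way to spot metal-containing records in your own copy of the data is to scan the formula for metal element symbols. The snippet below is only a rough sketch of that idea: the METALS set is an illustrative, non-exhaustive assumption and is not the list we used for curation, and the function name metals_in_formula is hypothetical, not part of any ChEMBL tool.

```python
import re

# Illustrative, non-exhaustive set of metal symbols (an assumption for this
# sketch, NOT the list used in the ChEMBL curation).
METALS = {
    "Ti", "V", "Cr", "Mn", "Fe", "Co", "Ni", "Cu", "Zn",
    "Ru", "Rh", "Pd", "Ag", "Cd", "Sn", "Pt", "Au", "Hg", "Bi",
}

# An element symbol is one uppercase letter optionally followed by one
# lowercase letter, so "CO" parses as C + O while "Co" parses as cobalt.
ELEMENT_PATTERN = re.compile(r"[A-Z][a-z]?")


def metals_in_formula(formula):
    """Return the set of symbols from METALS that occur in a molecular formula."""
    symbols = set(ELEMENT_PATTERN.findall(formula))
    return symbols & METALS


if __name__ == "__main__":
    # Example formula strings, chosen only for illustration.
    for formula in ["C10H9AgN4O2S", "Cl2H6N2Pt", "C9H8O4"]:
        found = metals_in_formula(formula)
        label = "metal-containing" if found else "no listed metal"
        print(formula, "->", label, sorted(found))
```

Working from the formula rather than the structure means this check does not depend on the structure being present in the download set, but it says nothing about how the metal is bonded, so treat any hit as a flag for a closer look rather than a definitive classification.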

If you have any questions, please feel free to contact chembl-help@ebi.ac.uk.