Further to my post a few months ago (To Remove or Not to Remove) about removing certain problem metal-containing compounds, we have now come up with a plan of what to do.
Instead of labeling this curation as ‘removal of inorganics’ or ‘removal of organometallics’, we simply want it to be known as ‘removal of some metal-containing compounds’.
The criterion we used was to exclude a large proportion of compounds that contain a metal, apart from cases where the metal is commonly found as part of a pharmaceutical preparation (e.g. Ranitidine Bismuth Citrate CHEMBL2111286, Silver Sulfadiazine CHEMBL1382627, Bacitracin Zinc CHEMBL2096639). The reasoning behind the removal is that, in most of these compounds, the metal is bonded to the other components via coordinate bonds, and due to InChI limitations there is no way of creating a Standard InChI that retains coordinate bond information. As we use Standard InChI as the main identifier of compound uniqueness in ChEMBL, it was decided to exclude the structures altogether.
This change will come into effect with the release of ChEMBL_17, and affects only ~3,200 compounds. The compound image on the interface will be replaced with an icon that shows it’s a metal-containing compound (see picture, above). The structures will not be part of the download set on the FTP site, but we will retain the molecular formula in both the downloads and on the ChEMBL interface, so that you can still see the elemental make-up of the compound. We will, of course, retain all of the bioactivity data on these compounds.
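Since the molecular formula is retained, the elemental make-up of an affected compound can still be inspected programmatically. The following is a minimal sketch of one way to do that, not the procedure used for the ChEMBL curation: the `METALS` set is a small illustrative list chosen for this example, the function names `element_counts` and `metals_in` are hypothetical, and the parser assumes a simple formula string with no bracketed groups or multipliers.

```python
import re

# Illustrative, non-exhaustive set of metal symbols.
# This is an assumption for the example, not the list used by ChEMBL.
METALS = {
    "Al", "Ti", "V", "Cr", "Mn", "Fe", "Co", "Ni", "Cu", "Zn",
    "Ga", "Zr", "Mo", "Ru", "Rh", "Pd", "Ag", "Cd", "Sn", "Sb",
    "W", "Re", "Os", "Ir", "Pt", "Au", "Hg", "Pb", "Bi", "Gd",
}

# One element symbol (capital letter, optional lowercase letter)
# followed by an optional integer count.
ELEMENT_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def element_counts(formula):
    """Return a dict of element symbol -> atom count for a simple formula.

    Characters that are not part of an element token (for example '.')
    are skipped; bracketed groups with multipliers are not supported.
    """
    counts = {}
    for symbol, number in ELEMENT_TOKEN.findall(formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def metals_in(formula):
    """Return the sorted list of symbols in the formula that appear in METALS."""
    return sorted(set(element_counts(formula)) & METALS)


if __name__ == "__main__":
    # Formula of silver sulfadiazine, one of the examples named in the post.
    print(metals_in("C10H9AgN4O2S"))
```

Note that a check like this only reports which metals are present; it does not reproduce the exceptions made for metals commonly found in pharmaceutical preparations.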