ChEMBL_16 Released



We are pleased to announce the release of ChEMBL_16. This version of the database was prepared on 7th May 2013 and contains:

1,481,473 compound records
1,295,510 compounds (of which 1,292,344 have mol files)
11,420,351 activities
712,836 assays
9,844 targets
50,095 documents
19 activity data sources

You can download the data from the ChEMBL FTP site, and do not forget to read the ChEMBL_16 Release Notes.

Data changes since the last release:
ChEMBL_16 includes the Millipore Kinase Screening publication (CHEMBL2218924), which is a kinase screening panel dataset focused on 158 known kinase inhibitors, and the OSDD Malaria Screening dataset (CHEMBL2113921), which is a set of anti-malarial compounds and bioactivity data provided by the OSDD Malaria consortium.
In addition to our regular publication and dataset updates, we are now also loading supplementary bioactivity datasets. As an example, the original paper from GSK was published in 2010 (CHEMBL1157114), and with the release of ChEMBL_16 we now provide 2 supplementary datasets (CHEMBL2218064 and CHEMBL2094195). You can see the original paper and supplementary datasets in the screenshot below (this also demonstrates the new document search functionality we have added to the interface):


We would like to grow our supplementary bioactivity datasets, so please get in touch if you have any similar data you would like to deposit in the ChEMBL database. Stefan Senger from GSK has put together the following slides, which provide more details on the pros and cons of depositing supplementary bioactivity data. (Also, thanks to Derek Lowe over at In The Pipeline for the following blog post.)

Interface changes since the last release:
We have made a number of changes to the interface, which are listed below:
  • Document Search - Submit a keyword search against journal articles and datasets loaded into the database
  • Browse Targets - We have improved the tree browser on the protein classification and organism target browser pages
  • Browse Drugs - Now allows searching on USAN stem and ATC code definitions
  • Updated FAQ pages - see here
  • Target Report Card - Now contains a target relation section, providing links between targets sharing protein components. The target report card also includes links to the CREDO and TIMBAL databases
  • Compound Report Card - Includes a link to the NCI Resolver service, to retrieve additional synonyms for a compound
In addition to our regular set of downloads (Oracle, MySQL, PostgreSQL) you will also find an RDF version of the ChEMBL database. The current version is 16.0 and the files are available to download here. You can expect some minor changes in the RDF between now and the ChEMBL_17 release, and these will be represented by increments in the minor version number.
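If you script your RDF downloads, the versioning scheme described above can be checked with a few lines of Python. This is only a rough sketch: it assumes version strings always take the "major.minor" form of "16.0", and the function names (parse_rdf_version, is_minor_update) are made up for illustration and are not part of any ChEMBL tooling.

```python
from typing import Tuple


def parse_rdf_version(version: str) -> Tuple[int, int]:
    """Split a 'major.minor' string such as '16.0' into two integers.

    Assumption: the version has exactly one dot; anything else raises ValueError.
    """
    major_text, minor_text = version.split(".")
    return int(major_text), int(minor_text)


def is_minor_update(old: str, new: str) -> bool:
    """Return True when `new` is a later minor revision of the same release as `old`."""
    old_major, old_minor = parse_rdf_version(old)
    new_major, new_minor = parse_rdf_version(new)
    return new_major == old_major and new_minor > old_minor
```

For example, is_minor_update("16.0", "16.1") would be true, while a move from "16.0" to "17.0" would count as a new release rather than a minor RDF change.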
The ChEMBL Team