• VEHICLe - virtual exploratory heterocyclic library

    An interesting and thought-provoking paper from last year was 'Heteroaromatic Rings of the Future' by Will Pitt and others at UCB (subscription required). The basic idea of the paper was to exhaustively enumerate, and then analyse, all possible heteroaromatic ring systems that satisfy the following constraints: i) mono- and bicyclic rings; ii) only 5- and 6-membered rings; iii) containing only C, N, O, S and H; iv) neutral; v) obeying Hückel’s 4n+2 rule of aromaticity; and vi) only exocyclic carbonyls. Heterocycles like these are at the very core of drug discovery and medicinal chemistry.
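
    As a small illustration of constraint v), Hückel's rule says an aromatic ring system has 4n+2 π electrons for some non-negative integer n. The sketch below only tests that electron count. It is not the enumeration procedure used in the paper, and the function name is made up for this example. Working out the π-electron count for a given heterocycle (lone pairs on N, O or S, for instance) is left to the caller.

```python
def satisfies_huckel(pi_electrons):
    """Return True if pi_electrons equals 4n+2 for some integer n >= 0.

    Illustrative helper only: it checks the electron count and nothing
    else (no planarity or conjugation test).
    """
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


if __name__ == "__main__":
    # 6 pi electrons (e.g. benzene, n = 1) and 10 (e.g. naphthalene, n = 2)
    # pass; 4 and 8 do not.
    for count in (4, 6, 8, 10):
        print(count, satisfies_huckel(count))
```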

    The dataset is now available for download from the ChEMBL FTP site, and also as a Google document.

    The file contains...

    • regid: the id for each distinct ring system
    • SMILES: the encoded chemical structure of each ring system
    • Training dataset hits: the count of substructure hits found in the
      original search of commercial compound catalogues, drugs etc. (as reported in the paper).
    • Beilstein hits: the count of substructure hits in the Beilstein
      database at that time (June 2008). Some fields are blank - searching with benzene
      and other common ring systems would have taken too long.
    • Pgood: predicted synthetic tractability after training with both the
      above datasets
    • Tautomer cluster: tautomeric equivalents are grouped into clusters
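
    For readers who want to work with the file programmatically, here is a minimal sketch of reading it in Python. It assumes a tab-delimited file whose header row uses the field names listed above; the actual delimiter and header spelling in the download may differ, so check them first. The sample rows are invented for illustration and are not taken from the real dataset. Blank Beilstein fields are kept as None rather than 0, since a blank means the search was not run.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows (NOT real VEHICLe data). Column names are assumed
# from the field list in this post and may not match the real header.
SAMPLE = (
    "regid\tSMILES\tTraining dataset hits\tBeilstein hits\tPgood\tTautomer cluster\n"
    "1\tc1ccccc1\t1000\t\t0.99\t1\n"
    "2\tc1ccncc1\t800\t5000\t0.98\t2\n"
    "3\tO=c1cccc[nH]1\t50\t300\t0.90\t3\n"
    "4\tOc1ccccn1\t10\t0\t0.60\t3\n"
)


def parse_count(field):
    """Return an int, or None when the field is blank (search not run)."""
    field = field.strip()
    return int(field) if field else None


def load_rings(handle):
    """Read ring systems from an open tab-delimited file handle."""
    rings = []
    for row in csv.DictReader(handle, delimiter="\t"):
        rings.append({
            "regid": row["regid"],
            "smiles": row["SMILES"],
            "training_hits": parse_count(row["Training dataset hits"]),
            "beilstein_hits": parse_count(row["Beilstein hits"]),
            "pgood": float(row["Pgood"]),
            "cluster": row["Tautomer cluster"],
        })
    return rings


def group_by_cluster(rings):
    """Map each tautomer cluster id to the regids it contains."""
    clusters = defaultdict(list)
    for ring in rings:
        clusters[ring["cluster"]].append(ring["regid"])
    return dict(clusters)


if __name__ == "__main__":
    rings = load_rings(io.StringIO(SAMPLE))
    # Ring systems that were searched in Beilstein and returned no hits.
    print([r["regid"] for r in rings if r["beilstein_hits"] == 0])
    print(group_by_cluster(rings))
```

    To use this on the real download, replace io.StringIO(SAMPLE) with an open file handle, for example open(path, newline="").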

    Will can be contacted at will.pitt (at) ucb.com for a free reprint of the paper, or more discussions of the work.

    We will integrate the VEHICLe ring system regids into ChEMBL at some point in the future.

    %T Heteroaromatic Rings of the Future
    %A W.R. Pitt
    %A D.M. Parry
    %A B.G. Perry
    %A C.R. Groom
    %J J. Med. Chem.
    %D 2009
    %V 52
    %P 2952-2963
    %O VEHICLe
    
  • Position in Computational Biology at the Institute of Cancer Research

    ChEMBL-og readers may be interested in a position at the world-leading Institute of Cancer Research in Sutton. It is for a 'Higher Scientific Officer', and the job primarily involves programming and bio/chemical data integration. The post is within the Computational Biology and Chemogenomics team.
    For more information see the ICR web site.

    Closing date for applications is 7th May 2010.

  • Innovative Medicines Initiative project - eTOX


    Drug development necessitates running in vivo toxicological studies to assess potential untoward side effects. Toxicities often limit the use of medicines, and sometimes prevent molecules from becoming drugs. Early selection of chemicals with a low probability of being toxic will improve the whole process, taking less time and fewer resources, including animals. Hence, early in silico prediction of in vivo toxicological results would increase the efficiency of the drug development process and reduce the number of animals used in preclinical studies.


    The eTOX project aims to develop innovative methodological strategies and novel software tools to better predict the toxicological profiles of new molecular entities in the early stages of the drug development pipeline. The plan is to achieve this by sharing and jointly exploiting legacy reports of toxicological studies from participating pharmaceutical companies. The project will coordinate the efforts of specialists from industry and academia across the wide range of disciplines required for more reliable modelling of the complex relationships between molecular and in vitro information and the in vivo toxicity outcomes of drugs. The proposed strategy includes a synergetic integration of innovative approaches in the following areas:


    • Data sharing of previously inaccessible high-quality data from the toxicity legacy reports of the pharma companies.
    • Database building and management, including procedures and tools for protecting sensitive data.
    • Ontology and text mining techniques, with the purpose of facilitating knowledge extraction from legacy preclinical reports and biomedical literature.
    • Chemistry and structure-based approaches for the molecular description of the studied compounds, as well as of their interactions with the anti-targets responsible for the secondary pharmacologies.
    • Prediction of DMPK (Drug Metabolism and Pharmacokinetics) features, since they are often related to the toxicological events.
    • Systems biology approaches in order to cope with the complex biological mechanisms which govern in vivo toxicological events.
    • Computational genomics and sophisticated statistical analysis tools required to derive multivariate QSAR models.
    • Development and validation (according to the OECD principles) of QSARs, integrative models, expert systems and meta-tools.


    The eTOX project will be carried out by a Consortium comprising 25 organisations (13 pharmaceutical companies, 7 academic groups (including EMBL-EBI) and 5 SMEs) with complementary expertise. The total budget of the project is 13 million Euro and the project will last for five years.

    The website for the project is http://www.e-tox.net/

  • ChEMBLdb interface demo - 3pm GMT, 29th March 2010

    We will host a web-meeting demonstration of the chembldb web-interface on Monday 29th March at 3pm GMT. If you are interested in joining, please email us on this link for dial-in details. We will be using the webhuddle software for the demo so it may be worth trying it out on your machine beforehand.

  • ChEMBLdb schema walkthrough - 3pm GMT, Tuesday 30th March 2010

    We will host another web-meeting walkthrough of the chembldb core database schema on Tuesday 30th March at 3pm GMT. If you are interested in joining, please email us on this link for dial-in details. We will be using the webhuddle software for the demo so it may be worth trying it out on your machine beforehand.

  • Hyper-panoramic pictures of the EMBL-EBI

    The world is a really small place, as these pictures of the EBI reveal. Many thanks to the photographer Tim Nugent.

  • EMBL-EBI SME forum

    The EMBL-EBI has an active support forum for SMEs. If you are interested in joining, see details here.

    The picture above is of a gribble - a really cute sea animal, that nonetheless strikes fear into the heart of seafarers everywhere.

  • Kinase inhibitor patents over time....

    Here is a plot of the number of WO patents published each year that contain the words 'kinase' and 'inhibitor'. These were found using the excellent SIMPLE patent system. Of course this is not the same as a hand-crafted, expert search of the protein kinase inhibitor patent field, but all I wanted to do here was to see the pattern of publication dates over time.

    As can be seen, there is a sharp ramp-up in published patents over the period 1997-2001, which corresponds to a filing period of roughly 1995-2000 (patents are typically published 18 months after filing). From around 2003 onwards the total number of published patents per year has been relatively constant.
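
    The publication-to-filing conversion is simple date arithmetic. Here is a minimal sketch, assuming a flat 18-month lag for every patent (real lags vary, so treat the result as an estimate). The function name is made up for this example, and it works at month resolution only.

```python
from datetime import date


def estimated_filing_month(publication, lag_months=18):
    """Estimate the filing month by subtracting a fixed lag in months.

    Returns the first day of the estimated month. The 18-month default
    is the typical lag mentioned in the post, not an exact rule.
    """
    # Count months since year 0, subtract the lag, then split back out.
    total = publication.year * 12 + (publication.month - 1) - lag_months
    year, month_index = divmod(total, 12)
    return date(year, month_index + 1, 1)


if __name__ == "__main__":
    # Start and end of the 1997-2001 publication ramp-up.
    print(estimated_filing_month(date(1997, 1, 1)))   # 1995-07-01
    print(estimated_filing_month(date(2001, 12, 1)))  # 2000-06-01
```

    So a publication window running from the start of 1997 to the end of 2001 maps to filings from about mid-1995 to mid-2000 under this assumption.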