Papers: Database of all (reasonable) possible molecules.

Jean-Louis Reymond's group in the Department of Chemistry and Biochemistry at the University of Berne, Switzerland, has recently published a paper on their GDB-13 database. It contains all reasonable molecules (those likely to be stable and synthetically accessible) of up to 13 heavy atoms, restricted to the well-tolerated, drug-like elements C, N, O, S and Cl. (I guess the text hackers amongst you could expand this to include F at the drop of a Perl function.) There are 977,468,314 molecules in GDB-13, which is as near as damn it one billion.

To put this number in some perspective, if the same constraint on composition were applied to natural peptides, there would be only 66 possible molecules (one distinct tripeptide, 46 distinct dipeptides and 19 single amino acids). It would be interesting to push this toy analysis a little further: what fraction of the available (reasonable, 13-atom) chemical space do natural human metabolites occupy, are there any interesting patterns, etc.?
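For anyone wanting to play with the toy analysis, here is a minimal sketch of the composition constraint alone: at most 13 heavy atoms, drawn only from C, N, O, S and Cl. It works on plain molecular formulae and says nothing about stability or synthetic accessibility, so it is not the GDB-13 enumeration itself. The function names and the simple formula parser are my own assumptions for illustration, not anything from the paper.

```python
import re
from collections import Counter

# Composition limits as described in the post; hydrogens are not counted.
ALLOWED_HEAVY = frozenset({"C", "N", "O", "S", "Cl"})
MAX_HEAVY_ATOMS = 13

_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")
_FORMULA = re.compile(r"(?:[A-Z][a-z]?\d*)+")


def element_counts(formula):
    """Count atoms per element in a flat formula such as 'C2H5NO2'.

    Hypothetical helper: no brackets, charges or isotopes are handled.
    """
    if not _FORMULA.fullmatch(formula):
        raise ValueError(f"cannot parse formula: {formula!r}")
    counts = Counter()
    for symbol, number in _TOKEN.findall(formula):
        counts[symbol] += int(number) if number else 1
    return counts


def fits_composition(formula, allowed=ALLOWED_HEAVY, max_heavy=MAX_HEAVY_ATOMS):
    """True if every heavy atom is an allowed element and there are
    no more than max_heavy of them in total."""
    counts = element_counts(formula)
    heavy = {el: n for el, n in counts.items() if el != "H"}
    return set(heavy) <= set(allowed) and sum(heavy.values()) <= max_heavy


# Gly-Gly-Gly has exactly 13 heavy atoms; tryptophan has 15.
print(fits_composition("C6H11N3O4"))    # True
print(fits_composition("C11H12N2O2"))   # False
```

Adding fluorine is then a one-liner, e.g. `fits_composition("C2H4FNO2", allowed=ALLOWED_HEAVY | {"F"})`.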

This database is really exciting.

%A L.C. Blum
%A J.-L. Reymond
%J J. Am. Chem. Soc.
%D 2009
%V 131
%N 25
%P 8732-8733
%O DOI:10.1021/ja902302h