• Something for the weekend?

    Really a part 2 post to the StARlite interface post of a few weeks ago. Not many words, just pictures, which should be pretty self-explanatory. So to Bissan and Mark, a big thank you for continuing to share, and may the Gods of Grants bless you with bounteous funding!

    The picture above is a really good joke (sorry, source unknown; it was just filed away on my disk). It took me ages to spot, but when I did, boy did I laugh - OUT LOUD.

    Do a sequence search...

    Select sequences from the blastp hitlist....

    Select what end-points you want to look at....

    Apply some filters to these....

    Et voila!

  • You might also like.....

    Having just moved to a new office, it is, of course, the best time to sort through all the stuff you have just moved to decide what to keep (note to self - next time throw out rubbish before moving). I found a whole pile of old ideas, sketches, hard-copy presentations that no longer exist in digital form, and of course most of it/all of it is rubbish.

    One thing I reflected on, though, was a scribbled note on an "Amazon for SAR data" (unsurprisingly, the page had a list of potential names; the strongest, relatively speaking, was SARmazon). The basic idea was to use search preferences to suggest jumps to other data that may be of interest, essentially just a series of links between papers (orderable by date, number of compounds, etc.). The links would be established by compound similarity and target/assay similarity. More specifically, you would use the shared occurrence of compounds, or targets, to build the association list and weights. It would be a little like the systems that online shops use to suggest purchases, based on what others (estimated as having similar tastes to you) have purchased. It's just that here you use compounds as the objects in the store, and the papers/patents as the purchasers. Anyway, having just reread this explanation, I see it is not really clear, and an example may make it so.
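    As a rough illustration of the idea, here is a minimal sketch in Python. It is not the code behind the scribbled note: the document and compound identifiers are made-up toy data, the function names are hypothetical, and the weight (shared compounds divided by all compounds in either paper, a Jaccard/Tanimoto-style ratio) is just one plausible choice.

```python
from itertools import combinations


def association_weights(paper_compounds):
    """Return {(paper_a, paper_b): weight} for every pair of papers sharing a compound.

    The weight is an assumed Jaccard-style ratio: shared compounds / all compounds.
    """
    weights = {}
    for a, b in combinations(sorted(paper_compounds), 2):
        shared = paper_compounds[a] & paper_compounds[b]
        if shared:
            union = paper_compounds[a] | paper_compounds[b]
            weights[(a, b)] = len(shared) / len(union)
    return weights


def suggest(paper, weights, top_n=3):
    """Return up to top_n (other_paper, weight) links for a paper, strongest first."""
    linked = []
    for (a, b), weight in weights.items():
        if paper == a:
            linked.append((b, weight))
        elif paper == b:
            linked.append((a, weight))
    # Sort by descending weight, then by name so ties are ordered predictably.
    return sorted(linked, key=lambda item: (-item[1], item[0]))[:top_n]


# Hypothetical toy data: papers are the "purchasers", compounds the "objects in the store".
papers = {
    "doc_1": {"cpd_A", "cpd_B", "cpd_C"},
    "doc_2": {"cpd_B", "cpd_C", "cpd_D"},
    "doc_3": {"cpd_C", "cpd_E"},
    "doc_4": {"cpd_F"},
}

weights = association_weights(papers)
suggestions = suggest("doc_1", weights)
print(suggestions)
```

    With this toy data, someone reading doc_1 would be pointed to doc_2 first (two shared compounds) and then doc_3 (one shared compound), while doc_4 shares nothing and is never suggested. The same function works unchanged if the sets hold targets instead of compounds.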

    Anyway, I have just revisited this idea, using our latest StARlite data, and it looks quite interesting (for some cases, and uninformative for others, but hey, sometimes Amazon suggests the Pet Shop Boys, and sometimes Franz Ferdinand - PSB every time!). Watch this space....

  • StARlite Schema Walkthrough

    So, here is another StARlite schema walkthrough (barring any unforeseen circumstances). It will be on Wednesday 8th April 2009 at 2pm UK local time, which is now BST, and will take an hour. Please mail me if you are interested in getting the weblink. You will need to call a UK telephone number, so please bear this in mind.

    The image is of the Starlite rooms cocktail bar in Tujunga Village, Los Angeles. I have not visited there (yet), but given Tujunga's Utopian Socialist roots, it seems a mighty fine bar to have a drink in.

  • Paper of the year?

    Two simple words: Robot Scientist.

    %T The Automation of Science
    %A Ross D. King
    %A Jem Rowland
    %A Stephen G. Oliver
    %A Michael Young
    %A Wayne Aubrey
    %A Emma Byrne
    %A Maria Liakata
    %A Magdalena Markham
    %A Pinar Pir
    %A Larisa N. Soldatova
    %A Andrew Sparkes
    %A Kenneth E. Whelan
    %A Amanda Clare
    %J Science
    %D 2009
    %V 324
    %P 85-89
    

  • Bioisostere Discovery

    Here is an old, old use case we developed for StARlite. This one looks at using data contained within StARlite to discover bioisosteres - functional group replacements that preserve activity while improving other properties, such as metabolism, patentability, solubility, etc. The algorithm exploits the useful 'data structure' of StARlite. Firstly, compounds are typically entered in the literature/database as clusters of synthetically related compounds (i.e. they typically share late-stage intermediates in their production), and therefore there are often reasonably straightforward ways to synthetically access these related compounds. Secondly, again because of the structure of the data, there are often equivalent assays to compare (same assays, done under the same conditions, by the same people), and so this removes one important variable from any further analysis (this is performed using the simple heuristic of only comparing quantitative data from the same StARlite doc_id).

    Here is some (truly appalling, almost prose, it has been noted) pseudocode, in which one wants to find possible replacements for a particular Functional Group (for example, a nitro, a vinyl halide, a sulphonamide, etc.):

    1. Search StARlite for all examples of the Functional Group
    2. Identify all fragments that these Functional Groups are attached to (call these 'Contexts')
    3. Search StARlite for all Contexts, then identify the corresponding Replacement Functional Groups
    4. Build a table of Replacement Functional Groups and count the frequency of each type of interchange (this frequency list is pretty useful in its own right)
    5. Retrieve quantitative values of binding energy difference (using endpoints such as IC50, Ki, Kd, etc.), constraining the comparison to the same assay_ids from the same doc_ids
    6. Use these binding energy differences to compute an expectation value for the binding energy difference between the Functional Group and the Replacement Functional Group
    

    So a good bioisostere would preserve (or improve) binding energy, and these are then pretty easy to identify from the tables generated above. Of course, with the multiple end points stored in StARlite, and the generality of the approach, the same basic workflow can be used to identify functional group replacements that can improve half-life, solubility, logD, etc.
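    For the curious, steps 3 to 6 of the pseudocode might be sketched in Python roughly as follows. This is only an illustration, not the original implementation. It assumes the fragmentation of steps 1 and 2 has already been done, so that each record is a hypothetical tuple of (doc_id, assay_id, context, functional group, activity on a -log10 scale such as pIC50). The records are made-up toy data, and the mean difference in log units stands in for the expectation value of the binding energy difference, to which it is proportional.

```python
from collections import defaultdict
from statistics import mean


def replacement_table(records, query_group):
    """Return {replacement_group: (count, mean activity difference)} for query_group.

    Differences are replacement minus query, in log units, and are only taken
    between compounds sharing the same doc_id, assay_id and context.
    """
    # Index measurements so that only equivalent assays on the same context are compared.
    # If a group occurs twice under one key, the later value overwrites the earlier one.
    by_key = defaultdict(dict)
    for doc_id, assay_id, context, group, p_activity in records:
        by_key[(doc_id, assay_id, context)][group] = p_activity

    deltas = defaultdict(list)
    for groups in by_key.values():
        if query_group not in groups:
            continue  # nothing to compare against in this assay/context
        reference = groups[query_group]
        for group, value in groups.items():
            if group != query_group:
                deltas[group].append(value - reference)

    return {group: (len(values), mean(values)) for group, values in deltas.items()}


# Hypothetical toy records: (doc_id, assay_id, context, group, pIC50)
records = [
    ("doc_1", "assay_1", "ctx_1", "COOH", 7.0),
    ("doc_1", "assay_1", "ctx_1", "tetrazole", 7.5),
    ("doc_1", "assay_1", "ctx_1", "CONH2", 5.0),
    ("doc_2", "assay_9", "ctx_2", "COOH", 6.0),
    ("doc_2", "assay_9", "ctx_2", "tetrazole", 6.5),
    # Same context but a different document and assay, so it is not compared.
    ("doc_3", "assay_4", "ctx_2", "tetrazole", 9.0),
]

table = replacement_table(records, "COOH")
for group, (count, delta) in sorted(table.items()):
    print(f"{group}: seen {count} time(s), mean difference {delta:+.2f} log units")
```

    In this toy example the tetrazole is seen twice as a replacement for the acid and gains half a log unit on average, while the primary amide is seen once and loses two. Swapping the activity column for half-life, solubility or logD would give the equivalent table for those end points.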

    Here is an old slide of a real case, the replacement of a carboxylic acid with other functional groups. Hopefully, with the background above, the figure is self-explanatory....

    The picture used in the header of the post is from the excellent and very amusing B'eau Bo D'Or blog, and I think perfectly illustrates bioisosterism - albeit in a context that is completely opaque to anyone not steeped in the 1970s and 1980s popular culture of the United Kingdom.

  • Bio-IT World (Europe) Conference, Hannover, October 2009

    We are going to speak in the Data Integration and Knowledge Management track at the Bio-IT World (Europe) meeting to be held in the beautiful city of Hannover, Germany, October 5th to 7th 2009. Should be a good meeting...

  • We Now Have An Office!

    The ChEMBL caravan will always have a place in our hearts, but now we have an office, and we must move on from the pain. It has some walls and a door, with a nice hook for jackets and coats. It is nice, bijou even, and has brought a smile to all our faces. Most importantly it gives us a place to entertain guests and visitors, so if anyone is in the area, please pop by and have a cup o' tea and a slice o' cake.

    As a sideline, to fund the tea and cakes, we have a nice T-shirt - XXXL only, minimum order 10 pieces if you are interested.

  • Hit-the-sack - Pt. IX - First Hotel Linné, Uppsala

    How many of the world's greatest scientists have come from Uppsala? - well Anders Celsius, Anders Jonas Ångström, Svante Arrhenius, Jöns Jakob Berzelius, Theodor Svedberg, Arne Tiselius and Carl Linnaeus, that's who. Wow! Lots and lots to see and do connected with science, so easy and tempting to have a holiday and browse lots of science/nature things. A real shame I had limited time there, but Uppsala has been added to my list of places for a proper holiday.

    I wonder what Linnaeus would have made of the tasty sea cucumber above?

    Hotel Web Site

    An overall score of 55%, let down by quality of room fittings and internet speed.

  • Room Quality - 6/10 - OK room, beds are quite lightweight, and furniture has clearly been used by many guests before me. There is one of those cool continental lifts though, the ones where the wall appears to move in front of you when you close the door. Excellent helpful staff.
  • Getting There - 5/10 - About a £35 cab ride from Arlanda, nice quick roads and takes about 20 minutes. Expensive though.
  • Cost - 5/10 - ca. £100 a night, does include a good breakfast.
  • Phone reception - 8/10 - Good reliable signal, but it is a city center location....
  • Internet - 5/10 - Free internet for one day, not very quick at all though, and quite a few drop outs.
  • Conference facilities - n/a. I don't think they have anything.
  • Mushroom factor - 4/10 - Good potential, but snow on the ground when I visited.