• A case for small-scale plagiarism?


    An odd title for a post, but hopefully you'll get the point! One of the big problems in our work curating data from the literature is comparing bioassays between papers - there are so many different ways of describing the same thing that working out which data are comparable and which are not is very difficult. So for methods sections, wouldn't it be helpful to be able to freely cut and paste text from the works of others describing the protocol for a bioassay (with attribution/citation of course) to allow the linking of data? Of course, this would not be of much help without access to full text, but imagine the future, where links can be established between research on the basis of methodological similarity.
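    To make "methodological similarity" a little more concrete, here is a minimal sketch that scores two protocol descriptions by Jaccard similarity over their word tokens. The snippets are invented for illustration, and this is only one simple way the idea might be approached, not a description of how any curation pipeline actually works.

```python
import re


def tokens(text: str) -> set[str]:
    """Lower-case a protocol description and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of two texts: shared tokens / all tokens (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


# Hypothetical methods-section snippets, invented for illustration only.
paper_a = "Inhibition of human SGLT2 expressed in CHO cells, measured by glucose uptake"
paper_b = "Human SGLT2 inhibition in CHO cells measured by glucose uptake"
paper_c = "Antibacterial activity against E. coli assessed as MIC after 24 h"

print(f"a vs b: {jaccard(paper_a, paper_b):.2f}")  # reworded description of the same assay
print(f"a vs c: {jaccard(paper_a, paper_c):.2f}")  # unrelated assay
```

    If protocol text were reused verbatim, the score between two papers using the same assay would simply be 1.0, which is the point of the post: free rewording is what makes this kind of matching hard.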

  • Bioequivalent molecules - aldehydes, ketones, epimers, and all that stuff



    Following on from the post on oxidized/reduced forms of molecules, there are a few more cases that may be capturable in a useful, practical and potentially programmatic way. There are no plans to incorporate these in any forthcoming release of ChEMBL, but they are useful examples to be aware of if you're planning to look at the activity of molecules either in vitro or in vivo: the molecule you think you have may actually exist in alternate, chemically distinct forms that are responsible for the bioactivity and can't be equivalenced using Standard InChIs.

    1) Aldehydes and activated ketones

    Simple unhindered aldehydes and many activated ketones exist in aqueous solution in equilibrium with 'hydrated' forms, in which the carbonyl carbon changes from sp2 to sp3. The hydrated gem-diol form is responsible for the bioactivity in many cases - for example in fluoroketone aspartic proteinase inhibitors. This interconversion will affect all bioassays performed in water, and that is about everything in biology, so it is significant for both in vivo and in vitro assays.

    This is similar to the sort of equilibrium that occurs within sugars, but usually the equilibrium is way over to the side of the cyclised form - there's another complexity for sugars too (see the epimer section below).
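    As a rough illustration of what the hydration equilibrium in this section means in practice, the sketch below converts a hydration equilibrium constant into the fraction of material present as the gem-diol. The function name and the constants are assumptions made for this example; the values are not measurements for any real compound.

```python
def fraction_hydrated(k_hyd: float) -> float:
    """Fraction of compound present as the gem-diol at equilibrium.

    k_hyd is [gem-diol] / [carbonyl] in dilute aqueous solution, with the
    (effectively constant) water concentration folded into the constant.
    """
    if k_hyd < 0:
        raise ValueError("equilibrium constant must be non-negative")
    return k_hyd / (1.0 + k_hyd)


# Illustrative constants only, not measured values for any real compound.
examples = [("mostly carbonyl", 0.001), ("balanced", 1.0), ("mostly gem-diol", 1000.0)]
for label, k in examples:
    print(f"{label}: {fraction_hydrated(k):.3f}")
```

    Even a modest constant means that a sizeable share of what is in the assay well is not the structure that was registered.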

    2) Prodrugs and metabolites

    This will affect the activity of a molecule in vivo only (or maybe in an ex vivo situation where enzymes capable of metabolism occur). In this case the conversion is not usually in equilibrium, and is irreversible. However, where this effect occurs, it can be confusing, since the observed bioactivity is not linked to the original molecule. Some classes of prodrugs are relatively easy to spot computationally, for example simple alkyl esters, where it is reasonable to propose that where you dose a simple methyl/ethyl ester of a compound then the parent acid will also get significant exposure, and may well be responsible for any observed bioactivity. However, not all classes of prodrugs are so easy to identify/predict, and for these an annotation approach may well be appropriate.

    A second subset of this case is where there is some oxidative metabolism (e.g. P450 mediated) and then this metabolite has some independent/distinct bioactivity. There are many complexities in predicting metabolite identity and estimating levels, but it does happen, and can continue to surprise, especially in a safety pharmacology or toxicology setting.

    3) Epimers and epimerization

    This can affect both in vitro and in vivo assays, but more often applies in an in vivo setting. Chirality is often really important for bioactivity, with big splits in activity often seen between enantiomers, due to the fundamental fact that biological receptors are (almost) invariably chiral. Usually stereocenters are stable and preserved once a molecule is synthesised and purified. However, many molecules epimerise and spontaneously establish an equilibrium between the two forms. Examples include the alpha and beta forms of sugars (where the special term anomers is used) and also some drugs, the most infamous of which is thalidomide. This is yet another setting where a conventional treatment of molecular structure has some shortcomings in understanding a bioassay.
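    One crude way to find candidate stereo-linked forms is to drop the stereo layers from a Standard InChI and group on what is left. The sketch below does this at the string level, which is an assumption of convenience rather than a proper recalculation of the identifier, and it uses the alanine enantiomers and glycine purely as small, familiar test cases. It lumps together all stereoisomers, including ones that never interconvert, so it can only suggest candidates for annotation.

```python
from collections import defaultdict

# InChI stereo layers: /b (double bond), /t (tetrahedral), /m and /s (stereo descriptors).
STEREO_PREFIXES = ("b", "t", "m", "s")


def strip_stereo(inchi: str) -> str:
    """Return the InChI with its stereo layers removed (string-level shortcut)."""
    prefix, formula, *layers = inchi.split("/")
    kept = [layer for layer in layers if not layer.startswith(STEREO_PREFIXES)]
    return "/".join([prefix, formula, *kept])


records = {
    "L-alanine": "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1",
    "D-alanine": "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1",
    "glycine": "InChI=1S/C2H5NO2/c3-1-2(4)5/h1,3H2,(H,4,5)",
}

# Group compound names by their stereo-stripped identifier.
groups = defaultdict(list)
for name, inchi in records.items():
    groups[strip_stereo(inchi)].append(name)

for names in groups.values():
    print(names)
```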

    So a couple of examples, and maybe the seed of thinking about how to represent linked/bioequivalent forms of molecules at a higher level than that achieved with Standard InChIs.

    The picture above came from http://www.colby.edu/chemistry/CH242/15-2.pdf
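    As a seed for the idea of representing linked/bioequivalent forms, here is a minimal sketch of a grouping structure (a union-find) that records asserted links between identifiers and answers whether two forms fall in the same group. The class name, method names and the placeholder identifiers are all hypothetical; nothing here reflects an actual or planned ChEMBL schema.

```python
class LinkedForms:
    """Union-find over compound identifiers, grouping forms asserted to be linked."""

    def __init__(self) -> None:
        self._parent: dict[str, str] = {}
        # Each link is kept as (form_a, form_b, relationship) for later annotation.
        self.links: list[tuple[str, str, str]] = []

    def _find(self, x: str) -> str:
        """Return the representative of x's group, registering x if unseen."""
        self._parent.setdefault(x, x)
        while self._parent[x] != x:
            self._parent[x] = self._parent[self._parent[x]]  # path halving
            x = self._parent[x]
        return x

    def link(self, a: str, b: str, relationship: str) -> None:
        """Record that a and b are linked, e.g. by hydration or epimerisation."""
        self.links.append((a, b, relationship))
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_group(self, a: str, b: str) -> bool:
        """True if a and b are connected through any chain of recorded links."""
        return self._find(a) == self._find(b)


# Placeholder identifiers; in practice these might be Standard InChIKeys.
forms = LinkedForms()
forms.link("ketone-X", "gem-diol-X", "hydration")
forms.link("ethyl-ester-Y", "acid-Y", "prodrug")
forms.link("acid-Y", "epi-acid-Y", "epimerisation")

print(forms.same_group("ethyl-ester-Y", "epi-acid-Y"))  # linked via acid-Y
print(forms.same_group("ketone-X", "acid-Y"))           # different groups
```

    The grouping itself is symmetric, whereas a prodrug conversion is one-way, so the direction and type of each relationship have to live in the recorded links rather than in the group membership.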

  • 2nd RDKit UGM, 2-4 October 2013



    We are very happy to announce the 2nd RDKit User Group Meeting. The meeting will take place October 2nd-4th here at the Genome Campus in Hinxton, UK. We're using a different format for the meeting this year:

    Days 1 and 2: Talks, lightning talks, roundtable(s), discussion, and something new: talktorials! Talktorials are somewhere between a talk and a tutorial: they cover something interesting done with the RDKit and include the code used to do the work. During the presentation you'll give an overview of what you did and also show the pieces of the code that are central to the work. The idea is to mix the science up with the tutorial aspects.

    Day 3 will be the first ever RDKit sprint: those who choose to stay will spend an intense day working in small groups to produce useful artifacts: new bits of code, KNIME nodes, KNIME workflows, tutorials, documentation, IPython notebooks, etc. We'll see who's there and what folks are interested in contributing and go from there.

    There will also be, of course, social and networking activities!

    Registration is free at the following link: http://rdkitugm2.eventbrite.co.uk/

    We are now looking for people who are willing to do presentations or talktorials on the first two days. If you're interested in contributing, please send us an email. Lightning talks don't need to be arranged too far in advance; we'll start collecting the list of people interested in doing those shortly before the event.

    We are really looking forward to seeing a bunch of you again, to meet some new people from the ever growing RDKit developer and user community, and to hear some more cool stories about what people do with the RDKit.


    Greg and George

  • New Drug Approvals 2013 - Pt. V - Canagliflozin (INVOKANA™)



    ATC Code: A10BX (incomplete)
    Wikipedia: Canagliflozin
    ChEMBL: CHEMBL2048484

    On March 29th the FDA approved Canagliflozin (trade name INVOKANA™) to improve glycemic control in the treatment of type 2 diabetes. Canagliflozin is to be used in combination with proper diet and exercise. Canagliflozin is a subtype 2 sodium-glucose transport protein (SGLT2, ChEMBL3884) inhibitor. Canagliflozin is a first-in-class drug, with several others still in clinical trials.

    Target
    SGLT2 is found in the proximal tubule of the nephron in the kidneys (as is its paralog SGLT1, ChEMBL4979). SGLT2 is one of the 5 known members of the sodium-glucose transporter protein family. The transporter is responsible for 90 % of the total renal glucose reuptake (corresponding to 98 % of the uptake in the proximal convoluted tubule). The protein has a relatively low affinity for glucose compared to SGLT1 (2 mM versus 0.4 mM) but a higher capacity. Hence inhibition of this protein reduces renal glucose reuptake and so lowers the glucose plasma concentration. SGLT2 is a 672 amino acid protein which can be found on UniProt (P31639). The most similar PDB structure is the sodium/glucose cotransporter from Vibrio parahaemolyticus (3DH4).

    The paralog SGLT1 (664 amino acids, 57.63% identical to SGLT2) is also found in the intestine where it is responsible for glucose uptake. Hence SGLT1 forms an important anti-target for Canagliflozin. 

    Structure
    Canagliflozin (CHEMBL2048484 ; Chemspider : 26333259 ; Pubchem : 125299338 ; Unichem Identifier 1075025) is a small molecule drug with a molecular weight of 444.5 Da, an AlogP of 3.45 and 5 rotatable bonds, and it does not violate the rule of 5.

    Canonical SMILES : Cc1ccc(cc1Cc2ccc(s2)c3ccc(F)cc3)[C@@H]4O[C@H](CO)[C@@H](O)[C@H](O)[C@H]4O

    InChI: InChI=1S/C24H25FO5S/c1-13-2-3-15(24-23(29)22(28)21(27)19(12-26)30-24)10-16(13)11-18-8-9-20(31-18)14-4-6-17(25)7-5-14/h2-10,19,21-24,26-29H,11-12H2,1H3/t19-,21-,22+,23-,24+/m1/s1
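    As a quick sanity check on the numbers quoted above, the sketch below pulls the formula layer out of the InChI, sums rounded standard atomic weights to get the molecular weight, and counts rule-of-5 violations. The AlogP is the value quoted in the text; the donor and acceptor counts (4 hydroxyl groups, 5 oxygens, no nitrogens) are my own reading of the SMILES and should be treated as an assumption, as should the helper names.

```python
import re

# Rounded standard atomic weights for the elements in this formula only.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "F": 18.998, "O": 15.999, "S": 32.06}

INCHI = (
    "InChI=1S/C24H25FO5S/c1-13-2-3-15(24-23(29)22(28)21(27)19(12-26)30-24)"
    "10-16(13)11-18-8-9-20(31-18)14-4-6-17(25)7-5-14/h2-10,19,21-24,26-29H,"
    "11-12H2,1H3/t19-,21-,22+,23-,24+/m1/s1"
)


def formula_from_inchi(inchi: str) -> str:
    """Return the formula layer: the first '/'-separated field after the prefix."""
    return inchi.split("/")[1]


def molecular_weight(formula: str) -> float:
    """Sum atomic weights for a simple formula such as 'C24H25FO5S'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHT[element] * (int(count) if count else 1)
    return total


def rule_of_five_violations(mw: float, logp: float, hbd: int, hba: int) -> int:
    """Count violations of: MW > 500, logP > 5, H-bond donors > 5, acceptors > 10."""
    return sum([mw > 500, logp > 5, hbd > 5, hba > 10])


formula = formula_from_inchi(INCHI)
mw = molecular_weight(formula)
# hbd=4 and hba=5 are counted by hand from the SMILES (assumption, see lead-in).
violations = rule_of_five_violations(mw, logp=3.45, hbd=4, hba=5)

print(formula)        # C24H25FO5S
print(f"{mw:.1f}")    # 444.5
print(violations)     # 0
```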

    Contra-indications
    Canagliflozin is contra-indicated in patients with a history of serious hypersensitivity reactions to Canagliflozin, and in patients with severe renal impairment, ESRD, or on dialysis.

    Dosage
    The recommended starting dose of Canagliflozin is 100 mg once daily, taken before the first meal of the day. In patients tolerating Canagliflozin 100 mg once daily who have an eGFR of 60 mL/min/1.73 m2 or greater and require additional glycemic control, the dose can be increased to 300 mg once daily. Canagliflozin is limited to 100 mg once daily in patients who have an eGFR of 45 to less than 60 mL/min/1.73 m2. Canagliflozin should be discontinued if eGFR falls below 45 mL/min/1.73 m2.

    Metabolism
    O-glucuronidation is the major metabolic elimination pathway for canagliflozin, which is mainly glucuronidated by UGT1A9 and UGT2B4 to two inactive O-glucuronide metabolites. CYP3A4-mediated (oxidative) metabolism of canagliflozin is minimal (approximately 7%) in humans.

    Excretion
    Following administration of a single oral [14C]canagliflozin dose to healthy subjects, 41.5%, 7.0%, and 3.2%  of the administered radioactive dose was recovered in feces as canagliflozin, a hydroxylated metabolite, and an O-glucuronide metabolite, respectively. Enterohepatic circulation of canagliflozin was negligible. Approximately 33% of the administered radioactive dose was excreted in urine, mainly as O-glucuronide metabolites (30.5%). Less than 1% of the dose was excreted as unchanged canagliflozin in urine. Renal clearance of canagliflozin 100 mg and 300 mg doses ranged from 1.30 to 1.55 mL/min. Mean systemic clearance of canagliflozin was approximately 192 mL/min in healthy subjects following intravenous administration.
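    The recovery figures above can be added up as a rough consistency check. The sketch below uses only the percentages and clearances quoted in the text; comparing renal clearance after oral dosing with systemic clearance after intravenous dosing is a loose approximation made for this illustration.

```python
# Percentages of the administered radioactive dose, as quoted in the text above.
feces = {
    "canagliflozin": 41.5,
    "hydroxylated metabolite": 7.0,
    "O-glucuronide metabolite": 3.2,
}
urine_total = 33.0  # "approximately 33%"

feces_named = sum(feces.values())
accounted_for = feces_named + urine_total

# Upper end of the quoted renal clearance range against mean systemic clearance (mL/min).
renal_share = 1.55 / 192 * 100

print(f"Named fecal components: {feces_named:.1f}%")
print(f"Feces (named) + urine:  {accounted_for:.1f}%")
print(f"Renal share of systemic clearance: {renal_share:.1f}%")
```

    The renal share comes out below 1%, which is in line with the statement that less than 1% of the dose is excreted unchanged in urine.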

    The license holder is Janssen Pharmaceuticals, Inc. and the full prescribing information can be found here.

  • Book - An introduction to Medicinal Chemistry



    There is a new edition of the great med. chem. book "Introduction to Medicinal Chemistry" by Graham Patrick out - I picked up a copy at the Oxford University Press stand at the ACS, and am now flicking through it recovering from the flight from Miami.

    Here's a link to the book at Amazon (UK store).


    %A G. L. Patrick
    %T An Introduction To Medicinal Chemistry
    %D 2013
    %I Oxford University Press
    %O 5th Edition
    %O ISBN: 978-0-19-969739-7
    
    
    
    I worked on some new slides while away, and I think there are a few interesting things to blog about over the next few weeks.

  • Conference: ICBS2013 - October, Shirankaikan, Japan


    The 2013 meeting of the International Chemical Biology Society (ICBS) will be held in Shirankaikan, Japan from October 7th to 9th 2013. Details of the meeting are here.

  • USAN Watch: April 2013

    The USANs for April 2013 have recently been published.
    USAN | Research Code | InChIKey (Parent) | Drug Class | Therapeutic class | Target
    aftobetin, aftobetin hydrochloride | ANCA-11, NCE-11 | | synthetic small molecule | imaging agent | n/a
    anifrolumab | MEDI-546 | n/a | mAb | therapeutic | interferon a/b receptor
    cobimetinib fumarate | RG-7420, GDC-0973, XL-518 | | synthetic small molecule | therapeutic | MEK
    doravirine | MK-1439 | | synthetic small molecule | therapeutic | HIV-1 RT
    lambrolizumab | MK-3475 | n/a | mAb | therapeutic | PDCD-1
    pelareorep | Reolysin, PO-BB-0209 | n/a | virus | therapeutic | n/a
    plazomicin sulfate | ACHN-490 | | natural product derived small molecule | therapeutic | 30S ribosome
    ralimetinib, ralimetinib mesylate | LY-2228820 | | synthetic small molecule | therapeutic | MAPK14
    ricolinostat | ACY-1215, ACY-63 | | synthetic small molecule | therapeutic | HDAC
    ubrogepant | MK-1602 | | synthetic small molecule | therapeutic | CGRP
    valbenazine | NBI-98854 | | synthetic small molecule | therapeutic | VMAT2
    velcalcetide hydrochloride | KAI-4169 | n/a | peptide | therapeutic | CaSr

  • ChEMBL iPad App from Dotmatics in the iTunes Store


    A local SME - Dotmatics - has created a really cool iPad app containing all of ChEMBL 15 in chemically searchable form - chemical structure and property data is local to the iPad, and bioactivity data is then retrieved over the web for selected compounds. Here's a link to details of the app.