USAN | Research Code | InChIKey (Parent) | Drug Class | Therapeutic class | Target |
asvasiran-sodium | ALN-RSV01 | n/a | RNAi | therapeutic | n/a |
beclabuvir | BMS-791325 | ZTTKEBYSXUCBSE-VSBZUFFNSA-N | synthetic small molecule | therapeutic | HCV NS5B polymerase |
benzhydrocodone, benzhydrocodone-hydrochloride | KP-201 | VPMRSLWWUXNYRY-PJCFOSJUSA-N | natural product derived small molecule | therapeutic | Opioid receptors |
bradanicline, bradanicline-hydrochloride | TC-5619 | OXKRFEWMSWPKKV-GHTZIAJQSA-N | synthetic small molecule | therapeutic | alpha-7 nicotinic acetylcholine receptor |
briciclib, briciclib-sodium | ON-014185 | LXENKEWVEVKKGV-BQYQJAHWSA-N | synthetic small molecule | therapeutic | n/a |
ceritinib | NVP-LDK378-NX | VERWOWGGCGHDQE-UHFFFAOYSA-N | synthetic small molecule | therapeutic | ALK |
dasabuvir | ABT-333 | NBRBXGKOEOGLOI-UHFFFAOYSA-N | synthetic small molecule | therapeutic | HCV NS5B polymerase |
defactinib, defactinib-hydrochloride | VS-6063 | FWLMVFUGMHIOAA-UHFFFAOYSA-N | synthetic small molecule | therapeutic | FAK |
dianhydrogalactitol | VAL-083, NSC-1323313 | AAFJXZWCNVJTMK-UHFFFAOYSA-N | synthetic small molecule | therapeutic | DNA |
dinutuximab | n/a | n/a | monoclonal antibody | therapeutic | GD2 |
diridavumab | CR-6261 | n/a | monoclonal antibody | therapeutic | haemagglutinin |
encenicline, encenicline-hydrochloride | EVP-6124 | SSRDSYXGYPJKRR-ZDUSSCGKSA-N | synthetic small molecule | therapeutic | alpha-7 nicotinic acetylcholine receptor |
esuberaprost, esuberaprost-sodium | APS-314d, BPS-314d | CTPOHARTNNSRSR-NOQAJONNSA-N | synthetic small molecule | therapeutic | IP1 receptor |
filociclovir | MBX-400 | KMUNHOKTIVSFRA-KXFIGUGUSA-N | synthetic small molecule | therapeutic | CMV DNA polymerase |
fosdagrocorat | PF-04171327 | n/a | synthetic small molecule | therapeutic | GR |
gedatolisib | PF-05212384, PKI-587 | DWZAEMINVBZMHQ-UHFFFAOYSA-N | synthetic small molecule | therapeutic | PI3K & mTOR |
glasdegib | PF-04449913 | SFNSLLSYNZWZQG-VQIMIIECSA-N | synthetic small molecule | therapeutic | smoothened |
indoximod | D-1MT | n/a | natural product derived small molecule | therapeutic | IDO |
latiglutenase | ALV-003 | n/a | enzyme | therapeutic | n/a |
lulizumab-pegol | BMS-931699 | n/a | monoclonal antibody | therapeutic | CD28 |
ombitasvir | ABT-267 | PIDFDZJZLOTZTM-KHVQSSSXSA-N | synthetic small molecule | therapeutic | HCV NS5a |
omega-3-carboxylic-acids | n/a | n/a | natural product derived small molecule | therapeutic | n/a |
peficitinib | ASP-015K | DREIJXJRTLTGJC-UHFFFAOYSA-N | synthetic small molecule | therapeutic | JAK |
pegargiminase | n/a | n/a | enzyme | therapeutic | n/a |
pembrolizumab | n/a | n/a | monoclonal antibody | therapeutic | Programmed cell death 1 (PDCD1) |
polmacoxib | CG-100649 | IJWPAFMIFNSIGD-UHFFFAOYSA-N | synthetic small molecule | therapeutic | COX-2 |
sarolaner | PF-6450567 | FLEFKKUZMDEUIP-QFIPXVFZSA-N | synthetic small molecule | therapeutic | n/a |
transcrocetinate-sodium | n/a | n/a | natural product derived small molecule | radiation sensitizer | n/a |
uprosertib | GSK-2141795C | AXTAPYRUEKNRBA-JTQLQIEISA-N | synthetic small molecule | therapeutic | AKT1 |
venetoclax | ABT-199 | LQBVNQSMGBZMKD-UHFFFAOYSA-N | synthetic small molecule | therapeutic | BCL-2 |
-
Webinar on Drug Targets - 27th March 2014
I'm giving a webinar on Drug Targets and Drug Targeting at 2-3 pm EDT on Thursday March 27th 2014. Please note that Europe and the US will not have aligned their daylight saving times by then, so the time difference will be 4 hours for the UK and Portugal (6pm GMT/WET), 5 hours for most of the rest of western/central Europe (7pm CET), and 6 hours for Finland and Eastern Europe (8pm EET). I plan to cover quite a lot of ground, with quite a lot of new stuff and analyses.
Registration is free on this link http://acswebinars.org/drug-discovery, and the slides will be available after the meeting on this site too. Well done ACS!!
The next in the series, on lead discovery on Thursday 24th April, is by our great friend and collaborator Tudor Oprea, so put that in your diary too.
jpo
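The time differences quoted for the webinar can be checked with a short sketch. It assumes fixed UTC offsets for that date (EDT = UTC-4, GMT/WET = UTC+0, CET = UTC+1, EET = UTC+2); the zone labels and the function name are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed UTC offsets in force on 27 March 2014: the US was already on
# daylight saving time (EDT), Europe was still on winter time.
EDT = timezone(timedelta(hours=-4), "EDT")
ZONES = {
    "GMT/WET": timezone(timedelta(hours=0), "GMT"),
    "CET": timezone(timedelta(hours=1), "CET"),
    "EET": timezone(timedelta(hours=2), "EET"),
}


def local_start_times(start):
    """Return the start time as HH:MM in each of the European zones."""
    return {name: start.astimezone(tz).strftime("%H:%M") for name, tz in ZONES.items()}


webinar_start = datetime(2014, 3, 27, 14, 0, tzinfo=EDT)
for name, clock in local_start_times(webinar_start).items():
    print(name, clock)
```

Fixed offsets are used here instead of a time zone database so that the sketch has no external dependencies; for any other date the offsets would need to be looked up.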
-
Unpacking a GPU computation server...Leviathan unleashed
What / why?
As you might know, EMBL-EBI has a very powerful cluster. Yet some time ago we were running into some limitations and were pondering how great it would be if we had the ability to run more concurrent threads in a single machine (avoiding the bottleneck that inevitably appears on the network for some jobs).
It turns out there is an answer, namely in the form of a GPU (graphics processing unit). This is the same type of chip that creates 3D graphics for games in your home PC / laptop. While the capabilities of individual calculation cores are relatively limited on GPUs compared to CPUs, a GPU can have a massive number of them in order to generate 3D environments at speeds of 60 frames per second. Schematically it looks like this (CPU left, GPU right):
As you can see, the CPU can handle 8 threads concurrently, whereas the GPU can handle 2880 (see also this great youtube video by the myth busters). We have all kinds of ideas for calculations we want to run on the GPUs (which have been shown to work well in MD), but now first ... the geek tradition that is unboxing!
Nvidia
The guys at Nvidia were very generous and provided us with 5 GPUs (thanks to Mark Berger and Timothy Lanfaer). Tim was also very quick with technical questions concerning the hardware specs needed and software troubleshooting. Thanks again!
Unpacking
At the EMBL-EBI people typically work with laptops or thin clients, and the cluster consists of blades, so there was no place to put our GPUs. Yet, after a quick investigation we had a list of hardware we wanted, and a big box was delivered two weeks ago!
Time to unpack...
So after opening and removing the hardware, we had a tower / 4U rackmountable chassis.
Next up, placement of the GPUs inside the chassis:
Some tinkering was in order:
And finally we could boot and install the OS. We chose Ubuntu 12.04 LTS because of its stability and the availability of many packages (with source code).
Leviathan?
Just one question remains, why 'Leviathan'?
Given the availability of python based cuda packages, we will probably start there. Hence our server will be a very powerful incarnation of python, and what's more awe-inspiring than the Leviathan?
CUDA running
After some trouble getting the drivers to work (we use Ubuntu 12.04 LTS), Michal got everything up and running!
Potential projects
Some of the projects we will be starting with are CUDA based random forests, similarity matrix calculations, and compound clustering.
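To illustrate why a similarity matrix suits this kind of hardware, here is a minimal CPU-only sketch in plain Python. It assumes fingerprints stored as integer bit masks and Tanimoto as the similarity measure. Neither is stated above, and the function names are hypothetical. Every cell of the matrix is computed independently of the others, which is the property that lets the work be spread over thousands of GPU threads.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints stored as integer bit masks."""
    common = bin(a & b).count("1")   # bits set in both fingerprints
    union = bin(a | b).count("1")    # bits set in at least one fingerprint
    return common / union if union else 0.0


def similarity_matrix(fps):
    """All-against-all similarities; each cell depends only on its own pair."""
    return [[tanimoto(a, b) for b in fps] for a in fps]


# Toy 4-bit fingerprints, made up for illustration.
toy_fps = [0b1100, 0b1010, 0b1100]
for row in similarity_matrix(toy_fps):
    print(["%.2f" % value for value in row])
```

On a GPU the two nested loops would typically become a grid of threads, one per cell; this sketch only shows the arithmetic each thread would perform.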
If you have a good idea and would like to collaborate and co-publish, please contact us via email!
Specs
The server contains the following hardware:
Case: Supermicro GPU tower/4U server
PSU: 1,620W Redundant PSU
CPUs: 2*Intel Xeon E5-2603 1.8GHz 4core
RAM: 8*8GB Reg ECC DDR3 1600MHz
Disk: 1*2TB 3.5” SATA HDD
GPUs: 1*Tesla K40; 2*Tesla K20 (one extra to be added later)
Michal & Gerard -
ChEMBL and Handling of Retracted Papers
There is much attention paid to retracted data and errors in the literature, and also to resources that use the literature to build knowledge on top of published papers (for example ChEMBL). Sometimes there is a deliberate intent to deceive, and other times an accident in data processing and interpretation. The Retraction Watch blog is great reading on a long train journey if you want to see some of the pre-formal retraction discussion. With advances in text mining (in the broadest sense, so including images as text, etc.) and with more publications becoming Open Access, it is easier to find and flag these errors; for example see some of the pioneering work and ideas of Peter Murray-Rust. We find errors and inconsistencies in the literature really frequently - units that don't make sense, an end point inconsistent with the reported assay, etc. We either fix or flag these sorts of inconsistencies when we curate data. In general, science is pretty robust to these errors, and most errors, to be frank, have little impact in many/most realistic applications of the data (and consequently on literature derived datasources). What we don't do is contact the editors of the journal or the original authors - and maybe this is something we should start doing.
Given that we now are running SureChEMBL, which is completely automated in its operation, we are thinking carefully about errors, and how to flag or mark them in some way (for example in the accuracy of text extracted chemical structures) - I think research into the processing and filtering of such 'big data' is going to be a very active and important field in the near future - and is core to reproducibility of analyses. I've looked a little at some cases where ChEMBL structures are the odd one out compared to other public chemistry resources - sometimes we've been wrong, and by comparison with other sources, we've then fixed things - sometimes though, we're right and the rest of the community is 'wrong'. For me this is the way that resources like ChEMBL improve, by verifying the data we hold in whatever reasonable way is available to us. Simple consensus or voting approaches to validation are often right, and often wrong - the most insidious case is where wrong data is propagated without provenance, and this is especially problematic in integration resources which merge data from many sources. I have a draft blog post on some of the analyses I've done, but this is currently unfinished work, and I will contact the other data providers first to feed back the differences.
There is one particular type of error, though, that can be captured semi-automatically and then included - formal retractions of papers. The PubMed search above (in the picture) shows the retractions recorded for J. Med. Chem. - a small number, you'd agree. It's then straightforward to identify the source papers and flag these in some way. Based on what I've found so far, the issue with the literature extracted in ChEMBL is very minor, but still important if you are basing work on analyses that rely on these particular data.
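As a rough sketch of how such a search could be scripted, the snippet below builds a request URL for the NCBI E-utilities ESearch service. The exact query used for the picture is not given here, so the search term (journal abbreviation plus the "Retracted Publication" publication type) is an assumption, as are the helper names and the retmax value.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def retraction_query(journal):
    """PubMed search term for retracted papers in one journal (assumed form)."""
    return f'"{journal}"[Journal] AND "Retracted Publication"[Publication Type]'


def esearch_url(journal, retmax=100):
    """Build the ESearch URL; fetching it is left to the caller."""
    params = {
        "db": "pubmed",
        "term": retraction_query(journal),
        "retmode": "json",
        "retmax": retmax,
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


print(esearch_url("J Med Chem"))
```

The sketch only constructs the URL and makes no network call. The PubMed IDs returned by such a search would then need to be matched against the document identifiers held for each source paper before anything is flagged.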
We're still deciding what to do in ChEMBL, but when we've settled on exactly what to do, we'll process the data to correct for these retractions and corrections.
I must acknowledge the fantastic Laura Furlong at IMIM for help with this problem - Laura responded to a twitter post I wrote asking the question of how to link retractions with original papers - so social media does work in science. -
Software that phones home: Good or bad?
This is something that's been bugging me for a few days now - probably just triggered by reading all the recent disclosures of NSA/GCHQ surveillance, and trust in software systems in general. The basic issue I'm thinking about is when and where is it 'right' for software to 'phone home'?
This idea of checking in with base is sometimes a good thing - for example if, when I fire up a program, I get a little box that tells me a new version is available, then that's a good thing. Or if my computer or phone is stolen, then calling in to let me know where in the world it is, is a good thing. It is probably also a good thing if you are a software vendor, and you want to ensure that your software hasn't been pirated, or run outside of the parameters for which it is properly licensed. For the latter case, it may even be a good idea to encrypt the message pinged back, to prevent l33t hax0rs suppressing license compliance mechanisms.
But the privacy issues of this sort of thing are very big, especially if you, as a user, don't know it's being done, or if you don't know what is being sent back to base. I have only ever come across one software license (in this case from a commercial vendor) that discusses this (in the context of the licensee not suppressing this communication in any way, as a means of ensuring license compliance - not addressing at all what is sent back - if it's my source IP address and a timestamp, fine; if it is a dump of all my queries, I'd be furious).
Of course, it's possible to control or spot this sort of activity, and I've just installed Radio Silence as a quick way of seeing if any of my desktop apps do anything behind the scenes I don't know about.
But, in general, are there any community expectations and standards for this sort of thing, especially for cases where the software will be used explicitly to generate trade secrets and perform confidential research? -
USANS - February 2014
Just catching up on some recently published USAN statements.
-
New Drug Approvals 2013 - Pt. XXIV - Sofosbuvir (Sovaldi ™)
ATC code (stem): J05AB
Simeprevir for the treatment of this condition.
Hepatitis C is an infectious disease that affects primarily the liver and is caused by the hepatitis C virus (HCV), which belongs to the family of Flaviviridae and has a positive sense single stranded RNA genome of 9,600 nucleotides. Infection is mainly by blood-to-blood contact, through sharing or reuse of syringes or unsterilized medical equipment. Initially, the infection progresses without symptoms, and only becomes apparent in the chronic stages when liver damage leads to symptoms such as bleeding, jaundice, liver cancer and hepatic encephalopathy.
Sofosbuvir is a nucleotide analog inhibitor of the viral RNA polymerase (NS5b, Uniprot genome polyprotein: P26664, 2421-3011, PDB 3hkw). Viral RNA polymerases differ significantly from eukaryotic and bacterial polymerases both in sequence and three-dimensional structure. Thus, sofosbuvir inhibits only the amplification of the viral RNA genome and not endogenous transcription in the host organism by entering the polymerase as a substrate and terminating the transcript chain. The IC50 measured against NS5b ranged between 0.7 and 2.6 micromolar, depending on the genotype of the HCV isolate.
Structure of HCV NS5b, genotype 1a generated in pymol from PDB 3hkw.
Canonical SMILES: CC(C)OC(=O)[C@H](C)N[P@](=O)(OC[C@H]1O[C@@H](N2C=CC(=O)NC2=O)[C@](C)(F)[C@@H]1O)Oc3ccccc3
Std-InChI: InChI=1S/C22H29FN3O9P/c1-13(2)33-19(29)14(3)25-36(31,35-15-8-6-5-7-9-15)32-12-16-18(28)22(4,23)20(34-16)26-11-10-17(27)24-21(26)30/h5-11,13-14,16,18,20,28H,12H2,1-4H3,(H,25,31)(H,24,27,30)/t14-,16+,18+,20+,22+,36-/m0/s1
Std InChI key: TTZHDVOVKQGIBA-IQWMDFIBSA-N
Sofosbuvir is an off-white crystalline substance that is slightly soluble in water. The molecular weight and logP are 529.45 Da and 0.92, respectively. Note the relatively low logP characteristic of nucleotide analog compounds.
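The quoted molecular weight can be reproduced from the molecular formula in the Std-InChI (C22H29FN3O9P). The sketch below assumes rounded standard atomic weights and a deliberately simple formula parser that handles only element symbols followed by optional counts.

```python
import re

# Rounded standard atomic weights (g/mol), only for the elements needed here.
ATOMIC_WEIGHT = {
    "C": 12.0107,
    "H": 1.00794,
    "F": 18.9984032,
    "N": 14.0067,
    "O": 15.9994,
    "P": 30.973762,
}


def molecular_weight(formula):
    """Sum atomic weights for a flat formula such as 'C22H29FN3O9P'."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        # A missing count means a single atom of that element.
        total += ATOMIC_WEIGHT[symbol] * (int(count) if count else 1)
    return total


print(round(molecular_weight("C22H29FN3O9P"), 2))
```

The parser does not handle brackets, charges or isotopes; a cheminformatics toolkit would be the better choice for anything beyond a quick check.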
The recommended daily dose of sofosbuvir is 400 mg in a single tablet. Peak plasma concentrations of the active metabolite are reached 30-120 minutes after administration. The clearance is primarily through the kidney, with a half-life of 0.4 hours for sofosbuvir and 27 hours for its metabolite. Sofosbuvir is a substrate of P-gp, and therefore inducers of P-gp, such as rifampicin and St John's wort, are contraindicated for use with sofosbuvir.
Reported side effects of sofosbuvir include fatigue, headache, nausea, insomnia and anemia.
Sofosbuvir is marketed by Gilead under the name Sovaldi.
References:
[1] Murakami E, Tolstykh T, Bao H, Niu C, Steuer HMM, Bao D, Chang W, Espiritu C, Bansal S, Lam AM, Otto MJ, Sofia MJ, Furman PA: Mechanism of activation of PSI-7851 and its diastereoisomer PSI-7977. J. Biol. Chem. 2010, 285:34337–47. -
HELM in ChEMBL
While the vast majority of molecules in ChEMBL are small molecules, we also have a growing collection of peptide-derived compounds, monoclonal antibodies and other biotherapeutic drugs in the database. Historically, these molecules have been represented by molfiles (for small-medium peptides) or protein sequences (for monoclonal antibodies) in the database.
However, for many biotherapeutics, these formats are not sufficient to represent the complexities of the molecules. Molfiles and other chemical structure formats are impractical for large molecules, and simple protein sequences cannot adequately capture the non-natural amino acids and other modifications that are commonplace in biotherapeutic drugs.
We are therefore working to adopt the HELM (Hierarchical Editing Language for Macromolecules) standard, developed by Pfizer and the Pistoia Alliance, within ChEMBL and plan to include HELM notation for all peptide-derived drugs and compounds in release 20 of the database.
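To give a feel for the notation, here is a minimal sketch that writes a HELM (version 1 style) string for a single linear peptide. It is not how ChEMBL generates its HELM. The function name is hypothetical, and the monomer IDs used (including dA for D-alanine) are assumed to exist in the monomer library in use.

```python
def linear_peptide_helm(monomers, polymer_id="PEPTIDE1"):
    """Build a HELM string for one unbranched peptide with no connections."""
    # Single-letter monomer IDs are written as-is; longer IDs, typically
    # non-natural or modified residues, are wrapped in square brackets.
    parts = [m if len(m) == 1 else "[" + m + "]" for m in monomers]
    body = ".".join(parts)
    # The trailing "$$$$" closes the polymer list and leaves the connection,
    # hydrogen-bond and annotation sections empty.
    return polymer_id + "{" + body + "}$$$$"


print(linear_peptide_helm(["A", "dA", "G", "C"]))
```

Cyclic peptides, disulfide bridges and conjugates need entries in the connection section, which this sketch leaves empty.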
See also the recent press-release for more information. -
#ICanHazStructurez
I needed a pdf for a presentation I was giving this morning; I was in a hotel, which doesn't have an institutional subscription, so I was stuck. On twitter, there is a hashtag #ICanHazPDF, which is quite successful, but to be clear I didn't use that route ;{o .
It got me thinking though: given the reach and immediacy of twitter, could I use it to get chemical structures? So #ICanHazStructurez was born. It worked well, in fact, and was very quick (see the image above - remember this was 6 am, in Germany at least).
So a big high five to @Lewis_Lab and @nickholway - FF and all that!