SARfari is an integration platform built on top of our
databases (StARlite, CandiStore and DrugStore). It also includes an open architecture for loading proprietary (or other third-party) data. This loading is performed against a series of local 'business rules' that define chemical structure representation, target mapping, assay units,
etc. The SARfaris are written using the Catalyst MVC framework (so essentially structured Perl), with Apache as the application server. The original idea was to mirror in an informatics system a 'platform' view of drug discovery - in this case integrated data within a gene family of interest (but it could be built around a metabolome-based view,
e.g. adenine-binding proteins, an entire genome,
etc.). Our first foray into the area was in 2004 with a GPCR system built with
SQLite; however, this was not very extensible, and we rapidly reached the limits of what was technologically possible. We then built a
protein kinase version, and then a
rhodopsin-like GPCR SARfari under Oracle, and included the ability to load 'local' data. SARfari is quite neat in that it integrates SAR, sequence alignment, binding site, and 3-D structure data all into a single, simple portal, with, of course, a focus on drug discovery processes. One additional thing we did here was to process patent crystal structures (which often never make it into the
RCSB PDB) into a usable form, and then load these into a version of SARfari.
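To make the idea of loading data against local 'business rules' a little more concrete, here is a minimal sketch of what a single rule for assay units might look like. It is purely illustrative: SARfari itself is written in Perl, and the names used here (TO_NANOMOLAR, normalise_activity, load_records), the choice of nanomolar as the target unit, and the record layout are all assumptions made for this example, not part of the actual system.

```python
# Hypothetical business rule: express every activity value in nanomolar.
# The unit table below is an assumption for illustration, not SARfari's own.
TO_NANOMOLAR = {"M": 1e9, "mM": 1e6, "uM": 1e3, "nM": 1.0, "pM": 1e-3}


def normalise_activity(value, unit):
    """Return (value_in_nM, "nM"); raise ValueError for an unknown unit."""
    try:
        factor = TO_NANOMOLAR[unit]
    except KeyError:
        raise ValueError(f"no business rule for unit {unit!r}") from None
    return value * factor, "nM"


def load_records(records):
    """Split incoming records into accepted (normalised) and rejected lists."""
    accepted, rejected = [], []
    for rec in records:
        try:
            value, unit = normalise_activity(rec["value"], rec["unit"])
        except ValueError as err:
            # Keep the original record and the reason, so it can be reviewed.
            rejected.append((rec, str(err)))
            continue
        accepted.append({**rec, "value": value, "unit": unit})
    return accepted, rejected


if __name__ == "__main__":
    local_data = [
        {"compound": "C1", "value": 2.5, "unit": "uM"},
        {"compound": "C2", "value": 1.0, "unit": "furlongs"},
    ]
    ok, bad = load_records(local_data)
    print(ok)
    print(bad)
```

The point of the sketch is only that records which satisfy a rule are normalised on the way in, while those that do not are set aside rather than silently loaded; analogous rules could cover structure representation and target mapping.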
Our future plans include building a generic SARfari builder (so you will be able to paste the sequence of an arbitrary target into a web page and, after a short while, receive fully integrated and federated data as a new stand-alone, gene-family-themed web application).
At the EMBL-EBI we will host a copy of kinase SARfari and GPCR SARfari, populated with the relevant public-domain data from our own databases. The software systems (including source code) will be downloadable in toto and gratis, and installable locally for loading of local lab data (we do not plan to allow upload of data onto the EMBL-EBI SARfaris). At the moment, SARfari requires Oracle 9 and the Symyx chemical data cartridge, but future development will be directed towards a more generic and Open Source solution, including the CDK. If anyone would like to try the existing SARfari systems in advance, please feel free to contact us now.
The same software infrastructure and look-and-feel will be used for the DrugEBIlity project at the EMBL-EBI.