Innovation and Ownership in Drug Discovery by Country (maybe, perhaps, well maybe not then!)

I've been looking at the Research Code data recently, and here is an interesting plot. It shows the counts of research codes classified by country. It is a first look-see plot, based on currently incomplete data, but I think it is quite interesting nonetheless.


A basic assumption behind the assignment of a distinct research code stem is that it reflects an autonomous entity with the aim of discovering drugs. Today the majority of newly founded entities will be funded by private/VC money, and these will be acquired by a larger company once some degree of commercial success, or anticipated commercial potential, has been achieved. Our data is a 'blend' of recent and historical data, and over time the structure and scale of research has changed: there were a smaller number of companies in the distant past, and a larger number from the mid 1990s onwards as many biotechs were established. There will also be differences across various countries.

The way we have collected the research codes (773 of them so far) focuses on clinical stage compounds, and therefore on the ability of a company and its associated infrastructure to move compounds through into clinical development. In our tables each research code has a 'currently controlling company' assigned to it, and this company has a 'country' assigned to it - this is the location of its corporate headquarters, and to a first approximation records where the controlling rights/IP are now held (ignoring any specific licensing deals that have been done over specific drugs). Of course, the location of the headquarters does not reflect where the work is done now, or was done historically. Many current companies have multiple research codes, for example Pfizer has 32 distinct historical research codes, and this count will correlate with the number of mergers and acquisitions over time; these mergers will sometimes switch 'ownership' from one country to another.
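The lookup described above (research code stem, then controlling company, then headquarters country) can be sketched in a few lines of Python. This is only a minimal sketch and not the actual tables: every stem, company name and country assignment below is a made-up placeholder, and the function name is my own invention for illustration.

```python
from collections import Counter

# Hypothetical toy rows: (research code stem, currently controlling company).
# None of these stems or companies come from the real table.
CODE_TO_COMPANY = [
    ("AAA", "Alpha Pharma"),
    ("AB", "Alpha Pharma"),
    ("BX", "Beta Biotech"),
    ("CQ", "Gamma Labs"),
    ("DZ", "Gamma Labs"),
    ("EK", "Delta Therapeutics"),
]

# Hypothetical lookup: company -> country of corporate headquarters.
COMPANY_TO_COUNTRY = {
    "Alpha Pharma": "USA",
    "Beta Biotech": "USA",
    "Gamma Labs": "Japan",
    "Delta Therapeutics": "Sweden",
}


def count_codes_by_country(code_rows, company_country):
    """Count research code stems per headquarters country of the controlling company."""
    counts = Counter()
    for _stem, company in code_rows:
        counts[company_country[company]] += 1
    return counts


counts = count_codes_by_country(CODE_TO_COMPANY, COMPANY_TO_COUNTRY)
for country, n in counts.most_common():
    print(country, n)
```

Note that a merger only changes the company-to-country lookup, so a whole set of stems can move from one country to another in a single step, which is exactly the 'ownership' switch mentioned above.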

The distribution is heavily skewed, in the classic power-law style (the 80:20 rule, or any of a whole bunch of similar names) - specifically, six countries (of 27) cover 86% of research code stems (the USA, Japan, Germany, France, the UK and Switzerland). To my mind there are a few surprises. First, there is the relatively high rank of Japan - this may reflect a complex corporate history of mergers, as there are certainly few biotechs in Japan producing clinical candidates, but I just don't know yet. Secondly, Sweden seems lower than I would have expected, but this may be down to mergers transferring 'corporate ownership' from one country to another (Astra and Pharmacia). Conversely, Italy seems higher than I would have initially predicted - but maybe I don't know the history of the industry as well as I should.
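For anyone wanting to check this kind of concentration figure on their own counts, the share covered by the top few countries is easy to compute. The sketch below is only an illustration: the category labels and numbers are invented toy values, not the real per-country counts, and `top_k_share` is a hypothetical helper name.

```python
def top_k_share(counts, k):
    """Return the fraction of all items accounted for by the k largest categories."""
    ordered = sorted(counts.values(), reverse=True)
    total = sum(ordered)
    if total == 0:
        return 0.0
    return sum(ordered[:k]) / total


# Invented toy counts (NOT the real data), chosen so the arithmetic is easy to follow.
toy_counts = {"A": 50, "B": 20, "C": 10, "D": 10, "E": 5, "F": 5}

# The two largest categories hold 50 + 20 = 70 of the 100 items.
print(f"Top 2 share: {top_k_share(toy_counts, 2):.0%}")
```

A high top-k share on its own only shows that the counts are concentrated; checking whether they really follow a power law would need a proper fit against the full rank-ordered data.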

Another obvious feature is the low current rank of India and China - although a lot of basic research and outsourcing is done in these territories now, very little of this is currently owned and coordinated by companies headquartered there.

I've given up on trying to use Google Docs for any of this stuff - it is not that stable for me - so if anyone is interested in the underlying spreadsheet, mail me.