Crowdsourcing for scientific tasks


I was at a great meeting this week, and there was a discussion of the potential use of crowdsourcing to solve scientific problems, more specifically to generate good ideas for consideration by a panel. There are many successful examples where many 'different pairs of eyes', 'wisdom of the crowd', 'diverse viewpoints and cultures', etc. can solve problems efficiently. A nice example that I am free to talk about is the recent Sequence Squeeze competition organised by the Pistoia Alliance (Google will find some links; I'm under domestic pressure to be quick with this post and make the morning coffee!).

However, I was struck by just how complex this is to implement in a field where intellectual property (IP) is one of the likely outcomes, and potentially even more so when a prize of some sort is involved. Some of the risks are:

  • Contractual restrictions on participants (this applies to everyone: academics, industrialists, etc.) - my contract has clear restrictions on what I can and can't do, unpaid or not, for someone else. These are fair, I think, and are a trade I have made in order to receive a regular salary in return for doing a job I love. There is a difference, of course, if the crowdsourced problem is outside your professional field - coming up with ideas for new fishing hooks would not likely cause problems for me, whereas IE approaches for biological data mining most certainly would. Employment contracts differ, but if you are interested in participating in anything like this, I would 1) check your contract, and 2) tell your employer, and get written permission confirming that what you are doing is fine and approved.
  • Restrictions in literature/software/database licenses - it is likely that crowdsourced scientific problems will use and/or require access to 'licensed' materials (your library, software and maybe databases). The terms of these licenses are often quite complex, and you may not even know what they are (as an admission, I have never checked the terms of use of some of the library material we have; however, I know what is typical, and I would like to think of myself as sensible and cautious on these matters). Typically, though, they do not allow giving stuff away freely to third parties, or even use in 'commercial' activities. So check the terms of any licenses for software/websites/journal access, etc., make a copy of them, and keep it with your personal records.
  • Risks of accidental or deliberate IP poisoning of the solution - this is pretty significant, and potentially very costly and time wasting. Imagine you have a participant who is either malicious or unthinking (the most dangerous is a combination of both), and who gives you some of their employer's IP or trade secrets that are useful in solving the challenge. This is a big problem - what if you start work on the basis of this fantastic idea and contribution, and are then served with a cease and desist letter? What are the liabilities? It's actually quite difficult to protect yourself from this; a simple waiver that says that all submissions are given freely and licensed without restriction is simply not good enough. Seriously. Related to this, I would also be really wary where there is the ability or desire to have anonymity on submission.

So, for well organised crowdsourced efforts, the level of legal documentation and protection required for both organisers and participants is likely to be very significant. If anyone knows of any well constructed systems that are fair and balanced in risks to both participants and organisers, please let me know.

I'm sure there are many more issues I haven't thought of, so post away in the comments if you want.