The inside scoop on Cochrane Crowd

From little things, big things grow. Cochrane Crowd, Cochrane’s citizen science platform, now has 7500+ contributors who have notched over 1.5 million classifications. Here Anna Noel-Storr, Co-Lead of Cochrane Crowd, shares the story of Cochrane Crowd and how the platform may evolve in the future.

It often takes a combination of factors, including hard work and to some extent good fortune, to get a big project off the ground. Can you tell us a bit about how Cochrane Crowd came to be?

Good fortune, together with the talent, enthusiasm, and vision of many individuals along the way, has got us to where we are now. It’s quite hard to pinpoint when it all began. Many will remember the Embase Project that came before Cochrane Crowd. That work was instrumental, but there were events leading up to that project that also rank as significant.

The first such moment was when my then boss, Rupert, introduced me to two medical students with the hope that I could find them something to do. We set them to work helping to extract some information about trials in the area of dementia (that was, and indeed still is, an area that I work in as an information specialist). They did a great job. Then an opportunity for some funding came along and again my boss popped his head round the door and wondered if this was a chance to see if we could scale this approach. It was very much feasibility work in those days but that grant was a huge learning experience for all involved. The project manager we recruited, Caroline, became a real driving force behind that early work. She was quite an inspiration to me.

I think that work helped demonstrate that:

people often want to help but flexible opportunities are limited;
the tasks you provide have to be doable and needed;
and technology can be a huge enabler.

We’ve gone on to evolve and scale, first with the Trial Blazers study and then with the Embase Project.

And now Cochrane Crowd is 18 months old. It has been an incredible journey so far, and what’s exciting is that it is in no way over yet.

To the outsider, what was the platform like right at the start? What functionality did it have?

The early version used for the Embase Project was functional and easy to use. This was a critical factor in the project’s success. Much of that functionality still exists in Cochrane Crowd, but with Cochrane Crowd we’ve also added so much more.

In the early days you couldn’t prioritise the records you worked on (now, for example, you can prioritise by healthcare area such as dementia or child health), and you couldn’t easily view your decisions against the final decision made on a record. You also only had one task to choose from. Now there are three mainstream tasks, and very soon there will be five!

Of course, I owe the project’s technical lead, Gordon, huge credit here. He works tirelessly to make the platform work and brings his amazing problem-solving brain to the many conundrums we have to solve.

There is an entire underbelly to Cochrane Crowd that not many people see. Can you tell us a little about that?

You’re right; there is a lot going on behind the scenes. The main thing is the ‘agreement algorithm’ that helps to ensure that the collective decision-making process works well. What this means is that for each task we need to work out how many decisions are needed for each record, as well as the configuration or ordering of those decisions to help make sure that the records end up with the correct classification. It sounds quite geeky, and I suppose it is, but it’s needed to ensure accuracy and efficiency.
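To make the idea concrete, here is a minimal sketch of what one agreement rule could look like. It is an illustration only, not the rule Cochrane Crowd actually uses: the function name `final_classification`, the label strings, and the assumption that a record is settled once a fixed number of consecutive contributors agree are all hypothetical.

```python
from typing import Optional, Sequence


def final_classification(decisions: Sequence[str],
                         required_streak: int = 3) -> Optional[str]:
    """Return a label once `required_streak` consecutive decisions agree.

    Hypothetical rule for illustration: the order of decisions matters,
    because any disagreement restarts the count. Returns None while the
    record still needs more decisions.
    """
    streak_label: Optional[str] = None
    streak_length = 0
    for decision in decisions:
        if decision == streak_label:
            streak_length += 1
        else:
            # A different answer breaks the run and starts a new one.
            streak_label = decision
            streak_length = 1
        if streak_length >= required_streak:
            return streak_label
    return None


# Three contributors in a row agree, so the record is settled.
settled = final_classification(["RCT", "RCT", "RCT"])
# One dissenting decision breaks the run, so more decisions are needed.
pending = final_classification(["RCT", "Reject", "RCT", "RCT"])
```

Under this assumed rule, raising `required_streak` trades speed for accuracy: each record needs more decisions before it is settled, but a single mistaken click is less likely to decide the outcome.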

To work out if we’ve got the algorithm right we run various evaluations: some are formal evaluations where we take a whole load of records and send them to the Crowd, and independently send them to experts so that we can compare the final classifications from each group. We’re about to do that with our latest task: CT identification. Even though this task is very similar to the RCT identification task, the records are different enough to warrant an evaluation. Based on our findings, we’ll either continue with the algorithm or tweak it and try again. Other evaluative activities are less formal and include things like random spot checks on records.
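A formal evaluation of this kind can be summarised with two standard measures: sensitivity (the share of records the experts called relevant that the Crowd also found) and specificity (the share the experts rejected that the Crowd also rejected). The sketch below shows the arithmetic under assumed inputs; the function name `compare_to_experts`, the record IDs, and the label strings are hypothetical, and the data is made up purely to show the calculation.

```python
from typing import Dict, Optional


def compare_to_experts(crowd: Dict[str, str],
                       expert: Dict[str, str],
                       positive: str = "RCT") -> Dict[str, Optional[float]]:
    """Compare crowd labels with expert labels, record by record.

    The expert label is treated as the reference standard. A measure is
    None when it is undefined (no records of that kind in the sample).
    """
    tp = fp = tn = fn = 0
    for record_id, expert_label in expert.items():
        crowd_says_positive = crowd[record_id] == positive
        expert_says_positive = expert_label == positive
        if expert_says_positive and crowd_says_positive:
            tp += 1
        elif expert_says_positive:
            fn += 1  # the Crowd missed a record the experts included
        elif crowd_says_positive:
            fp += 1  # the Crowd included a record the experts rejected
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return {"sensitivity": sensitivity, "specificity": specificity}


# Made-up example data: five records classified by both groups.
expert_labels = {"r1": "RCT", "r2": "RCT", "r3": "Reject", "r4": "Reject", "r5": "Reject"}
crowd_labels = {"r1": "RCT", "r2": "Reject", "r3": "Reject", "r4": "Reject", "r5": "RCT"}
result = compare_to_experts(crowd_labels, expert_labels)
```

For a task like trial identification, a missed record is usually costlier than an extra one, so in a comparison like this the sensitivity figure would normally be the one to watch most closely.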

Cochrane Crowd is a citizen science platform, and yet there are some differences between Crowd and most other platforms in this genre. Can you tell us about that?

Cochrane Crowd is a bit different. The tasks on the whole are quite text-based. That does immediately make it more challenging in terms of wide appeal. I think I’d rather be out in the fresh air counting bees than reading a badly written abstract (oh dear, I’m not really selling the task, am I?!). My point is that when you have tasks like ours, you have to make everything else around the tasks as appealing as possible. For example, we are putting quite a bit of focus into the feedback we can provide to contributors. Many citizen science projects don’t provide individuals with feedback related to performance. It is challenging to do this in a live system and I don’t think we’ve quite got it right yet, but it is something we’re very aware people want and we have some exciting new features coming very soon.

What kind of impact is Cochrane Crowd having?

Another tough question! Can’t you ask me something easier like how many cats do I have? [Ed.: Nope, sorry! Readers, visit Anna’s Twitter account for some all-important cat photos!] Ultimately, we want to achieve two things with Cochrane Crowd: first, we want to help in the effort to produce good evidence quickly, and second, we want to provide people with opportunities to be a part of that effort.

In our effort to help produce good evidence quickly, we are making an impact in several ways: we’re identifying thousands of reports of randomised trials for CENTRAL, helping to enrich that critical resource; we’ve helped develop machine learning classifiers that can now do over 70% of the work we humans were previously doing, meaning that we can focus our effort on finding RCTs from other sources; we are making good inroads into doing the same for other types of studies, such as diagnostic test accuracy (DTA) studies; and we’re creating new tasks aimed at describing health research in a consistent way which I feel confident will go a long way towards enhancing trial discovery in ways that are both fast and reliable.

In terms of providing people with the opportunity to contribute, I hope we are doing that too. Thousands of people have signed up, and the rate of sign-up is increasing. However, it’s not just about sign-up. We want people to feel they really can help, and that they are helping. Part of this comes down to offering a good range of tasks and making it easy to dive into those tasks; and part of it is about what we can give back in terms of rewards. Cochrane’s new membership scheme will certainly help here. In addition, we’re developing some new training materials that I hope will give those who want to know more a chance to build up their knowledge and skills in understanding health evidence.

What do you foresee for Cochrane Crowd in the future?

I hope Cochrane Crowd will play an increasing role in the efficient identification, management and production of health evidence. I’d like to make it truly possible for anyone with an interest in health to be a part of this endeavour. We live in quite unsettling times. Whilst it has always been pretty easy to make bogus, non-evidence based claims about anything, it’s never been so easy to reach so many potential listeners or readers. That means that it’s more important than ever to counter those flimsy claims by challenging them and fighting back with evidence.

Systems like Cochrane Crowd give us a chance of keeping up with the information overload meaning that we can become more responsive to questions and claims as they arise, and quicker to answer them. What’s more, we need to stop starting again with every question we have, and instead make far better use of the intelligence that has already been generated.

I want Cochrane Crowd itself to be able to change and adapt as information needs inevitably change. This includes being able to create and scale new tasks quickly. I think all tasks on Cochrane Crowd will have a shelf life. In fact they should have. If we’re still struggling to identify RCTs in a few years from now, then we’re not doing enough to solve the underlying problems (such as poor reporting). Let’s face it, crowdsourcing RCT identification is a clever work-around for a problem that shouldn’t even exist anymore.

Looking back over these eventful 18 months with Crowd, what are the highlights for you? What have been the main challenges?

The last 18 months have seen many high points. The MedLitBlitz, our one-year birthday celebration where we teamed up with the wonderful Mark2Cure, is certainly one. The pilot work on screening for individual reviews has been exciting, with the results exceeding my expectations both in terms of uptake and quality. But probably the main, slightly unexpected outcome has been the work we’ve done with the machine-learning team. When we first started out, our aim was simply to identify reports of randomised trials for CENTRAL. Now, thanks to the Crowd, we’ve been able to build machine classifiers that can do much of the task. Human effort will always be needed but it should not be used on tasks that can be done in a fraction of the time by automation. [Ed.: I feel another blog coming on!]

Of course there have been challenging times too, and many occasions when I’ve lain awake at 3am trying to figure out how to solve some issue or other. As the platform grows, both in size and role, it becomes increasingly important that it interacts and is integrated properly with other systems and processes, many of which are also being developed. We can’t, and we shouldn’t, operate in our own little silo anymore, but integrating systems and developing workflows when one or more of those systems is still evolving is not easy.

The biggest challenge of all though is time. We have so much we want to do. These first 18 months have been extremely busy; the next 12 months will be even busier.

And finally, how much tea do you actually drink?

Haha! Too much, but now you’ve put that thought in my head, I’m off to make another one! I think I’ve earned it after all those questions.

Sign up to Cochrane Crowd, follow us on Twitter, and contact us at crowd@cochrane.org.

Compiled by Emily Steele, Cochrane Crowd Community Engagement & Partnerships Manager

Support for Project Transform was provided by Cochrane and the National Health and Medical Research Council of Australia (APP1114605). The contents of the published material are solely the responsibility of the Administering Institution, a Participating Institution or individual authors and do not reflect the views of the NHMRC.

13 December 2017