Help Researchers Sift Through Their Data with Zooniverse
Name: Zooniverse
Type: Crowdsourced Research
Best Website For: Citizen Science Projects
Reason it's on The Best Sites:
Zooniverse is a platform that lets internet users participate in academic research by having the user analyze data. The site features a slick UI and gives researchers a way of crowdsourcing their data analysis.
The Zooniverse Mobile App provides push notification updates about our projects, as well as access to our most mobile-friendly projects and publications.
Read on to find out about the findings of this project, made possible by the efforts of our fantastic citizen scientist volunteers. If you would like to learn more, you can access a preprint publication about this work here.
With thanks to Zooniverse Volunteer Becky Kennard for editing this piece.
Microscopy Masters draws to a close
Microscopy Masters, the cryo-electron microscopy (cryo-EM) project building complex 3D models of proteins, is approaching its initial conclusion. Over the two years the project has been running, we’ve collected over 17,000 classifications and built a dataset of 209,696 unique protein particles. The primary dataset used in this project was the 26S proteasome lid complex generated by the Lander Lab at the Scripps Research Institute. We have also annotated other, smaller datasets.
The proteasome is a large multi-protein complex responsible for breaking down unwanted proteins into reusable parts, kind of like a large recycling center for the cell. Studying its structure could reveal the mechanisms behind how the proteasome lid only opens for proteins marked for recycling, and give insights into problems caused by the lid malfunctioning.
This project is centered on an important tenet in biology, ‘form follows function.’ On the molecular level of biology, what this means is that the shapes of large biological molecules, such as proteins and nucleic acids, are evolved to perform specific functions. By studying and understanding the structure of biological complexes, researchers can better understand how all the little moving parts of life interact, which will allow them to better combat diseases and disorders.
Scientists are often too busy in the lab to come up with catchy names, meaning that the techniques they invent are usually given pretty self-explanatory titles. In the case of cryo-EM, everything you need to know is in the name.
Imagine some scientists are interested in studying a protein, say our subject, the 26S proteasome lid. Cells containing large quantities of the protein are lysed (a scientific way to say ‘popped like balloons’) and the contents of the cells are put into a solution. That solution is purified so that it only contains the protein the scientists are interested in. The purified solution is then flash-frozen in extremely thin ice (cryo) and put under an electron microscope (EM) to obtain images of the proteins. These images are then put through sophisticated reconstruction software to obtain a detailed 3D model of the protein. This technique is so powerful, scientists can identify individual atoms in the protein complex, giving them deep insights into how it interacts with its environment.
Figure 1 A schematic of how cryo-EM is done. Taken from dx.doi.org/10.1038/nature19948.
Of course, there have been years of work involved in the cryo-EM that I just explained in four sentences (and some very, very expensive microscopes!). A particularly time-consuming task for cryo-microscopists is picking the individual proteins from the microscopy images (micrographs), called ‘particle picking.’
Scientists used to do this by hand, but since they often have thousands of these images to process, this can take weeks of work. So, they usually rely on software to extract the protein images. But because some proteins are so complex, it can be difficult for software to identify them in the noisy micrographs. For this reason, we decided to train citizen scientists to pick the particles from our proteasome lid data and see if it could be used to build a detailed molecular model.
Figure 2 On the left is a blank micrograph, on the right is a micrograph that has had the proteins manually picked.
Using the data from our volunteers, we made a full 3D reconstruction of the proteasome lid. We compared the model to one made with an automatic particle picker, both of which are shown below. Although they look very similar, what matters for microscopists is the ‘resolution’ of the reconstructions. In this context, resolution measures how consistent the models are when the reconstruction is repeated on different portions of the data.
In this case each dataset was divided into two random halves and made into two separate models, which were then compared to determine a resolution. The resolution for the computer-made model is lower (better) in this case, but this is partly because the computer-picked dataset had so many more particles. For this reason, we also did a reconstruction using a subset of the computer-picked data with the same number of particles as the crowdsourced dataset. This brought the resolution to 4.036 Å, closer to that of the crowdsourced dataset but still lower.
Figure 3 The final reconstructions of the crowdsourced and computational datasets. The resolutions for each are listed, the lower the better.
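The size-matching and half-splitting steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `match_and_split`, the integer particle IDs and the dataset sizes are all hypothetical, and the actual reconstruction and comparison of the two half-models is done by dedicated cryo-EM software that is not shown here.

```python
import random

def match_and_split(particles, target_size, seed=0):
    """Subsample `particles` to `target_size`, then split into two random halves."""
    rng = random.Random(seed)
    # Random subset without replacement; the result is already in random order.
    subset = rng.sample(particles, target_size)
    middle = len(subset) // 2
    return subset[:middle], subset[middle:]

# Hypothetical particle IDs standing in for real picked particles.
computer_picked = list(range(1000))
half_a, half_b = match_and_split(computer_picked, target_size=400)
print(len(half_a), len(half_b))  # 200 200
```

Each half would then be reconstructed independently, so that any agreement between the two models reflects the data rather than a shared starting point.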
Even though the crowdsourced model’s resolution value is higher (worse), we believe this is a fantastic example of the power of citizen science. We built an entirely hand-picked dataset from people who had little to no experience with cryo-EM. This dataset allowed us to build a detailed 3D model of a complex protein that had a similar resolution to the one built by a team of trained scientists with state-of-the-art software.
This was the first time we have run a project of this nature, and we believe that with tweaking and better feedback systems (which were only implemented by the fantastic Zooniverse team late into our project’s run) we can process data better and faster than we did in our first run.
As a side experiment to try and figure out how to better engage users, some of our project’s participants might remember being sent newsletters about ‘sprint’ datasets, which were small datasets of 15-20 images of other proteins. The use for these was not to build an entire particle dataset, but to provide data for the researchers to feed their automatic particle picking software. We found that giving the images different color schemes than the traditional black-and-white was a nice way to ‘spice’ the micrographs up for users, and we were able to provide researchers with usable data in a matter of days that they could use to start their data processing.
Although we currently have put the Microscopy Masters project on hold, we are excited with the results and in the process of submitting a publication of our initial results. I would like to thank everybody involved with the Zooniverse for building a fantastic platform to try our project. In particular, I would like to thank the Zooniverse team for answering my questions and helping me get Microscopy Masters up and working.
And lastly, thank you to all the hard-working participants in Microscopy Masters and everybody who participates in this great website!
Find out more:
You can check more of the great work done here at the Su lab on our website.
The Lander Lab is consistently pushing the work being done in cryo-EM; go see the work they are doing at their homepage.
Our publication can be previewed as a preprint on bioRxiv.
Dr Sam Illingworth, Senior Lecturer in Science Communication and Poet, wrote the following Zooniverse-inspired poem for us: ‘Research for All’.
If you’d like to read more of his work, check out Sam’s blog here.
Research for All
Detecting bubbles in the Milky Way,
Or sorting a muon and gamma ray;
Identifying planets and their stars,
Then codifying ice geysers on Mars.
From mapping out old weather lost at sea,
To counting jungle rhythms in a tree;
With floating forests hiding in plain sight,
Sometimes research just needs a brighter light.
Etching a cell to analyse their state,
And bashing bugs to keep drugs up-to-date;
The history of what has gone before,
Can help predict what science has in store.
Transcribing ancient texts and works of art,
Unearthing words that set Shakespeare apart;
Revealing secret lives and hidden gist,
By searching for what others might have missed.
In answering the questions left to find,
We need the help of more than just one mind;
A Universe of projects yet to do,
The door is open, step into the zoo.
Following on from the release of Panoptes Client 1.0 for Python, we’ve just released version 1.0 of the Panoptes CLI. This is a command-line client for managing your projects, because some things are just easier in a terminal! The CLI lets you do common project management tasks, such as activating workflows, linking subject sets, downloading data exports, and uploading subjects. Let’s jump in with a few examples.
First, downloading a classification export (obviously you’d insert your own project ID and a filename of your choice):
panoptes project download 764 Downloads/pulsar-hunters-classifications.csv
This command will optionally generate a new export and wait for it to be ready before downloading. No more waiting for the notification email!
New subjects can be uploaded to a new subject set like so (again, inserting your own IDs):
panoptes subject-set create 7 "November 2017 subjects"
panoptes subject-set upload-subjects 16401 manifest.csv
You can also pipe the output from the CLI into other standard commands to do more powerful things, such as linking every subject set in your project to a workflow using the xargs command (where 1234 and 5678 are your project ID and workflow ID respectively):
panoptes subject-set ls -q -p 1234 | xargs panoptes workflow add-subject-sets 5678
(Photos courtesy of the Etch A Cell team)
The event was organised by the ConSciCom team who have partnered with the Zooniverse to create two very successful projects – Science Gossip and Orchid Observers. The theme for the evening was to explore the role images, such as illustrations and photographs, have played within natural history and scientific research.
From studying animal behaviour using photos taken by camera traps, to advancing our understanding of cell biology with photos from microscopes, many Zooniverse projects improve our understanding of the world around us through the help of citizen scientist volunteers.
Teams from multiple Zooniverse projects, including BashTheBug, Etch A Cell, Notes from Nature, Orchid Observers, Science Gossip and Seabird Watch, attended the event and spent the evening speaking to people about their projects, and showing how anyone can contribute to real research through citizen science.
(Photos courtesy of the Etch A Cell team and Jim O’Donnell)
Illustrator Dr Makayla Lewis led a live gallery drawing event, asking visitors to pick up a pencil and spend 15 minutes sketching their favourite exhibits.
(Photos courtesy of Jim O’Donnell)
Thanks to everyone who got involved, including Fiona (Penguin Watch), Freddie (University of Oxford), Jim (Zooniverse Developer), Makayla (Illustrator), Martin (Etch A Cell), Nathan (University of Oxford) and Phil (BashTheBug), and especially all our volunteers who attended the event!
I’m happy to announce that the Panoptes Client package for Python has finally reached version 1.0, after nearly a year and a half of development. With this package, you can automate the management of your projects, including uploading subjects, managing subject sets, and downloading data exports.
There’s still more work to do – I have lots of additional features and improvements planned for version 1.1 – but with the release of version 1.0, the Client has a stable set of core features which are useful for managing projects (both large and small).
I know a lot of people have already been using the 0.x versions while we’ve been working on them, so thanks to everyone who submitted feature requests, bug reports, and pull requests on GitHub. Please do upgrade to the latest version to make sure you have the latest bug fixes, and keep the requests and bug reports coming!
Below is a guest blog post from Dr Philip Fowler, lead researcher on our award-winning biomedical research project Bash the Bug. Read on to find out more about this project and how you can get involved!
Our bug-squishing project, BashTheBug, was six months old this month. Since launching on 7th April 2017, over seven thousand Zooniverse volunteers have contributed nearly half a million classifications between them, making 58 classifications per person, on average.
The bugs our volunteers have been bashing are the bacteria responsible for tuberculosis (TB), Mycobacterium tuberculosis. Many people think of TB as a disease of the past, to be found only in the books of Charles Dickens. However, the reality is quite different: TB is now responsible for more deaths each year than HIV/AIDS, and in 2015 it killed 1.8 million people. To make matters worse, like all other bacterial diseases, TB is evolving resistance to the antibiotics used to treat it. It is this problem that inspired the BashTheBug project, which aims to improve both the diagnosis and treatment of TB.
At the heart of this project is the simple idea that, in order to find out which antibiotics are effective at killing a particular TB strain, we have to try growing that strain in the presence of a range of antibiotics at different doses. If an antibiotic stops the bacterium growing at a dose that can be used safely within the human body, then bingo! That antibiotic can be used to treat that strain. To make doing this simpler, the CRyPTIC project (an international consortium of TB research institutions) has designed a 96-well plate with 14 different anti-TB drugs freeze-dried, at a range of doses, to the bottoms of its wells.
Figure 1. A 96-well microtitre plate
These plates are common in science and are about the size of a large mobile phone. When a patient comes into clinic with TB, a sample of the bacterium they are infected with is taken, grown for a couple of weeks and then some is added to each of the 96 wells. The plate is then incubated for two weeks, and then examined to see which wells have TB growing in them and which do not. As each antibiotic is included on the plate at different doses, it is possible to work out the minimum concentration of antibiotic that stops the bug from growing.
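As a rough illustration of that last step, here is a minimal Python sketch that finds the lowest dose at or above which no well shows growth. The function name, the dose values and the growth readings are all made up for the example; real plate layouts and units will differ.

```python
def minimum_inhibitory_dose(doses, growth):
    """Return the lowest dose at or above which no well shows growth, else None.

    `doses` and `growth` are parallel lists for one antibiotic: the dose in
    each well and whether bacteria were seen growing there.
    """
    pairs = sorted(zip(doses, growth))
    result = None
    # Walk down from the highest dose and stop at the first well with growth.
    for dose, grew in reversed(pairs):
        if grew:
            break
        result = dose
    return result

# Hypothetical readings for one antibiotic (doses in arbitrary units).
doses = [0.25, 0.5, 1.0, 2.0, 4.0]
growth = [True, True, False, False, False]
print(minimum_inhibitory_dose(doses, growth))  # 1.0
```

Scanning from the highest dose downward means a stray clear well at a low dose, below a well that still shows growth, is not mistaken for the answer.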
But why are we doing this? Well, the genome of each TB sample will also be sequenced. This will allow us to build two large datasets; one of the mutations in the TB genome and another listing which antibiotics work for each sample (and which do not). Using these two datasets, we will then be able to infer which genetic mutations are responsible for resistance to specific antibiotics. With me still? Good. This will give researchers a large and accurate catalogue that would allow anyone to predict which antibiotics would work on any TB infection, simply by sequencing its genome. This is particularly important for the diagnosis and treatment of TB; currently used approaches are notoriously slow, taking up to eight weeks to identify which antibiotics can be used for effective treatment. If you were a clinician would you want to wait two months before starting your patient on treatment? Of course not.
Figure 2. A photograph of M. tuberculosis that has been growing on a plate for two weeks.
You might scoff at this point and say, pah, using genetics like this in hospitals will never happen. Well it already is. Since March 2017, all routine testing for Tuberculosis in England has been done by sequencing the genome of each sample that is sent to either of the two Public Health England reference laboratories. A report is returned to the clinician in around 9 days. Surprisingly, this costs less than the old, traditional methods for TB diagnosis and treatment. Sequencing TB samples also provides other valuable information, for example, you can compare the genomes of different infections to determine if an outbreak is underway, at no extra cost.
So far, so good. The main challenge to this project, though, is size. We will be collecting around 100,000 samples from people with TB from around the world between now and 2020. Every single sample will have its genome sequenced and its susceptibility to different antibiotics tested on our 96-well plates. Each of these plates then needs to be looked at, and any errors or inconsistencies in how this huge number of 96-well plates is read could lead to false conclusions about which mutations confer resistance, and which don’t.
This problem is why we need your help! You might not be clinical microbiologists (although a few of you no doubt are!) but there are many, many more of you than we have experienced and trained scientists. In fact, each plate will only be looked at by one, maybe two, scientists, and so it is highly likely that, without the help of volunteers, our final dataset will be riven with differences due to how different people in different labs have read the plates. The inconvenient truth, however much we’d like to think otherwise, is that staring at a small white circle and deciding whether there is any M. tuberculosis growing or not is a highly subjective task. Take a look at the strip of wells below – the two wells in the top left have no antibiotic at all so give you an idea of how this strain of TB grows normally.
Figure 3. Is there a dose above which the bacteria doesn’t grow?
In the BashTheBug project, you are asked if there is a dose of antibiotic above which the bacteria don’t grow. If you think there is, you are then asked the number of the first well that doesn’t have any TB growing. For the example image above, I might be cautious and say, well, I can see that there appears to be less and less growth as we go to the right and the dosage increases, but it never entirely goes away; there is a very, very faint dot in well #8. So I’m going to say that actually I think there is bacterial growth in all eight wells. You might be optimistic (or even just in a good mood) and disagree with me and say, yes, but by the time you get to well #6, that dot is so small compared to the growth in the control wells, either the antibiotic is doing its job, or, you know what, I’m not convinced that the dot isn’t some sediment or something else entirely.
There is no correct answer. We are probably both right to some extent; there IS something in well #8, but maybe this antibiotic would still be an effective treatment as it would be able to kill enough of the bacteria for your immune system to then be able to kill off the remainder of the infection. Therefore, the aim of BashTheBug is to identify the antibiotic dose that multiple people agree is the dose above which the bacteria no longer grow. Our result from this project is the consensus we get from showing each image to multiple people. Yes, the volunteers might, on average, take a slightly different view to an experienced clinical microbiologist, but that doesn’t matter as they will, on average, be consistent across all the plates, which is vital if we are to uncover which genetic mutations confer resistance to antibiotics.
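To make the idea of a consensus concrete, here is a minimal Python sketch that takes the answers several volunteers gave for one image and reports the most common one, along with the share of volunteers who chose it. This is an assumption for illustration only: the post does not say which aggregation rule the project actually uses, and the names `consensus` and `GROWTH_IN_ALL_WELLS` are hypothetical.

```python
from collections import Counter

# Hypothetical marker for the answer "there is growth in every well".
GROWTH_IN_ALL_WELLS = "growth in all wells"

def consensus(answers):
    """Return the most common answer and the share of volunteers who gave it.

    Ties are broken in favour of the answer that was seen first.
    """
    if not answers:
        raise ValueError("need at least one answer")
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / len(answers)

# Seven made-up volunteer answers for one strip of wells: each is the number
# of the first well without growth, or the marker above.
answers = [6, 6, 7, 6, GROWTH_IN_ALL_WELLS, 6, 7]
answer, share = consensus(answers)
print(answer, round(share, 2))  # 6 0.57
```

Reporting the share alongside the answer lets images with weak agreement be flagged for a closer look rather than treated the same as clear-cut ones.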
None of this would be possible without the hard work of all our volunteers. So, if you’ve done any classifications, thank you for all your help. Here’s to another six months, many more classifications, and the first results from the hard work done by the many volunteers who have taken part in the project to date.
Find out more:
- Contribute to the project here
- Read the official BashTheBug blog here
- Follow @BashTheBug on Twitter here
- BashTheBug won the Online Community Award of the NIHR Let’s Get Digital Competition, read more here
Check out other coverage of BashTheBug:
- BBC Radio Oxford
- AAAS Science Update
- Global Health Diagnostics
- University of Oxford
- Wellcome Trust
Our inaugural Chicago-area meetup was great fun! Zooniverse volunteers came to the Adler Planetarium, home base for our Chicago team members, to meet some of the Adler Zooniverse web development team and talk to Chicago-area researchers about their Zooniverse projects.
- Zooniverse Highlights and Thank You! (Laura Trouille, co-I for Zooniverse and Senior Director for Citizen Science at the Adler Planetarium)
- Chicago Wildlife Watch (Liza Lehrer, Assistant Director, Urban Wildlife Institute, Lincoln Park Zoo)
- Gravity Spy (Sarah Allen, Zooniverse developer, supporting the Northwestern University LIGO team)
- Microplants (Matt Von Konrat, Head of Botanical Collections, Field Museum)
- Steelpan Vibrations (Andrew Morrison, Physics Professor, Joliet Junior College)
- Wikipedia Gender Bias (Emily Temple Wood, medical student, Wikipedia Editor, Zooniverse volunteer)
- In-Person Zooniverse Volunteer Opportunities at the Adler Planetarium (Becky Rother, Zooniverse designer)
Researchers spoke briefly about their projects and how they use the data and ideas generated by our amazing Zooniverse volunteers in their work. Emily spoke of her efforts addressing gender bias in Wikipedia. We then took questions from the audience and folks chatted in small groups afterwards.
The event coincided with Adler Planetarium’s biennial Member’s Night, so Zooniverse volunteers were able to take advantage of the museum’s “Spooky Space” themed activities at the same time, which included exploring the Adler’s spookiest collection pieces, making your own spooky space music, and other fun. A few of the Zooniverse project leads also led activities: playing Andrew’s steel pan drum, interacting with the Chicago Wildlife Watch’s camera traps and other materials, and engaging guests in classifying across the many Zooniverse projects. There was also a scavenger hunt that led Zooniverse members and Adler guests through the museum, playing on themes within the exhibit spaces relating to projects within the Zooniverse mobile app (iOS and Android).
We really enjoyed meeting our volunteers and seeing the conversation flow between volunteers and researchers. We feel so lucky to be part of this community and supporting the efforts of such passionate, interesting people who are trying to do good in the world. Thank you!
Have you hosted a Zooniverse meetup in your town? Would you like to? Let us know!
The following post is by Dr Brooke Simmons, who has been leading the Zooniverse efforts to help in the aftermath of the recent Caribbean storms.
This year has seen a particularly devastating storm season. As Hurricane Irma was picking up steam and moving towards the Caribbean, we spoke to our disaster relief partners at Rescue Global and in the Machine Learning Research Group at Oxford and decided to activate the Planetary Response Network. We had previously worked with the same partners for our responses to the Nepal and Ecuador earthquakes in 2015 and 2016, and this time Rescue Global had many of the same needs: maps of expected and observed damage, and identifications of temporary settlements where displaced people might be sheltering.
The Planetary Response Network is a partnership with many people and organizations and which uses many sources of data; the Zooniverse volunteers are at its heart. The first cloud-free data available following the storm was of Guadeloupe, and our community examined pre-storm and post-storm images, marking building damage, flooding, impassable roads and signs of temporary structures. The response to our newsletter was so strong that the first set of data was classified in just 2 hours! And as more imaging has become available, we’ve processed it and released it on the project. By the time Hurricane Maria arrived in the Caribbean, Zooniverse volunteers had classified 9 different image sets from all over the Caribbean, additionally including Turks and Caicos, the Virgin Islands (US and British), and Antigua & Barbuda. That’s about 1.5 years’ worth of effort, if it was 1 person searching through these images as a full-time job. Even with a team of satellite experts it would still take much longer to analyze what the Zooniverse volunteers collectively have in just days. And there’s still more imaging: the storms aren’t over yet.
We’ve been checking in every day with Rescue Global and our Machine Learning collaborators to get feedback on how our classifications are being used and to refresh the priority list for the next set of image targets. As an example of one of those adjustments, yesterday we paused the Antigua & Barbuda dataset in order to get a rapid estimate of building density in Puerto Rico from images taken just before Irma and Maria’s arrival. We needed those because, while the algorithms used to produce the expected damage maps do incorporate external data like Census population counts and building information from OpenStreetMap, some of that data can be incomplete or out of date (like the Census, which is an excellent resource but which is many years old now). Our volunteers collectively provided an urgently needed, uniformly-assessed and up-to-date estimate across the whole island in a matter of hours, and that data is now being used to make expected damage maps that will be delivered to Rescue Global before the post-Maria clouds have fully cleared.
Even though the project is still ongoing and we don’t have full results yet, I wanted to share some early results of the full process and the feedback we’ve been getting from responders on the ground. One of our earliest priorities was St. Thomas in the USVI, because we anticipated it would be damaged but other crowdsourcing efforts weren’t yet covering that area. From your classifications we made a raw map of damage markings. Here’s structural damage:
The gray stripe was an area of clouds and some artifacts. You can get an idea from this of where there is significant damage, but it’s raw and still needs further processing. For example, in the above map, damage marked as “catastrophic” is more opaque so will look redder, but more individual markings of damage in the same place will also stack to look redder, so it’s hard to tell the difference in this visualization between 1 building out of 100 that’s destroyed and 100 buildings that all have less severe damage. The areas that had clouds and artifacts also weren’t completely unclassifiable, so there are still some markings in there that we can use to estimate what damage might be lurking under the clouds. Our Machine Learning partners incorporate these classifications and the building counts provided by our project as well as by OpenStreetMap into code that produces a “heat map” of structural damage that helps responders understand the probability and proportion of damage in a given area as well as how bad the damage is:
In the heat map, the green areas are where some damage was marked, but at a low level compared to how many buildings are in the area. In the red areas, over 60% of the buildings present were marked as damaged. (Pink areas are intermediate between these.)
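As a simplified illustration of that color scale, the Python sketch below turns a count of damaged buildings and a total building count into one of the three colors. It is not the code our Machine Learning partners use, which also estimates probabilities; the function name is hypothetical, and the 20% boundary between green and pink is an assumed value, since only the 60% threshold for red is given above.

```python
def damage_color(damaged, buildings, low_cutoff=0.2, high_cutoff=0.6):
    """Map the fraction of damaged buildings in an area to a heat-map color.

    `high_cutoff` follows the text (red above 60%); `low_cutoff` is an
    assumed value chosen only for this example.
    """
    if buildings <= 0 or damaged <= 0:
        return None  # nothing marked, or no buildings known in this area
    fraction = damaged / buildings
    if fraction > high_cutoff:
        return "red"
    if fraction > low_cutoff:
        return "pink"
    return "green"

print(damage_color(1, 100))   # green
print(damage_color(35, 100))  # pink
print(damage_color(70, 100))  # red
```

Dividing by the building count is what separates 1 damaged building out of 100 from an area where most buildings were marked, which the raw stacked markings cannot show.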
With volunteer classifications as inputs, we were able to deliver maps like this (and similar versions for flooding, road blockage, and temporary shelters) for every island we classified. We also incorporated other efforts like those of Tomnod to map additional islands, so that we could keep our focus on areas that hadn’t yet been covered while still providing as much accurate information to responders as possible.
Feedback from the ground has been excellent. Rescue Global has been using the maps to help inform their resource allocation, ranging from where to deliver aid packages to where to fly aerial reconnaissance missions (fuel for flights is a precious commodity, so it’s critical to know in advance which areas most need the extra follow-up). They have also shared the heat maps with other organizations providing response and aid in the area, so Zooniverse volunteers’ classifications are having an extended positive effect on efforts in the whole region. And there has been some specific feedback, too. This message came several days ago from Rebekah Yore at Rescue Global:
In addition to supplying an NGO with satellite communications on St Thomas island, the team also evacuated a small number of patients with critical healthcare needs (including a pregnant lady) to San Juan. Both missions were aided by the heat maps.
To me, this illustrates what we can all do together. Everyone has different roles to play here, from those who have a few minutes a day to contribute to those spending hours clicking and analyzing data, and certainly including those spending hours huddled over a laptop in a temporary base camp answering our emailed questions about project design and priorities while the rescue and response effort goes on around them. Without all of them, none of this would be possible.
We’re still going, now processing images taken following Hurricane Maria. But we know it’s important that our community be able to share the feedback we’ve been receiving, so even though we aren’t finished yet, we still wanted to show you this and say: thank you.
Now that the project’s active response phase has completed, we have written a further description of how the maps our volunteers helped generate were used on the project’s Results page. Additionally, every registered volunteer who contributed at least 1 classification to the project during its active phase is credited on our Team page. Together we contributed nearly 3 years’ worth of full-time effort to the response, in only 3 weeks.
The Planetary Response Network has been nurtured and developed by many partners and is enabled by the availability of pre- and post-event imagery. We would like to acknowledge them:
- Firstly, our brilliant volunteers. To date on this project we have had contributions from about 10,000 unique IP addresses, of which about half are from registered Zooniverse accounts.
- The PRN has been supported by Imperative Space and European Space Agency as part of the Crowd4Sat programme. Any views expressed on this website shall in no way be taken to represent the official opinion of ESA.
- The development of the current Zooniverse platform has been supported by a Google Global Impact award and the Alfred P. Sloan Foundation.
- We are grateful to Patrick Meier and QCRI for their partnership in the development of PRN.
- We are grateful to those whose counsel (and data!) we have been fortunate to receive over the years: the Humanitarian OpenStreetMap Team, the Standby Task Force, Tomnod.
- We are grateful to our imagery providers:
  - Planet has graciously provided images to the PRN in each of our projects. (Planet Team 2017 Planet Application Program Interface: In Space For Life on Earth. San Francisco, CA. https://api.planet.com, License: CC-BY-SA)
  - DigitalGlobe provides high-resolution imagery as part of their Open Data Program (Creative Commons Attribution Non Commercial 4.0).
  - Thanks to the USGS for making Landsat 8 images publicly available.
  - Thanks to ESA for making Sentinel-2 images publicly available.
  - Thanks to Amazon Web Services’ Open Data program for hosting Sentinel-2 and Landsat 8 images, both of which were used in this project (and sourced via AWS’ image browser and servers).
- We’d also like to thank several individuals:
  - Everyone at Rescue Global, but particularly Hannah Pathak and Rebekah Yore for patiently answering our questions and always keeping the lines of communication open;
  - Steve Reece in Oxford’s ML group for burning the midnight oil;
  - The Zooniverse team members, who are absolute stars for jumping in and helping out at a moment’s notice.
The Zooniverse has again been asked to enable The Planetary Response Network – this time in response to Hurricane Irma. Irma has brought widespread devastation to many islands in the Caribbean over the last few days, and now Hurricane Jose is a growing threat in the same region. By analysing images of the stricken areas captured by ESA’s Sentinel-2 satellites, Zooniverse volunteers can provide invaluable assistance to rescue workers. Rescue Global are a UK-based disaster risk reduction and response charity who are deploying a team to the Caribbean and will use the information you provide to help them assess the situation on the ground. The last time The Planetary Response Network was brought online was to help in the aftermath of the 2016 Ecuador Earthquake. Back then over two thousand volunteers helped analyse almost 25,000 square kilometres of satellite imagery in only 12 hours, and we hope to be of help this time too! Right now we have limited clear-sky images of the affected area, mostly around Guadeloupe, but we are working hard to upload images from the other islands as soon as possible. Join the effort right now at www.planetaryresponsenetwork.org.
Below is the first in a series of guest blog posts from researchers working on one of our recently launched biomedical projects, Etch A Cell.
Read on to let Dr Martin Jones tell you about the work they’re doing to further understanding of the universe inside our cells!
Having trained as a physicist, with many friends working in astronomy, I’ve been aware of Galaxy Zoo and the Zooniverse from the very early days. My early research career was in quantum mechanics, unfortunately not an area where people’s intuitions are much use! However, since I found myself working in biology labs, now at the Francis Crick Institute in London, I have been working in various aspects of microscopy – a much more visual enterprise and one where human analysis is still the gold standard. This is particularly true in electron microscopy, where the busy nature of the images means that many regions inside a cell look very similar. In order to make sense of the images, a person is able to assimilate a whole range of extra context and previous knowledge in a way that computers, for the most part, are simply unable to do. This makes it a slow and labour-intensive process. As if this wasn’t already a hard enough problem, in recent years it has been compounded by new technologies that mean the microscopes now capture images around 100 times faster than before.
Focused ion beam scanning electron microscope
Ten years ago it was more or less possible to manually analyse the images at the same rate as they were acquired, keeping the in-tray and out-tray nicely balanced. Now, however, that’s not the case. To illustrate that, here’s an example of a slice through a group of cancer cells, known as HeLa cells:
We capture an image like this and then remove a very thin layer – sometimes as thin as 5 nanometres (one nanometre is a billionth of a metre) – and then repeat… a lot! Building up enormous stacks of these images can help us understand the 3D nature of the cells and the structures inside them. For a sense of scale, this whole image is about the width of a human hair, around 80 millionths of a metre.
Zooming in to one of the cells, you can see many different structures, all of which are of interest to study in biomedical research. For this project, however, we’re just focusing on the nucleus for now. This is the large mostly empty region in the middle, where the DNA – the instruction set for building the whole body – is contained.
By manually drawing lines around the nucleus on each slice, we can build up a 3D model that allows us to make comparisons between cells, for example understanding whether a treatment for a disease is able to stop its progression by disrupting the cells’ ability to pass on their genetic information.
Animated gif of 3D model of a nucleus
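To make the slice-by-slice idea concrete, here is a minimal sketch of how hand-drawn outlines could be turned into a volume estimate. It assumes each outline is a closed polygon given as (x, y) vertices in nanometres and that every slice has the same thickness; the function names and the toy square outline are hypothetical, and the source does not describe the software actually used to build the 3D models.

```python
def polygon_area(points):
    """Area of a simple closed polygon from its (x, y) vertices (shoelace formula)."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def stack_volume(outlines, slice_thickness_nm):
    """Approximate volume in nm^3, treating each outlined slice as a thin slab."""
    return sum(polygon_area(outline) for outline in outlines) * slice_thickness_nm


# Hypothetical data: the same 1000 nm x 1000 nm square outlined on three slices,
# each slice 5 nm thick (the thinnest layer mentioned above).
square = [(0, 0), (1000, 0), (1000, 1000), (0, 1000)]
volume_nm3 = stack_volume([square, square, square], slice_thickness_nm=5)
print(volume_nm3)  # 15000000.0
```

A real nucleus outline would have many more vertices and would change shape from slice to slice, but the bookkeeping is the same: one area per slice, multiplied by the slice thickness and summed.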
However, images are now being generated so rapidly that the in-tray is filling too quickly for the standard “single expert” method – one sample can produce up to a terabyte of data, made up of more than a thousand 64-megapixel images captured overnight. We need new tricks!
Why citizen science?
With all of the advances in software that are becoming available, you might think that automating image analysis of this kind would be quite straightforward for a computer. After all, people can do it relatively easily. Even pigeons can be trained in certain image analysis tasks (http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0141357)! However, there is a long history of underestimating just how hard it is to automate image analysis with a computer. Back in the very early days of artificial intelligence in 1966 at MIT, Marvin Minsky (who also invented the confocal microscope) and his colleague Seymour Papert set the “summer vision project”, which they saw as a simple problem to keep their undergraduate students busy over the holidays. Many decades later we’ve discovered it’s not that easy!
Our project, Etch A Cell, is designed to allow citizen scientists to draw segmentations directly onto our images in the Zooniverse web interface. The first task we have set is to mark the nuclear envelope that separates the nucleus from the rest of the cell – a vital structure where defects can cause serious problems. These segmentations are extremely useful in their own right for helping us understand the structures, but citizen science offers something beyond the already lofty goal of matching the output of an expert. By allowing several people to annotate each image, we can see how the lines vary from user to user. This variability gives insight into the certainty that a given pixel or region belongs to a particular object, information that simply isn’t available from a single line drawn by one person. Disagreement between experts is, unfortunately, not unheard of!
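One simple way to picture this per-pixel certainty is as the fraction of annotators who included each pixel. The sketch below assumes every volunteer's outline has already been filled in and rasterised into a binary mask of the same size (1 = inside the nucleus); the function names, the 50% cut-off and the tiny 3x3 masks are illustrative assumptions, not the aggregation method the project actually uses.

```python
def agreement_map(masks):
    """Fraction of annotators (0.0 to 1.0) who marked each pixel as inside."""
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [sum(mask[r][c] for mask in masks) / n for c in range(cols)]
        for r in range(rows)
    ]


def consensus_mask(masks, min_fraction=0.5):
    """Keep a pixel if at least min_fraction of annotators marked it."""
    return [
        [1 if value >= min_fraction else 0 for value in row]
        for row in agreement_map(masks)
    ]


# Hypothetical masks from three volunteers for the same 3x3 image patch.
volunteer_a = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
volunteer_b = [[0, 1, 1], [0, 1, 0], [0, 0, 0]]
volunteer_c = [[1, 1, 1], [0, 1, 0], [0, 0, 0]]

agreement = agreement_map([volunteer_a, volunteer_b, volunteer_c])
consensus = consensus_mask([volunteer_a, volunteer_b, volunteer_c])
print(consensus)  # [[0, 1, 1], [0, 1, 0], [0, 0, 0]]
```

Pixels where everyone agrees get a value of 1.0 or 0.0, while the disputed pixels near the boundary land in between, which is exactly the uncertainty information a single outline cannot provide.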
The images below show preliminary results with the expert analysis on the left and a combination of 5 citizen scientists’ segmentations on the right.
Example of expert vs. citizen scientist annotation
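A comparison like this can also be put into a number. A common choice in image segmentation is intersection over union (IoU): the area both masks agree on, divided by the area covered by either. The source does not say which measure the team uses, so the sketch below is only an illustration, with hypothetical 3x3 masks standing in for the expert and combined volunteer segmentations.

```python
def intersection_over_union(mask_a, mask_b):
    """IoU of two equally sized binary masks (1.0 means identical regions)."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    # Two empty masks are treated as a perfect match.
    return intersection / union if union else 1.0


# Hypothetical masks: an expert segmentation and a combined volunteer one.
expert = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
volunteers = [[0, 1, 1], [0, 1, 0], [0, 0, 0]]

iou = intersection_over_union(expert, volunteers)
print(iou)  # 0.75
```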
In fact, we can go even further to maximise the value of our citizen scientists’ work. The field of machine learning, in particular deep learning, has burst onto the scene in several sectors in recent years, revolutionising many computational tasks. This new generation of image analysis techniques is much more closely aligned with how animal vision works. The catch, however, is that the “learning” part of machine learning often requires enormous amounts of time and resources (remember you’ve had a lifetime to train your brain!). To train such a system, you need a huge supply of so-called “ground truth” data, i.e. something that an expert has pre-analysed and can provide the correct answer against which the computer’s attempts are compared. Picture it as the kind of supervised learning that you did at school: perhaps working through several old exam papers in preparation for your finals. If the computer is wrong, you tweak the setup a bit and try again. By presenting thousands or even millions of images and ensuring your computer makes the same decision as the expert, you can become increasingly confident that it will make the correct decision when it sees a new piece of data. Using the power of citizen science will allow us to collect the huge amounts of data that we need to train these deep learning systems, something that would be impossible by virtually any other means.
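The "compare with the expert, tweak, try again" loop can be shown in miniature. The sketch below uses a perceptron, one of the oldest and simplest learning rules, which is far simpler than the deep learning systems described above and is swapped in purely for illustration. It assumes a single made-up feature per pixel (brightness between 0 and 1) and hypothetical expert labels (1 = nucleus, 0 = not nucleus).

```python
def predict(weight, bias, brightness):
    """The computer's guess: 1 (nucleus) if the weighted score is positive."""
    return 1 if weight * brightness + bias > 0 else 0


def train(examples, max_passes=100):
    """Perceptron rule: adjust the model only when it disagrees with the expert."""
    weight, bias = 0.0, 0.0
    for _ in range(max_passes):
        mistakes = 0
        for brightness, expert_label in examples:
            error = expert_label - predict(weight, bias, brightness)
            if error != 0:
                # Wrong answer: tweak the setup a bit and carry on.
                weight += error * brightness
                bias += error
                mistakes += 1
        if mistakes == 0:  # a full pass with no disagreements, so stop
            break
    return weight, bias


# Hypothetical ground truth: (pixel brightness, expert label).
ground_truth = [(0.1, 0), (0.2, 0), (0.3, 0), (0.7, 1), (0.8, 1), (0.9, 1)]
weight, bias = train(ground_truth)

# Previously unseen pixels.
print(predict(weight, bias, 0.85), predict(weight, bias, 0.15))
```

With only six examples the model settles quickly, but the same principle explains why real systems need so much labelled data: the more varied the examples the computer has been corrected on, the more confident you can be in its answer for an image it has never seen.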
We are now busily capturing images that we plan to upload to Etch A Cell to allow us to analyse data from a range of experiments. Differences in cell type, sub-cellular organelle, microscope, sample preparation and other factors mean the images can look different across experiments, so analysing cells from a range of different conditions will allow us to build an atlas of information about sub-cellular structure. The results from Etch A Cell will mean that whenever new data arrives, we can quickly extract information that will help us work towards treatments and cures for many different diseases.