Our galaxy, the Milky Way, is on a crash course with the neighboring Andromeda galaxy. The collision won’t occur for around 5 billion years, but it’s possible that such crashes, or mergers, play some part in “quenching” star-forming spiral galaxies, turning them into dead elliptical galaxies. Theory suggests that this process takes place, but scientists are still gathering the observational evidence to support it.
Mergers aren’t the only possible reason that galaxies stop forming stars. Supernova explosions and actively feeding supermassive black holes (known as active galactic nuclei, or AGN) are also on the suspect list. So how to solve this intergalactic whodunnit? Astronomer Laura Trouille has mustered a posse of citizen scientists to help find out.
Finding motivated members of the public to participate in a crowdsourced science project wasn’t going to be sufficient to make the project successful. “We really needed to put the technology in place that would enable and support what people were doing,” says Trouille. “It’s really important to any science collaboration to be able to discuss the plots, analyze them together as a team, and move forward as a team. Those tools needed to be in place before we could even carry out an experiment like this.” So the team developed GitHub-inspired software called Zooniverse Tools to let citizen and “salaried” scientists collaborate.
Trouille is an astronomer working at Northwestern University and the Adler Planetarium and her groundbreaking project is called Galaxy Zoo Quench. In certain fields like astronomy, there is just too much data for professional scientists to analyze, even with help from automated tools. Some problems require a human review step before certain types of analysis can begin.
The original Galaxy Zoo project, which started in 2007, harnessed human beings to classify images of galaxies, for example to identify whether a galaxy is spiral or elliptical. In the project’s first year, 150,000 people contributed 50 million classifications. Some project participants didn’t stop at classification. A Dutch schoolteacher named Hanny van Arkel noticed a set of small, round, green objects, which fellow participants helped to analyze. These previously unknown galaxies later became known as green peas. Van Arkel also discovered another new type of object, a ghost remnant of an active black hole interacting with the gas it stripped away from its original host galaxy. It’s now known as Hanny’s Voorwerp (Hanny’s object).
Galaxy Zoo’s intrepid citizen scientists organized themselves via discussion boards to work on their discoveries. Galaxy Zoo Quench wants to formally involve them in the entire scientific process. Volunteers not only classify galaxies but also analyze the results in collaboration with the professional science team, and will write a scientific paper based on those results.
To make all this possible, the scientists, salaried and otherwise, needed new software. Zooniverse had access to lots of additional data about galaxies which wasn’t necessary to run the core Galaxy Zoo classification project. Being able to query and analyze this data would allow participants to independently investigate questions like the correlation between galaxies of a particular type and the age of the black holes at their centers.
Arfon Smith is one of the founders of Zooniverse, the website which hosts Galaxy Zoo and dozens of other citizen science projects, and he also served as its CTO. “We had nowhere to expose that information to people,” says Smith. “So out of this came a desire to build an environment where you could start to delve into some of the data behind the projects and that’s basically what Tools is. There are lots of common tasks: plotting, tables, maps. When you have grabbed some data and plotted it, the dashboard is shareable within the community. Anyone who gets that URL gets a clone of that thing. There’s always going to be a point where you want to pull down the data and do something in R or Matlab, but there’s a whole bunch you can do that’s quick and easy and will help you understand if what you are looking at is interesting. That’s where Tools gets us to.”
Zooniverse was already using MongoDB to store the heterogeneous data required for its various projects, and exposed a common API which those projects used. Zooniverse Tools was built on top.
Galaxy Zoo Quench is using Zooniverse Tools and a collaborative writing environment called Authorea (also used by physicists at CERN) to enable its scientific work. The research question is similar in difficulty to a master’s project that might serve as the starting point for a PhD.
The first stage, like Galaxy Zoo, was classification. Post-quenched galaxies are rare, but Galaxy Zoo Quench created a sample of 3,002 quenched galaxies and a control sample of the same number of non-quenched galaxies with a similar total stellar mass and redshift (light from an object moving away from the observer appears to increase in wavelength, shifting toward the red end of the spectrum). 1,200 participants answered a series of questions that helped to identify the characteristics of the galaxies’ morphologies. A smaller group of around 250 participants then started to analyze them using Zooniverse Tools.
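The article does not say how the control sample was chosen. One common way to build a control sample with "similar" mass and redshift is nearest-neighbor matching, and the sketch below illustrates that idea only. The function name, the tuple layout, and the two scale parameters are assumptions made for this example, not part of the Galaxy Zoo Quench pipeline.

```python
import math

def match_controls(targets, candidates, mass_scale=0.1, z_scale=0.01):
    """Greedy nearest-neighbor matching without replacement.

    Each galaxy is a (log_stellar_mass, redshift) tuple. The scales
    (hypothetical values) set how much a difference in each quantity
    counts toward the distance. Returns one candidate index per target.
    Raises ValueError if there are fewer candidates than targets.
    """
    available = list(range(len(candidates)))
    matches = []
    for log_mass, z in targets:
        # Pick the unused candidate closest in scaled (mass, redshift) space.
        best = min(
            available,
            key=lambda i: math.hypot(
                (candidates[i][0] - log_mass) / mass_scale,
                (candidates[i][1] - z) / z_scale,
            ),
        )
        available.remove(best)
        matches.append(best)
    return matches

# Made-up values purely for illustration.
quenched = [(10.5, 0.05), (11.0, 0.10)]
pool = [(11.02, 0.101), (9.0, 0.2), (10.48, 0.052)]
print(match_controls(quenched, pool))
```

Matching without replacement keeps the control sample the same size as the quenched sample, which is the property the article describes.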
There was a huge range in the ability and experience of the volunteers when it came to exploring the data. Some volunteers seemed to be immediately comfortable. Others needed guidance, which the science team provided via a series of tutorials and summaries of relevant scientific papers.
“Anecdotally, we have a mixture of people who have careers in a technical field but they were not astronomers,” says Trouille. “We have one person who has a really deep understanding of statistical analysis, so he has been an integral part of the research team. There’s quite a good population of stay-at-home mothers who have always been interested in science and if they have an hour here or there they get online.”
The citizen scientists spent several months exploring the data and sharing their findings with the salaried scientists and each other on the project’s discussion boards. Then one of them stumbled across something interesting.
“One of my favorite examples is someone who didn’t have a background in the technical side,” says Trouille. “She posted one of our major plot results. The X axis is galaxy mass. The Y axis shows whether the galaxy has signatures of having collided with another galaxy. From her data you could see that there’s a very weird statistical difference between our post-quenched galaxy samples, especially at the higher mass end. Those galaxies have a much higher likelihood of having undergone a major merger. It’s a clear suggestion that major mergers are one mechanism for shutting off star formation.” The team is starting work on a research paper documenting their results.
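A plot like the one Trouille describes boils down to the fraction of galaxies showing merger signatures in each mass bin, computed separately for the quenched and control samples. The sketch below shows that calculation under assumed inputs. The function name, the bin width, and the sample values are invented for illustration and are not the project's data or code.

```python
import math
from collections import defaultdict

def merger_fraction_by_mass(galaxies, bin_width=0.5):
    """Fraction of galaxies with a merger signature in each mass bin.

    galaxies: iterable of (log_stellar_mass, has_merger_signature) pairs.
    Returns {bin_lower_edge: fraction}, ordered by increasing mass.
    """
    counts = defaultdict(lambda: [0, 0])  # bin edge -> [mergers, total]
    for log_mass, merged in galaxies:
        bin_start = math.floor(log_mass / bin_width) * bin_width
        counts[bin_start][0] += int(merged)
        counts[bin_start][1] += 1
    return {b: m / n for b, (m, n) in sorted(counts.items())}

# Invented values, chosen only to show the shape of the comparison.
quenched = [(10.2, False), (10.4, True), (11.1, True), (11.3, True)]
control = [(10.1, False), (10.3, False), (11.2, False), (11.4, True)]
print(merger_fraction_by_mass(quenched))
print(merger_fraction_by_mass(control))
```

Comparing the two dictionaries bin by bin is what reveals a difference between samples, although a real analysis would also need error bars before drawing conclusions.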
One intriguing issue for future Zooniverse projects is how to best combine automated techniques like machine learning with human input, a field known as social computing. In fact, at least one Zooniverse project, Galaxy Zoo Supernova, has already automated itself out of existence.
“They would survey the sky every night with a telescope at Palomar and they’d get about 50,000 candidates, things which could be supernovae,” says Smith. “They’d cull about 49,000 of those and were left with around a few hundred to a thousand every night. It was a mixture of supernovae, variable stars, asteroids. All of these things look like supernovae unless your algorithms are incredibly good, and they often rely on a lot of additional information. That project after about a year and a half put itself out of business, because the team used the dataset from the project to train their algorithms to behave like the Zoo project.”
For some problems, however, machine and human brainpower can be combined to get the best results. The Planet Hunters project, for example, uses data from the Kepler satellite to find new planets. When a planet passes in front of a star, there’s a dip in the star’s brightness. There are accurate computer algorithms which can find elliptical planetary systems, but citizen scientists have found more exotic objects like Planet Hunters 1 (PH1), which the computer algorithms didn’t recognize. “What’s definitely interesting for the future is the human-computer interaction,” says Trouille. “You have your computer algorithms but then have human eyes look through the data to see what has been missed and teach the algorithms what the humans have learnt.”
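The idea of a brightness dip can be made concrete with a toy example. The sketch below flags points in a light curve that fall a set fraction below the star's median brightness. It is a deliberately simple threshold test with an assumed function name and an assumed depth parameter, not the method used by the Kepler pipeline or by Planet Hunters.

```python
from statistics import median

def find_dips(flux, depth=0.01):
    """Return indices where brightness drops more than `depth`
    (as a fraction) below the median brightness of the series."""
    baseline = median(flux)
    threshold = baseline * (1 - depth)
    return [i for i, f in enumerate(flux) if f < threshold]

# Made-up, normalized brightness measurements with a 2% dip in the middle.
light_curve = [1.0, 1.001, 0.999, 0.98, 0.979, 1.0, 1.002]
print(find_dips(light_curve))
```

A fixed threshold like this finds clean, regular dips and misses anything unusual, which is the kind of gap the article says human volunteers can fill.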
Another strategy is to have an algorithm optimize the allocation of data to the human volunteers. A paper from Microsoft Research used the original Galaxy Zoo data but developed a dynamic model of the ability level and availability of the various volunteers to determine the optimal allocation of images to volunteers. The researchers estimated that the same classification fidelity could be achieved with only 25% of the classification events from the original project. Zooniverse plans to use the system, which is called CrowdSynth, in a future project.
But for now the focus of Galaxy Zoo Quench is to complete the last stage of its collaboration between citizen and salaried scientists by writing a scientific paper and having it accepted by a peer-reviewed journal. “As with any new experiment, you have no idea whether it’s going to work,” says Trouille, “whether people are interested, have the ability and motivation, so that’s been a very happy surprise, that there is clearly interest both on the science teams and very much on the public side to take part in the whole process of science.”