Crowdsourcing science

In panel discussion, researchers envision a "world wide lab" of engaged citizen scientists

By Caroline Perry

November 14, 2013


Traditional social science research tends to skew toward "WEIRD" subjects—that is, toward the Western, educated, industrialized, rich, and democratic—according to four Harvard researchers who are trying to expand the reach of modern data collection and analysis.

Pioneers in the field of crowdsourced, web-based research, they offered a vision of large-scale citizen science experiments in a November 14 panel discussion titled "Taking Research Out Into the Wild." Part of the Computer Science Colloquium series, the event was hosted by the Center for Research on Computation and Society (CRCS) at the Harvard School of Engineering and Applied Sciences (SEAS).

It is not unusual for academics to involve the public in research—think of the National Audubon Society’s Christmas Bird Count and the Harvard-led Personal Genome Project. With the advent of the web and social media, however, researchers can now learn from massive swaths of the world population, instantly, without ever bringing them into the lab. And the payoffs for both parties can be significant.

Yet web-based crowdsourcing “is not part of a typical researcher’s toolkit, and there are many reasons for it,” noted organizer and panelist Krzysztof Gajos, associate professor of computer science. “Some relate to the fears of effort and feasibility: ‘If I build it will they come? Do I have the skills for it? Will it require infrastructure to make it happen?’ Some of them have to do with reliability and the value of the data collected in such settings.”

Three projects discussed at the CRCS event could serve as proofs-of-concept to allay these types of fears, Gajos said: “They have overcome the barriers of feasibility and they have each demonstrated the unique value that such research can bring to the disciplines.”

Gajos, who leads the Intelligent Interactive Systems group at SEAS, studies how humans interact with computational systems and how future systems can algorithmically accommodate users’ varying abilities, limitations, and preferences. Gajos and Katharina Reinecke, a postdoctoral researcher in his group, created an experimental platform called Lab in the Wild that administers game-like tests to volunteers online.

One of the group’s recent findings is that, statistically speaking, people of different nationalities often have very different design preferences, something Reinecke inadvertently discovered years ago when she tried to design a website for agricultural advisers in Rwanda. “I had just finished my master’s, so I thought I knew everything,” she recalled. “But it didn’t appeal to them at all, what I proposed.”

Today, Lab in the Wild collects participants’ data on those types of cultural differences, mapping them to a host of demographic categories and potentially revealing best practices for universal web design.

Another test, which guesses the user’s age based on clicking speed, collects important data on motor skills and human development while also cultivating public curiosity about science. That test attracted more than a million participants in its first two months.

Computational crowdsourcing techniques are not limited to computer science; they can affect many disciplines, psychology for one.

“It’s something that in the field of human research is potentially very transformative,” said panelist Laura Germine, director, co-founder, and principal investigator of the TestMyBrain project, based in Harvard’s Department of Psychology. Germine earned her Ph.D. in experimental psychopathology from Harvard in 2012 and is now a postdoctoral researcher in the Psychiatric and Neurodevelopmental Genetics Unit at the Harvard-affiliated Massachusetts General Hospital.

Laura Germine describes the TestMyBrain project. (Photo by Eliza Grinnell, SEAS Communications.)

Germine initially launched a prototype of the online platform with Dartmouth’s Brad Duchaine, then a postdoctoral fellow at Harvard, as a means of studying a developmental disorder called prosopagnosia, or “face blindness.”

“If you can’t recognize your kids or your spouse, you don’t need to take a test to figure out that you have a face-recognition problem,” Germine said. “But sometimes just the confirmation is important to people.”

And it turned out the demand was high—really high. Within two weeks of launching that test, they had 10,000 participants.

“The thing is that most of these people were not there to find out if they had developmental prosopagnosia,” Germine explained. “They were there just because they thought it was fun to test themselves and find out how good their face recognition is. So then we realized: there’s an opportunity here.”

Germine later launched the TestMyBrain site at Harvard with Ken Nakayama, Edgar Pierce Professor of Psychology, in January 2008. The project has since expanded to include a wide range of other tests that investigate cognition, personality, temperament, and psychiatric vulnerability. Benefiting from public interest in psychology and self-discovery, TestMyBrain has collected data from over 850,000 participants since 2008. In essence, each participant is a collaborator in the research.

“By giving that person the tools so that they can assess themselves, we both gain that knowledge, and it fuels research in a really huge way,” said Germine.

Panelist Edith Law, a CRCS postdoctoral fellow at SEAS, is the founder of Curio, a website that brings teams of experts together with “the crowd” to collaborate on large-scale science projects that could not be performed from a single location—such as counting the number of bees on sunflowers across the country. Researchers can also enlist the public’s help with simple descriptive and analytical tasks.

As with any research involving human subjects or the sharing of personal data, institutional review boards play an important role in setting ground rules that protect the participants.

“There’s some data that’s just not shareable, period,” Law said.

Lab in the Wild, TestMyBrain, and Curio are among the very first web-based “citizen science” platforms, but already they offer a promising model for future large-scale scientific studies.

“These projects happened because of the willingness to take a risk, and vision on the part of these people, so the message here is that this is not impossible,” said Gajos. “This is within the reach of most people. It takes a bit of experimentation, and these people have enormous expertise that will make it much easier for others to get started. We hope that these types of methodologies will become more pervasive.”
