A group of Harvard students got some help with their homework from an unlikely source: the world's fifth most powerful supercomputer.

The students were enrolled in “Extreme Computing: Project-based High Performance Distributed and Parallel Systems” (AC 290r), a new course at the Harvard John A. Paulson School of Engineering and Applied Sciences. They harnessed one of the world’s most powerful computers and worked alongside experts from industry and the National Labs to tackle real-world problems using techniques from the field of “extreme computing.”

A relatively new field, “extreme computing” applies predictive computer simulations on almost unimaginably large scales to find solutions for the world’s most complex problems.

Data may be just what the doctor ordered

“Biology and health care are rapidly becoming data sciences,” said Sean Davis, staff scientist at the National Cancer Institute (NCI).

Davis and colleague Eric Stahlberg, director of high performance computer strategy at NCI, co-taught a module on extreme computing’s role in bioinformatics. With examples from the field of cancer genomics, the pair helped students understand the importance of data science in biomedical research.

This image shows a cluster of slow-cycling breast cancer cells (red) within a human ER+ primary breast tumor (cell nuclei in blue; rapidly cycling cancer cells in green). Extreme computing continues to play a larger role in the field of cancer genomics. Photo courtesy of the National Cancer Institute/Dana-Farber Harvard Cancer Center at Massachusetts General Hospital.

The Human Genome Project, an international research effort that sought to determine the sequence of chemical base pairs that make up human DNA, began in 1990 and took 13 years and more than $1.5 billion to complete. Today, thanks to rapid increases in computing power, the same sequencing takes days and costs about $1,000, Davis said.

This enabling technology has sparked a revolution in biomedical research. Today, researchers are sequencing thousands of patients and generating enormous quantities of data. But those experimental results are useless unless scientists can translate raw data into understandable health information.

The Harvard students used extreme computing to investigate the relationship between DNA mutations in 60 different cancer cell lines and associated drug response data. They first had to define the candidate DNA mutations in the cancer cell lines using parallel computing strategies. Then, they applied machine learning models to determine the DNA mutations that might serve as predictors of drug response. With tens of millions of sequences per cell line, that analysis required high-performance and parallel computing strategies.
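The two steps described above can be illustrated with a toy sketch. This is not the students' pipeline: the sequences, cell-line names, and drug-response values are invented, the "mutation calling" is a plain position-by-position comparison against a reference, and a simple Pearson correlation stands in for the machine learning models. A thread pool is used here only to show the pattern of processing cell lines independently; a process pool or a cluster scheduler would be the usual choice for real multi-core work.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean, pstdev

# Hypothetical toy data: one short "genome" per cell line plus a drug response score.
REFERENCE = "ACGTACGT"
SAMPLES = {
    "line_A": "ACGTACGT",
    "line_B": "ACTTACGT",
    "line_C": "ACTTACGA",
    "line_D": "ACGTACGA",
}
RESPONSE = {"line_A": 0.9, "line_B": 0.2, "line_C": 0.3, "line_D": 0.8}


def call_mutations(item):
    """Return (name, set of (position, base)) where the sample differs from the reference."""
    name, sample, reference = item
    diffs = {(i, b) for i, (a, b) in enumerate(zip(reference, sample)) if a != b}
    return name, diffs


def call_all(samples, reference, workers=4):
    """Step 1: find candidate mutations, one independent task per cell line."""
    items = [(name, seq, reference) for name, seq in samples.items()]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(call_mutations, items))


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (sx * sy)


def rank_mutations(mutations, response):
    """Step 2: rank mutations by how strongly their presence tracks drug response."""
    names = sorted(mutations)
    ys = [response[n] for n in names]
    candidates = set().union(*mutations.values())
    scores = {
        m: pearson([1.0 if m in mutations[n] else 0.0 for n in names], ys)
        for m in candidates
    }
    return sorted(scores.items(), key=lambda kv: (-abs(kv[1]), kv[0]))


mutations = call_all(SAMPLES, REFERENCE)
ranked = rank_mutations(mutations, RESPONSE)
for mutation, score in ranked:
    print(mutation, round(score, 3))
```

In this toy data set, the mutation at position 2 appears only in the two cell lines with low response scores, so it ranks first. With tens of millions of sequences per cell line, each step would have to be spread across many processors, which is where the parallel computing strategies come in.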

The analysis and resulting predictive models the students built took about 1,000 compute hours. Future studies will require data-intensive extreme computing and involve thousands or even millions of samples. These kinds of simulations could enable physicians to apply predictive models to patients, providing more personalized medicine.

“Our ability to produce data is growing faster than our ability to analyze it,” Davis said. “And the analysis requires not just the computer, but the people who understand how to design systems to turn the data into information.”

Akhil Ketkar, S.M. ’16, who is pursuing a degree in computational science and engineering, was surprised by the large role of data science in the medical field.

“I learned that biological systems are so incredibly complex,” he said. “I was amazed by the levels of complexity involved in cell reproduction and got a good grasp of why we need all this data to do successful research.”

Building a better battery

What does a battery have in common with DNA? Complexity, for one. Batteries operate by creating a flow of charge via ions traveling through highly heterogeneous materials. Effectively modeling the function of so many tiny particles, including essential physics models, requires massive computing power.

Chris Knight, assistant computational scientist at Argonne National Laboratory, coached SEAS students through a course module that enabled them to use Mira, the fifth most powerful supercomputer in the world, to study battery function.

Mira, the fifth most powerful supercomputer in the world, enables teams of researchers to answer complex questions in months, rather than decades. (Photo courtesy of Argonne National Laboratory.)

Understanding how ions, such as manganese and lithium, diffuse through battery electrolyte, electrodes, and solid electrolyte interfaces is critical, and so is understanding how they behave along the way. Finding ways to improve battery life is complicated, and understanding how ions move back and forth along complex pathways within a battery is one piece of a grand puzzle, Knight said.

Mira, which has a peak speed of 10 million billion operations per second and 786,000 gigabytes of memory, enables researchers to find solutions to extremely complex problems in months, rather than decades, explained Paul Messina, senior strategic advisor at the Argonne Leadership Computing Facility.
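To put those figures in more familiar units, here is a back-of-the-envelope calculation. The peak speed and memory come from the paragraph above; the workload size and the 100-times-slower comparison machine are hypothetical values chosen only for illustration, and real applications rarely sustain a machine's peak speed.

```python
# Figures quoted for Mira in the article.
PEAK_OPS_PER_SEC = 10e6 * 1e9   # "10 million billion" operations per second
MEMORY_GB = 786_000

SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365.25


def days_to_finish(total_ops, ops_per_sec):
    """Days needed to run total_ops operations at a constant rate."""
    return total_ops / ops_per_sec / SECONDS_PER_DAY


peak_petaflops = PEAK_OPS_PER_SEC / 1e15   # 1 petaflop = 1e15 operations per second
memory_tb = MEMORY_GB / 1000               # decimal terabytes

# Hypothetical workload and comparison machine (assumptions, not from the article).
workload_ops = 1e23
slower_ops_per_sec = PEAK_OPS_PER_SEC / 100

days_at_peak = days_to_finish(workload_ops, PEAK_OPS_PER_SEC)
years_on_slower = days_to_finish(workload_ops, slower_ops_per_sec) / DAYS_PER_YEAR

print(f"Peak speed: {peak_petaflops:.0f} petaflops, memory: {memory_tb:.0f} TB")
print(f"At peak: about {days_at_peak:.0f} days")
print(f"On a machine 100 times slower: about {years_on_slower:.0f} years")
```

Under these assumptions, a job that would occupy the slower machine for roughly three decades finishes on Mira in about four months.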

“Movies give the impression that, in a few hours, one or two people can use a supercomputer to solve anything,” he said. “In reality, carrying out computational science and engineering takes teams of highly trained, multidisciplinary scientists working for long periods on very expensive machines.”

The students used Mira to experiment with different electrode protective layers that could decrease manganese diffusion without inhibiting lithium diffusion. They ran simulations to determine factors that affect the dissolution of manganese and how both ions interact with the battery environment.

“I hope the students learned that the concept of designing better materials is not so cut and dry. It requires a lot of thinking, intuition, and large-scale computing resources,” Knight said.

Vinay Subbiah, S.M. ’16, who is pursuing a degree in computational science, said the biggest challenge of the battery-modeling lab was getting up to speed on the different simulation techniques.

“But the fact that we had the freedom to design our own experiments and test our hypotheses instantly, due to the computing power at hand, made it quite enjoyable, as we could iterate very quickly,” he said.

Preparing students for the future of computing

Extreme computing is poised to become even more critical in the future, said Sadasivan Shankar, Margaret and Will Hearst Visiting Lecturer in Computational Science and Engineering.

Classmates Christian Juuge and Isadora Nun work through a complex problem using extreme computing techniques. (Photo by Adam Zewe/SEAS Communications.)

Shankar taught the course in collaboration with Pavlos Protopapas, scientific program director for the Institute for Applied Computational Science, and Efthimios Kaxiras, John Hasbrouck Van Vleck Professor of Pure and Applied Physics. The course also included a module on extreme computing applications in a social science setting, focusing on customer reviews and influencing.

The goal of the course is not only to teach students to use the hardware and middleware necessary for extreme computing, but also to showcase the power of computing and its relevance in nearly every aspect of modern life.

“The students received unique opportunities to see that computing can address real problems in society, health, and energy,” Shankar said. “Our hope is that some of the students in this class will be excited by the possibilities and will move on to these areas to pursue in their careers.”