Alumni News Brief

Synthesizing San Francisco

Casser turns still images into three-dimensional reconstruction of neighborhood

Vincent Casser, M.E. '19

Autonomous vehicle companies typically rely on a combination of real-world testing and simulation to advance their systems. To bridge the gap between the two, significant effort goes into creating detailed virtual 3D counterparts of the areas in which autonomous vehicles operate. One way to create such reconstructions is to process still imagery captured by cameras into what’s known as a neural radiance field (NeRF) – essentially a three-dimensional image that can be viewed from any angle, including ones missing from the original collection of images.
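In rough terms, a NeRF is a learned function mapping a 3D position and a viewing direction to a color and a volume density, and an image is produced by compositing samples along each camera ray. The sketch below illustrates that idea with a hypothetical hand-written radiance field (a fuzzy sphere) standing in for the trained neural network; all names here are illustrative and not drawn from Waymo's code.

```python
import math

def toy_radiance_field(position, view_direction):
    """Toy stand-in for a trained NeRF network: maps a 3D point and a
    viewing direction to an RGB color and a volume density (sigma).
    A real NeRF learns this mapping from photographs."""
    x, y, z = position
    # Hypothetical density: a fuzzy sphere of radius 1 centered at the origin.
    distance = math.sqrt(x * x + y * y + z * z)
    sigma = max(0.0, 1.0 - distance)
    # Hypothetical view-dependent color, just to show the direction input.
    color = (abs(view_direction[0]), abs(view_direction[1]), abs(view_direction[2]))
    return color, sigma

def render_ray(origin, direction, num_samples=64, near=0.0, far=4.0):
    """NeRF-style volume rendering: sample points along a camera ray and
    alpha-composite their colors, weighted by accumulated transmittance."""
    step = (far - near) / num_samples
    transmittance = 1.0
    pixel = [0.0, 0.0, 0.0]
    for i in range(num_samples):
        t = near + (i + 0.5) * step
        point = tuple(o + t * d for o, d in zip(origin, direction))
        color, sigma = toy_radiance_field(point, direction)
        alpha = 1.0 - math.exp(-sigma * step)  # opacity of this segment
        weight = transmittance * alpha
        for c in range(3):
            pixel[c] += weight * color[c]
        transmittance *= 1.0 - alpha  # light remaining after this segment
    return tuple(pixel)
```

A ray aimed through the toy sphere accumulates color, while one that misses it returns black, which is the basic mechanism that lets a NeRF be rendered from viewpoints absent from the captured images.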

NeRFs work well for rendering objects at a small scale, but until recently couldn’t easily be scaled up to reconstruct a neighborhood, let alone a whole city. Rendering at that scale exceeded the capacity of existing approaches, demanding too many images, too much data, and too much computing power.

A recent graduate of the Harvard John A. Paulson School of Engineering and Applied Sciences (SEAS), Vincent Casser, M.E. '19, has developed a patented new method for rendering NeRFs at a scale large enough to support autonomous vehicle simulation. Together with his teammates at Waymo, he used 2.8 million images to create a three-dimensional representation of the Alamo Square neighborhood in San Francisco, and showed how his “Block-NeRF” method could be scaled up even further to render the entire city.

Casser is a senior research scientist at Waymo, an autonomous vehicles subsidiary of Alphabet, Google’s parent company.

“Autonomous vehicle companies have a natural limit in real-world testing,” Casser said. “Even if you’ve logged millions or even billions of miles of real-world testing, it’s important to have simulation capabilities as well so you can test the limits in every way you may want, and do it cheaply and at scale. There’s still a disconnect between simulation and real-world appearance and behavior, so generally the closer you can get in terms of quality and scalability, the better.”

Casser’s research enables scalability by breaking large-scale scenes down into much smaller units called blocks, each covering an area roughly the size of one street intersection. Rendering each block as a separate NeRF reduces the overall rendering time and increases the potential size of the reconstruction. Each block can also be replaced or updated independently to reflect changes in the environment, such as construction work, or different environmental conditions, such as darkness or rain.
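The partitioning step can be pictured as a grid laid over the map: each captured camera pose is assigned to the block that contains it, and at render time only the few blocks near the query location are consulted. The sketch below is an illustrative simplification with made-up names and a made-up block size, not Waymo's implementation.

```python
import math

BLOCK_SIZE = 75.0  # meters; roughly one street intersection per block (illustrative)

def block_key(x, y):
    """Map a ground-plane position to the grid cell of the block that owns it."""
    return (int(math.floor(x / BLOCK_SIZE)), int(math.floor(y / BLOCK_SIZE)))

def assign_images_to_blocks(image_poses):
    """Group captured camera poses by block, so each block's NeRF is trained
    only on nearby imagery. `image_poses` is a list of (x, y) positions."""
    blocks = {}
    for pose in image_poses:
        blocks.setdefault(block_key(*pose), []).append(pose)
    return blocks

def blocks_for_query(x, y, radius=BLOCK_SIZE):
    """Return the grid keys of blocks within `radius` of the query point;
    at render time, only these blocks' NeRFs would be evaluated and blended."""
    keys = set()
    for dx in (-radius, 0.0, radius):
        for dy in (-radius, 0.0, radius):
            keys.add(block_key(x + dx, y + dy))
    return keys
```

Because each block is an independent model, re-mapping a street after construction means retraining only the affected blocks rather than the whole city, which is the manipulability the paragraph above describes.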

“It started off a little crude, and there were a lot of challenges to overcome,” Casser said. “We realized the potential when we really put the entire neighborhood together and had the ability to render it out and have what seemed like infinite stretches of street to traverse.”

Block-NeRF rendering of a city street

An example of a Block-NeRF rendering of a city street. (U.S. patent application 63/285,980)

Collecting enough images to render a city remains a major challenge for Block-NeRF: it took 2.8 million images just to reconstruct Alamo Square. But the cameras don’t need to be particularly advanced, as related neural rendering work has shown by reconstructing scenes from consumer-grade cameras and even cell phone imagery.

“Everyone has a camera on their phone,” Casser said. “It’s pretty easy to strap a camera onto a vehicle and capture images in this way. This is different from traditional mapping, where you use a laser sensor that can be more expensive and requires specialized expertise to maintain and calibrate.”

Waymo started its public, fully autonomous service in Phoenix in 2020, and has since expanded to San Francisco, where the service is available to select members of the public. The Block-NeRF method could help Waymo bring its autonomous driving technology to more cities more quickly.

“You drive around capturing images, use them to build an entire 3D environment of the city, and then you can simulate certain aspects before ever driving there with an autonomous vehicle,” he said. “You can identify some problems and work out solutions before you’ve had a single car actually engaged in the city.”

Casser studied computational science and engineering at the Institute for Applied Computational Science at SEAS. He joined Google as a robotics research intern during his program, then went to work full-time for Waymo after graduation.

“Autonomous vehicles seemed so impactful in so many different ways, including helping people with disabilities, safer transportation, improved mobility, sustainability and the environment, and urban planning,” Casser said. “I found that potential really motivating, and it also makes me really happy to see the product. When you work in robotics, you have these physical embodiments of your work. I use our Waymo One service regularly when I’m in San Francisco, and it really motivates me to see how our system just keeps improving.”

Press Contact

Matt Goisman | mgoisman@g.harvard.edu