IACS Seminar: "Predicting Public Transit Delays: Designing a Data-Driven App for Caltrain"

12 Feb
IACS Seminar Series
Jeff Yau & Harrison Mebane, Silicon Valley Data Science
Friday, February 12, 2016 - 12:30pm to 2:00pm
Maxwell Dworkin G115, 33 Oxford Street, Cambridge MA 02138

Lunch 12:30pm; Talk 1pm

Abstract: In this talk, we will discuss the architecture and analytics of a complete mobile app designed to track and predict train delays in real time. The data platform ingests and processes data from Twitter, the Caltrain web API, and GPS, using Spark, Kafka, and HBase, among other tools. Development is done in Docker containers, which allow for testing without the need for a separate cluster. The analytics include image processing, sentiment analysis, and various statistical and machine learning techniques used in an attempt to predict train delays.

Institute for Applied Computational Science (IACS)
Natasha Baker