Lunch 12:30pm; Talk 1pm
Abstract: In this talk, we will discuss the architecture and analytics of a complete mobile app designed to track and predict train delays in real time. The data platform ingests and processes data from Twitter, the Caltrain web API, and GPS, using Spark, Kafka, and HBase, among other tools. Development is done in Docker containers, which allow for testing without the need for a separate cluster. The analytics include image processing, sentiment analysis, and various statistical and machine learning techniques used in an attempt to predict train delays.