You are here: Home News & Events All SEAS Calendars CSEE Theory of Computation Seminars A Model of Computation for MapReduce, Sergei Vassilvitskii, Yahoo Research
Navigation
Search
Advanced Search…

A Model of Computation for MapReduce, Sergei Vassilvitskii, Yahoo Research

In recent years the MapReduce framework has emerged as one of the most widely used parallel computing platforms for processing data on the terabyte and petabyte scales. Used daily at companies such Yahoo!, Google, Amazon, and Facebook and adopted more recently by several universities, it allows for easy parallelization of data intensive computations over many machines. One key feature of MapReduce that differentiates it from previous models of parallel computation is that it interleaves sequential and parallel computation. We propose a model of efficient computation using the MapReduce paradigm. Our model allows each machine to perform sequential computations in time polynomial in the size of the input that machine receives. Moreover, since MapReduce is designed for computations over massive data sets, our model limits the number of machines and the memory per machine to be (substantially) sublinear in the size of the input. We compare MapReduce to the PRAM model of computation. We prove a simulation lemma showing that a large class of PRAM algorithms can be efficiently simulated via MapReduce. The strength of MapReduce, however, lies in the fact that it uses both sequential and parallel computation. We show how algorithms can take advantage of this fact to compute an MST of a dense graph in only two rounds, as opposed to $O(\log(n))$ rounds needed by a standard PRAM. We also show to evaluate a wide class of functions using the MapReduce framework. We conclude by applying this result to show how to compute many basic algorithmic problems such as undirected connectivity in the MapReduce framework. Upcoming TOC seminars and related talks: --------------------------------------------------- Mon 11/23 4:15pm: Eddie Farhi physics colloquium 12/2: Emanuele Viola, Northeastern University

What
    When Nov 18, 2009
    from 12:00 pm to 01:00 pm
    Where Maxwell Dworkin 319
    Add event to calendar vCal
    iCal