Generate-Test-and-Aggregate is a class of algorithms from which efficient MapReduce programs can be derived automatically.
MapReduce is a useful and popular programming model for large-scale parallel processing. However, for many complex problems, it is usually not easy to develop efficient parallel algorithms that fit the MapReduce paradigm well.
The generator-based parallelization approach has been developed to simplify parallel programming by generating and optimizing parallel programs automatically. Efficient parallel algorithms can be generated from users' naive but correct programs by means of generators that exploit optimization theorems from the field of skeletal parallel programming. The resulting efficient parallel algorithms are in a form that is well suited to implementation with MapReduce.
With this approach, a large class of generate-and-test-like computations can be programmed and executed efficiently over MapReduce. A novel programming interface and framework can thus be built on top of MapReduce, which helps resolve the difficulties with programmability and efficiency. In this paper we introduce a framework that provides such a programming interface for MapReduce. With this framework, users can concentrate on writing naive but correct programs. We show that many generate-and-test-like computations can be implemented easily and efficiently with this framework over MapReduce.
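To make the idea concrete, the following is a minimal sketch in Python, not the framework's actual interface. It uses the 0/1 knapsack problem as an assumed example of a generate-and-test-like computation, with items given as (weight, value) pairs of non-negative integers. The function names naive_knapsack, item_table, combine and fused_knapsack are hypothetical and chosen only for illustration. The first function is the naive but correct program; the remaining ones show the kind of map-and-associative-reduce form that such a program can be turned into.

```python
from functools import reduce
from itertools import combinations


def naive_knapsack(items, capacity):
    """Naive program: generate every sub-bag, test the weight limit,
    aggregate by taking the maximum total value. Exponential in len(items)."""
    best = 0
    for r in range(len(items) + 1):
        for bag in combinations(items, r):          # generate
            if sum(w for w, _ in bag) <= capacity:  # test
                best = max(best, sum(v for _, v in bag))  # aggregate
    return best


def item_table(item, capacity):
    """Map step: for a single item, a table {total weight: best value}
    covering the two possible bags (empty bag, or the item alone)."""
    w, v = item
    table = {0: 0}
    if w <= capacity:
        table[w] = max(table.get(w, 0), v)
    return table


def combine(left, right, capacity):
    """Reduce step: merge two tables. Entries above the capacity are
    dropped early, which is safe because weights are non-negative.
    The operator is associative, so partial results can be merged in
    any grouping, as a MapReduce runtime would do."""
    out = {}
    for w1, v1 in left.items():
        for w2, v2 in right.items():
            w = w1 + w2
            if w <= capacity:
                out[w] = max(out.get(w, 0), v1 + v2)
    return out


def fused_knapsack(items, capacity):
    """Efficient form: map every item to a small table, then reduce.
    The table {0: 0} (only the empty bag) is the identity of combine."""
    tables = [item_table(item, capacity) for item in items]
    total = reduce(lambda a, b: combine(a, b, capacity), tables, {0: 0})
    return max(total.values())
```

For the items [(1, 10), (2, 15), (3, 40)] and capacity 4, both naive_knapsack and fused_knapsack return 50. The naive version enumerates all 2^n sub-bags, whereas each table in the fused version has at most capacity + 1 entries, so the work per merge is bounded independently of the number of items.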