Apache Crunch
Apache Crunch is a Java library, based on Google's FlumeJava, for
writing, testing, and running Hadoop MapReduce pipelines. Its goal
is to make pipelines composed of many user-defined functions
simple to write, easy to test, and efficient to run.