MapReduce-MPI Library

A grain of wisdom is worth an ounce of knowledge, which is worth a ton of data. -- Neil Larson
It is a capital mistake to theorize before one has data. -- Arthur Conan Doyle

This is the home page for the MapReduce-MPI (MR-MPI) library, which is an open-source implementation of MapReduce written for distributed-memory parallel machines on top of standard MPI message passing.

MapReduce is a programming paradigm, popularized by Google, that is widely used for processing large data sets in parallel. Its salient feature is that if a task can be formulated as a MapReduce, the user can perform it in parallel without writing any parallel code. Instead, the user writes serial functions (maps and reduces) that operate independently on portions of the data set. Data movement and the other necessary parallel operations are then performed in an application-independent fashion, in this case by the MR-MPI library.
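
To illustrate the idea, here is a minimal serial sketch of a word count written as a map, a grouping step, and a reduce. It is plain Python and does not use the MR-MPI API; the names `map_words`, `collate_pairs`, `reduce_count`, and `word_count` are hypothetical and chosen only for this example. In a real parallel run, the grouping step is where a library would move data between processors.

```python
from collections import defaultdict


def map_words(chunk):
    """Serial map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def collate_pairs(pairs):
    """Group values by key (hypothetical helper).

    In a parallel setting this is the data-movement step that a
    MapReduce library performs on the user's behalf.
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_count(key, values):
    """Serial reduce: combine all values that share one key."""
    return key, sum(values)


def word_count(chunks):
    """Run map, group, and reduce over a list of text chunks."""
    pairs = []
    for chunk in chunks:  # each chunk could be mapped on a different processor
        pairs.extend(map_words(chunk))
    grouped = collate_pairs(pairs)
    return dict(reduce_count(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    chunks = ["a ton of data", "an ounce of knowledge", "a grain of wisdom"]
    print(word_count(chunks))
```

Note that `map_words` and `reduce_count` see only their own chunk or their own key, which is what allows the same user code to run unchanged whether the chunks and keys live on one processor or many.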

The MR-MPI library was developed at Sandia National Laboratories, a US Department of Energy facility, for use on informatics problems. It includes C++ and C interfaces callable from most high-level languages, as well as a Python wrapper and our own OINK scripting wrapper, which can be used to develop MapReduce operations and chain them together. MR-MPI and OINK are open-source codes, distributed freely under the terms of the modified Berkeley Software Distribution (BSD) License. See this page for more details.

The authors of the library are Steve Plimpton and Karen Devine, who can be contacted at sjplimp at sandia.gov and kddevin at sandia.gov.

These are other MapReduce resources and software packages:

original Google MapReduce paper by J Dean and S Ghemawat
Hadoop
Disco - an implementation of MapReduce in Erlang
Meguro - a JavaScript MapReduce framework
Bashreduce - an implementation of MapReduce in the bash shell (no kidding)

Recent MR-MPI News

(3/11) Release of graph algorithms from our new paper as OINK commands.
(2/11) Release of OINK, a scripting wrapper on the MR-MPI library.
(3/10) Release of out-of-core version of the MR-MPI library.
(4/09) Initial open-source release of the MR-MPI library.