MR-MPI WWW Site -MR-MPI Documentation - OINK
Documentation - OINK Commands

### cc_find command

**Syntax:**

cc_find nthresh -i in1 -o out1.file out1.mr

- nthresh = threshhold size at which components are split across processors
- in1 = graph edges: Key = Vi Vj, Value = NULL
- out1 = components: Key = Vi, Value = Ci

**Examples:**

cc_find 1000 -i mre -o cc.list cc

**Description:**

This is a named command which finds all the connected components (CCs)
in an undirected graph. A connected component is a set of vertices
where each is connected to one or more other vertices in the set via
an edge. The CCs are found via the MapReduce algorithm of
(Cohen) discussed in his paper with extensions described in
the paper of (Plimpton) which attempt to load-balance the
calculation across processors when one or a few very large components
exist in the graph.

See the named command doc page for various ways in which
the -i inputs and -o outputs for a named command can be specified.

In1 stores a set of edges, assumed to have no duplicates or
self-edges. This means that either (Vi,Vj) or (Vj,Vi) appears, but
not both. Also (Vi,Vi) does not appear. The input is unchanged by
this command.

Out1 will store the assignment of each vertex in the graph to a
component ID. The component ID is the smallest vertex ID of any
vertex in the component.

**Related commands:**

cc_stats

**(Cohen)** Cohen, "Graph Twiddling in a MapReduce World", Computing in
Science and Engineering, 11, 29-41 (2009).

**(Plimpton)** Plimpton and Devine, "MapReduce in MPI for Large-Scale
Graph Algorithms", to appear in Parallel Computing (2011).