Recall from Section 3.7.2 that a hypercube connects
each of P
tasks ( P
a power of 2) to other tasks
(Figure 3.16). The template considered in this
chapter uses this communication structure in an SPMD fashion, with
each task executing Algorithm 11.1. A local state
variable is first set to be the supplied input data. Computation then
proceeds in
steps. In each step, each task first exchanges
its local state with one of its neighbors in the hypercube and
then combines the message received from the neighbor with
state to generate a new state. The output of the computation
is the state generated in the final step.
In Algorithm 11.1, the XOR function denotes an
exclusive or operation and is used to identify neighbors. (Exclusive
or is defined as follows: 0 XOR 0=0, 0 XOR 1=1, 1 XOR
0=1, 1 XOR 1=0.) As noted in Section 3.7.2, the
hypercube has the property that the binary labels of two nodes that
are neighbors in the
d
th dimension differ only in the d
th place; hence, the
expression myid XOR yields the i
th neighbor of node
myid.
A particular parallel algorithm is defined by the operator OP used to combine state and message at each step in the template. In the following, we shall show how this template can be used as a basis for parallel vector reduction, matrix transposition, and sorting algorithms.
© Copyright 1995 by Ian Foster