The Communicating Sequential Elements Pattern
of http://jerry.cs.uiuc.edu/~plop/plop2k/proceedings/Ortega-Arjona/Ortega-Arjona.pdf. It's a snapshot of the page taken as our search engine crawled the Web.
The web site itself may have changed. You can check the current page or check for previous versions at the Internet Archive.
Yahoo! is not affiliated with the authors of this page or responsible for its content.
The Communicating Sequential Elements Pattern
The Communicating Sequential Elements Pattern
A Domain Parallelism Architectural Pattern for Parallel Programming
Jorge L. OrtegaArjona
Department of Computer Science
University College London
Gower Street, London WC1E 6BT, U.K.
jortegaarjona@acm.org
Abstract
The Communicating Sequential Elements pattern is an architectural pattern for parallel programming,
used when a design problem can be understood in terms of domain parallelism. This pattern proposes a
solution in which the same operations are performed simultaneously on different pieces of regular data,
and operations in each component depend on partial results of neighbour components. Usually, this
pattern is presented as a network or logical structure, conceived from the regular data.
1. Introduction
Parallel processing is the division of a problem, presented as a data structure or a set of actions, among
multiple processing components that operate simultaneously. The expected result is a more efficient
completion of the solution to the problem. The main advantage of parallel processing is its ability to
handle tasks of a scale that would be unrealistic or not costeffective for other systems [CG88, Fos94,
ST96, Pan96]. The power of parallelism centres on partitioning a big problem in order to deal with
complexity. Partitioning is necessary to divide such a big problem into smaller subproblems that are
more easily understood, and may be worked on separately, on a more "comfortable" level. Partitioning
is especially important for parallel processing, because it enables software components to be not only
created separately but also executed simultaneously.
Requirements of order of data and operations dictate the way in which a parallel computation has to be
performed, and therefore, impact on the software design [OR98]. Depending on how the order of data
and operations are present in the problem description, it is possible to consider that most parallel
applications fall into one of three forms of parallelism: functional parallelism, domain parallelism, and
activity parallelism [OR98]. Examples of each form of parallelism are pipeline processing [VTB95],
representing functional parallelism; communicating sequential elements, as an example of domain
parallelism; and masterslave [POSA96], which is an instance of activity parallelism.
2. The Communicating Sequential Elements Pattern
The Communicating Sequential Elements pattern is a domain parallelism pattern in which each
component performs the same operations on different pieces of regular data. Operations in each
component depend on partial results in neighbour components. Usually, this pattern is conceived as a
logical structure, reflecting the particular order present in the problem [OR98].
Copyright
©
2000 Jorge Luis OrtegaArjona. Permission is granted to copy for the PLoP 2000 conference.
All other rights reserved.
Domain parallelism is the form of parallelism that involves problems in which a set of operations is to
be performed on a data structure that exhibits a specific order. Each component in the solution executes
a semiautonomous computation influenced by the computations on its neighbours. The amount of
communication between all components can be variable, following fixed and predictable paths that can
be represented as a network. In this form of parallelism, it is difficult to conceive the computation as a
flow of data among processing stages or sequential steps in an algorithm [CG88, Fos94, OR98,
Pan96].
Example: the Heat Equation
Heat is a level of energy present in any physical body, perceptible by its particular temperature.
However, even though we can measure an average temperature, in general heat is not evenly
distributed throughout all the body. Observing more carefully, it is noticeable that in different parts of
the body it is possible to find different temperatures, and hence, different levels of heat. Moreover,
these different temperatures vary through time, tending to increase or decrease depending on the
interchange of heat between parts of the body. Thus, in a body, different parts expose a different
temperature, determining a particular distribution at different times.
In physical and engineering areas, this distribution of heat is particularly important to determine
particular thermal properties of materials. The main objective is to obtain a proper representation of the
overall effect that allows scientists and engineers to analyse such thermal properties in a more time
efficient way. However, the difficulty of this problem resides in the time to operate on a large number
of data pieces and the number of operations per data piece.
For example, let us consider the simplest case, in which the Heat Equation is used to model the heat
distribution on a onedimensional body, a thin substrate, such as a wire, divided in n segments
representing different temperatures (Figure 1).
Figure 1.
An example of a wire divided in n segments with different temperatures.
The heat diffusion is modelled using a function representing temperature variations depending on time
and position in the body. This function is obtained as the solution of a differential equation, known as
the Heat Equation [GBDJMS94]. For our example, a function A(t,x) represents the heat diffusion
through the wire. A simple method developed for deriving a numerical solution to the Heat Equation is
the method of finite differences. The finite differences method cuts the length of the wire into equal
parts of length
x, and divides the time in discrete segments of length
t. Approximating the
continuous Heat Equation by its values at the end points of the subsegments at the discrete time
points 0,
t, 2
t
,..., the discrete form for obtaining the heat distribution at the following time step is:
))
1
,
(
)
,
(
2
)
1
,
(
(
)
,
(
)
,
1
(
2
+
+
+
=
+
j
i
A
j
i
A
j
i
A
x
t
j
i
A
j
i
A
where i represents time steps, and j represents the position of segments in the wire.
Wire
Segment 1
(Temperature 1)
Segment 2
(Temperature 2)
Segment n
(Temperature n)
The initial and boundary conditions needed to solve the difference equation numerically are:
1
0
)
sin(
)
,
0
(
1
)
1
,
(
,
0
)
0
,
(
=
=
=
x
for
x
x
A
t
t
A
t
A
The numerical solution is now computed simply by calculating the value for each segment j at a given
time step i, considering the temperature from both its previous and its next segments. The total time
required to execute this numerical solution sequentially depends directly on the number of segments
and the number of time steps needed to describe the heat distribution through time. The larger number
of segments and number of time steps, the longer it takes to compute the solution. A sequential
approach that obtains a single temperature value for each segment at each time step is not the most
timeefficient way to compute the heat diffusion. However, we can potentially carry out this
computation more efficiently by
1. Using a group of parallel components that exploit a onedimensional logical structure representing
the wire, and
2.
Calculating simultaneously at a given time step the value of A(i+1, j) for all segments.
Context
Start the design of a software program for a parallel system, using a particular programming
language for certain parallel hardware. Consider the following context constraints:
The problem involves tasks of a scale that would be unrealistic or not costeffective for other
systems to handle and lends itself to be solved using parallelism. Consider the Heat Equation
example: suppose that it is needed to obtain the temperature values for a wire divided into 1,000
segments, considering time steps of 5 milliseconds, during a time frame of 10 seconds. The total
number of operations required is 2,000,000.
The hardware platform or machine to be used is given, offering a reasonably good fit to the
parallelism found in the problem.
The main objective is to execute the tasks in the most timeefficient way.
Problem
A parallel computation is required that can be performed as a set of operations on regular data.
Results cannot be constrained to a oneway flow among processing stages, but each component
executes its operations influenced by data values from its neighbouring components. Because of this,
components are expected to intermittently exchange data. Communications between components
follow fixed and predictable paths.
For instance, consider the Heat Equation example. A onedimensional body, a wire, can be
represented as a data structure, in which the temperature of an segment influences the temperature on
adjacent segments and to a different extent, those on either side. Over time, the effects propagate to
other segments extending in both directions; even the source segment may experience reverberations
due to temperature changes from neigh