From Documentation
Jump to: navigation, search


Overview

There are no known parallel models to date designed to study impacts of climate change on vegetation patterns using mechanistic models that couple stochastic, individual-level interactions to community-level and regional-level dynamics. High-performance computing (HPC) enabled development of whole-earth climate-change models. Developing such large-scale models may transform ecological modelling of vegetation patterns in an era of climate change. Outcomes will benefit fundamental ecological research as well as applied land management.

Results

The main goal of this project was to scale up the Anand labs' individual-based forest simulation code for large-scale modelling by implementing it in C and parallelizing it if required. During the project

  1. a general C push/pull based framework was created for implementing forest models,
  2. the FORSITE (the mathematical model) simulation code was re-implemented using the framework,
  3. a speedup between 2–15X was realized for fixed sized regions of interaction, and
  4. a speedup between 15–100X was realized for variable sized regions (not possible with original).

The creation of a framework is of particular note as it has application beyond the FORSITE model. An open issue is that a higher level interface is required as the C-level interface is too low for most ecology researchers.

Notation

  • The suffix s is added multiple times to indicate nestings of plurals (e.g., individualss would be a collection of a collection of individuals).
  • the prefix r is used to indicate the reduced versions of data (e.g., rindividualss would be a collection of a collection of individual reduction data)

System

The system consists of space, world, variety, and individual data:

space
the size of the world and what boundaries are periodic,
world
global user defined parameters (e.g., current simulation year, etc.),
variety
per variety user defined parameters (e.g., growth rate, etc.), and
individual
per individual user defined parameters (e.g., height, etc.).

The world, variety, and individual data is conceptually organized into a three level tree (i.e., one world, an arbitrary number of varieties for this world, and an arbitrary number of individuals for each of these varieties). The parts are maintained separately to minimize mutation operations. The world doesn't contain the varieties, nor do the varieties contain the individuals. Instead, there is a world, there is a collection of varieties, and there is a collection of a collection of individuals:

world
the one world
varieties
the collection (for the world) of variety
individualss
the collection (for the world) of collections (for the varieties) of individuals

Simulation

To start things, the space and world are initialized, an arbitrary number of varieties are created, and, for each variety, an arbitrary number of individuals are created. This gives a space and a (world, varieties, individualss) tuple. These two represent the simulation at any one moment in time. The simulation progresses by alternating between performing reduction and generation across the current (world, varieties, and individualss) tuple.

  1. The reduction generates the (rworld, rvarieties, rindividuals_in, rindividuals_out) tuple (e.g., highest individual in a variety, number of individuals in a variety, average height of individuals around each individual, etc.) from the current (world, varieties, individualss) tuple.
  2. The generation creates a new (world, varieties, individualss) tuple (e.g., growing an individual by replacing it with an identical but higher individual in the same spot, spawning new individuals of the same variety but smaller, etc.) from the current (world, varieties, individualss) and (rworld, rvarieties, rindividuals_in, rindividuals_out) tuples.

Reduction

Reduction is performed across across the (world, varieties, individualss) to extract the (rworld, rvarieties, rindividualss) parameters of interest.

rworld
reduction across all (world, variety, individual) tuples
rvarieties
collection of per variety reductions across all (world, variety, individual) tuples
rindividualss_in
collection of collections of per individual reductions across all (world, variety, individual) tuples where the later individual is in the the former individual's in range
rindividualss_out
collection of collections of per individual reductions across all (world, variety, individual) tuples where the former individual is in the the later individual's out range

The reductions are initialized by calling a user provided function with the data available at that level. Another user provided function is then called to update this data for each relevant (world, variety, individual) tuple below that level.

rworld

  • reduced across all (world, variety, individual) tuples at the world level (i.e., a single instance),
  • rworld is initialized by calling a user provided function with (world),
  • rworld is updated by repeatedly calling a user provided function with each (world, variety, individual) tuple, and
  • can be used to calculate desired world level statistics (e.g., highest individual in the simulation).

rvarieties

  • reduced across all (world, variety, individual) tuple at the variety level (i.e., a single instance for each variety),
  • each rvariety is initialized by calling a user provided function with the (world, variety) tuple for the variety,
  • each rvariety is updated by repeatedly calling a user provided function with each (world, variety, individual) tuple such that (world, variety) is the tuple for the variety, and
  • can be used to calculate desired variety level statistics across all individuals of that variety (e.g., highest individual for each variety).

rindividualss_in

  • reduced across all (world, variety, individual0, individual1) tuples at the individual level (i.e., a single instance for each individual),
  • each rindividual_in is initialized by calling a user provided function with the (world, variety, individual) tuple for the individual,
  • the in region for each individual is given by a user provided function passed the (world, variety, individual) tuple for the individual,
  • each rindividual_in is updated by repeatedly calling a user provided function with each (world, variety0, variety1, individual0, individual1) tuple such that (world, variety0, individual0) is the tuple for the individual and (world, variety1, individual1) is the tuple for the other individual in that individual's in range, and
  • can be used to calculate desired individual level statistics across all individuals encompassed by each individual's in range (e.g., tallest individual encompassed by the in range of each individual).

rindividualss_out

  • reduced across all (world, variety, individual0, individual1) tuples at the individual level (i.e., a single instance for each individual)
  • each rindividual_out is initialized by calling a user provided function with the (world, variety, individual) tuple for the individual,
  • the out region for each individual is given by a user provided function passed the (world, variety, individual) tuple for the individual,
  • each rindividual_out is updated by repeatedly calling a user provided function with each (world, variety0, variety1, individual0, individual1) tuple such that (world, variety0, individual0) is the tuple for the other individual whose out range the individual is in and (world, variety1, individual1) is the tuple for the individual, and
  • can be used to calculated desired individual level statistics across all individuals whose out range encompasses each individual (e.g., tallest individual whose out reach encompasses each individual).

Generation

The next step is generated from outside in. Each (individual) is mapped to a list of new individuals; each (variety, individuals) tuple is mapped to a list of new (variety, individuals); and the (world, varieties, individualss) tuple is mapped to a new (world, varieties, individualss) tuple.

For the individual and variety levels the corresponding component in the final (world, varieties, individualss) can be

  1. deleted by returning an empty list,
  2. left alone by returning a list with the a copy of the input,
  3. updated by returning a list with an updated copy of the input,
  4. replaced with an arbitrary number of new ones by returning a list of the replacement elements.

As there is only a single world returned, the world level only supports the second and third of these.

individual

  • each individual is mapped to a list of new individuals by calling a user defined function with the (world, variety, individual, rworld, rvariety, rindividual_in, rindividual_out) tuple for the individual,
  • for each variety, the combined collection of new individuals and from this mapping of current individuals becomes the set of individuals mapped with that variety at the variety level

variety

  • each variety and its associated individuals (produced by the above individual mappings) are mapped to a list of new varieties and associated individualss by calling a user defined function with the (world, variety, individuals, rworld, rvariety) tuple for the variety
  • the combined collection of new varieties and associated individualss from this mapping of current varieties and associated individualss become the set of varieties and associated individualss mapped with the world at the world level

world

  • the world and its associated varieties and individualss (produced above by the variety mappings) are mapped to a new world and associated varieties and individualss by calling a user defined function with the (world, varieties, individualss, rword) tuple
  • the new world and associated varieties and individualss from this mapping becomes the new (world, varieties, individualss) tuple

Complexity

The most complex part of the program is performing the individual to individual reductions efficiently. A double loop across all individuals (i.e., for each individual check every other individual to see it it falls in the desired region) would be \mathcal O(N^2), were N is the total number of individuals.[1]

To avoid this, the system maps the individuals to their location along a space filling 1D Z-curve and stores them in this sorted order. The mapping to a Z-curve is given by interleaving the binary digits of the coordinates. A key property of this mapping is that the Z-value of the upper-left and lower-right coordinates of a rectangular region bound the Z-values of all points within that region.

Combined with storing individuals in sorted order by their Z-value, this allows all the individuals in a rectangular region to be located in better than \mathcal O(N). The algorithm works by alternating between Z-values and individuals because not all Z-values between the lower and upper and bounds (the Z-values of the upper-left and lower-right corners of the region) fall within the region.

  • The first Z-value is the upper-left corner of the region (the Z-value of the upper-left corner of rectangular region is the the smallest Z-value within that region).
  • The first individual with a greater or equal Z-value is then located (this is \mathcal O(\mathrm{ln}(N)) due to the individuals being sorted by Z-value).
  • This located individual and all subsequent ones are processed until the first individual whose Z-value does not fall in the region is located.
  • If this individual's Z-value falls outside of the upper-bound (the Z-value of the lower-right corner) then all individuals have been located.
  • Otherwise the next point at which the Z-curve intersects the region is calculated and the process resumes with locating the first individual with a greater or equal Z-value.

The number of iterations in the above algorithm is a combination of the number of individuals in the region plus the number of times the algorithm switches between being inside and outside the the region. The later is bound by the number of times the Z-curve enters and exists the region. This, in turn, is bound by a multiple (a function of the precision of the Z-curve) of the region's boundary.

Assuming the number of individuals in a region is proportional to the area of the region, the former (the number of individuals in the region) will dominate the later (the number of times the Z-curve enters and exits the region). Assuming some upper bound on the size of all regions, it follows that finding all individuals in a given region is \mathcal O(\mathrm{ln}(N)).

The complexity of finding all individuals in the (globally bounded) regions about all individuals is then \mathcal O(N \mathrm{ln}(N)). The sorting of all individuals by their Z-values (required for the algorithm) is also \mathcal O(N \mathrm{ln}(N)). It follows that the overall complexity is \mathcal O(N \mathrm{ln}(N)).

Future

The current system can't handle overlapping in and out regions where the individuals themselves fall outside of both regions (i.e., individuals are located at a point within their regions so it is possible for the overlap of the regions to not include the individuals). The out version is also less desirable because it is a scatter operation. This forces simultaneous reducers to synchronize or maintain separate reduction space for later merger.

Replacing the individual_in and individual_out aggregation/reductions with just a single individual reduction to address these issues.

individual

  • reduced across all (world, variety, individual0, individual1) tuples at the individual level (i.e., a single instance for each individual)
  • each rindividual is initialized by calling a user provided function with the (world, variety, individual) tuple for the individual,
  • the region for each individual is given by a user provided function passed the (world, variety, individual) tuple the individual,
  • each rindividual is updated by repeatedly calling a user provided function with each (world, variety0, variety1, individual0, individual1) tuple such that (world, variety0, individual0) is the tuple for the individual and (world, variety1, individual1) is the tuple for the other individual whose range overlaps, and
  • can be used to calculated desired individual level statistics across all individuals whose ranges overlap with each individual (e.g., tallest individual whose reach overlaps).

The precision of the Z-curve plays a critical role in determining the constant factor in the overall \mathcal O(N \mathrm{ln}(N)) complexity. Less precise Z-values mean less stepping in and out of regions (and thus a smaller constant factor), but they also mean the Z-value filtering is more crude so more individuals are processed as being potentially in a region. Early profile results indicate that there could be significant performance gains to be made by reducing the current precision of the Z-values to obtain some optimal balance between these two effects.

  1. Saying f(x) is \mathcal O(g(x)) means \limsup_x f(x)/g(x) < \infty (i.e., f(x) is eventually bound by constant multiple of g(x)).