
31.6.1 Checking Parallel Performance

The performance meter reports the wall-clock time elapsed during a computation, as well as message-passing statistics. Because the performance meter is always active, you can print the statistics at any time after the computation has completed. To view the current statistics, use the Parallel/Timer/Usage menu item.

Parallel $\rightarrow$ Timer $\rightarrow$ Usage

Performance statistics will be printed in the text window (console).

To reset the performance meter so that past statistics are excluded from future reports, use the Parallel/Timer/Reset menu item.

Parallel $\rightarrow$ Timer $\rightarrow$ Reset

The following example demonstrates how the current parallel statistics are displayed in the console window:

Performance Timer for 1 iterations on 4 compute nodes
 Average wall-clock time per iteration:              4.901 sec
 Global reductions per iteration:                      408 ops
 Global reductions time per iteration:               0.000 sec (0.0%)
 Message count per iteration:                          801 messages
 Data transfer per iteration:                        9.585 MB
 LE solves per iteration:                               12 solves
 LE wall-clock time per iteration:                   2.445 sec (49.9%)
 LE global solves per iteration:                        27 solves
 LE global wall-clock time per iteration:            0.246 sec (5.0%)
 AMG cycles per iteration:                              64 cycles
 Relaxation sweeps per iteration:                     4160 sweeps
 Relaxation exchanges per iteration:                   920 exchanges

 Total wall-clock time:                              4.901 sec
 Total CPU time:                                    17.030 sec

A description of the parallel statistics is as follows:

The most relevant quantity is the Total wall-clock time. To gauge parallel performance (speedup and efficiency), compare this value with the Total wall-clock time from a serial analysis; to obtain the statistics from a serial analysis, the command line should contain -t1. In lieu of a serial analysis, the ratio of Total CPU time to Total wall-clock time gives an approximation of the parallel speedup.
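The comparison described above can be sketched in a few lines of Python. This is only an illustration, not part of the solver: the function names (parse_report, approx_speedup, efficiency, measured_speedup) are hypothetical, the report text is an excerpt of the console example shown earlier, and efficiency is assumed to follow the usual definition of speedup divided by the number of compute nodes.

```python
import re

# Excerpt of the console report shown earlier in this section.
REPORT = """\
Performance Timer for 1 iterations on 4 compute nodes
 Average wall-clock time per iteration:              4.901 sec
 LE wall-clock time per iteration:                   2.445 sec (49.9%)

 Total wall-clock time:                              4.901 sec
 Total CPU time:                                    17.030 sec
"""


def parse_report(text):
    """Return (compute nodes, total wall-clock sec, total CPU sec)."""
    nodes = int(re.search(r"on\s+(\d+)\s+compute nodes", text).group(1))
    wall = float(
        re.search(r"Total wall-clock time:\s+([\d.]+)\s+sec", text).group(1)
    )
    cpu = float(re.search(r"Total CPU time:\s+([\d.]+)\s+sec", text).group(1))
    return nodes, wall, cpu


def approx_speedup(wall, cpu):
    """Approximate speedup when no serial run is available: CPU / wall."""
    return cpu / wall


def measured_speedup(serial_wall, parallel_wall):
    """Speedup from a serial (-t1) run compared with the parallel run."""
    return serial_wall / parallel_wall


def efficiency(speedup, nodes):
    """Assumed definition: speedup divided by the number of compute nodes."""
    return speedup / nodes


nodes, wall, cpu = parse_report(REPORT)
speedup = approx_speedup(wall, cpu)
print(f"approximate speedup: {speedup:.2f} on {nodes} compute nodes")
print(f"approximate efficiency: {efficiency(speedup, nodes):.0%}")
```

For the example report, the ratio 17.030 / 4.901 gives an approximate speedup of about 3.47 on 4 compute nodes, or roughly 87% efficiency. When a serial run is available, measured_speedup with the two Total wall-clock times is the more reliable figure, since the CPU-time ratio is only an approximation.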


© Fluent Inc. 2006-09-20