From Documentation
Jump to: navigation, search
(Controlling Program Execution)
 
(24 intermediate revisions by 4 users not shown)
Line 6: Line 6:
 
with a graphical user interface (GUI).  It’s main intended use is for
 
with a graphical user interface (GUI).  It’s main intended use is for
 
debugging parallel MPI codes, though it can also be used with serial, threaded (OpenMP / pthreads), and GPU (CUDA; also mixed MPI and CUDA) programs.  The product was
 
debugging parallel MPI codes, though it can also be used with serial, threaded (OpenMP / pthreads), and GPU (CUDA; also mixed MPI and CUDA) programs.  The product was
developed by Allinea (U.K.). It is installed on many SHARCNET
+
developed by Allinea (U.K.). It is installed on Graham cluster. The DDT software page can be
clusters. The DDT software page can be
+
 
found [https://www.sharcnet.ca/my/software/show/36 here].
 
found [https://www.sharcnet.ca/my/software/show/36 here].
  
 
DDT supports C, C++, and Fortran 77 / 90 / 95 / 2003.  Detailed
 
DDT supports C, C++, and Fortran 77 / 90 / 95 / 2003.  Detailed
documentation (the User Guide) is available as a PDF file on clusters
+
documentation (the User Guide) is available as a PDF file on Graham, in
where DDT is installed (currently these are orca, angel, kraken, and requin), in
+
  
  /opt/sharcnet/ddt/*/doc
+
  /cvmfs/restricted.computecanada.ca/easybuild/software/2017/Core/allinea/*/doc/
  
 
==Preparing your program==
 
==Preparing your program==
  
The code has to be compiled with the switch –g, which tells the
+
The code has to be compiled with the switch -g, which tells the
 
compiler to generate symbolic information required by any
 
compiler to generate symbolic information required by any
 
debugger. Normally, all optimizations have to be turned off. For
 
debugger. Normally, all optimizations have to be turned off. For
 
example,
 
example,
  
  f77 –g –o program program.f
+
  f77 -g -o program program.f
  
  mpicc –g –o code code.c
+
  mpicc -g -o code code.c
  
For CUDA code (on angel), you can first compile the CUDA part (*.cu files) using nvcc,
+
For CUDA code, you can first compile the CUDA part (*.cu files) using nvcc,
  
 
  nvcc -g -G -c cuda_code.cu
 
  nvcc -g -G -c cuda_code.cu
Line 38: Line 36:
  
 
  mpicc -g main.c cuda_code.o -lcudart
 
  mpicc -g main.c cuda_code.o -lcudart
 
  
 
==Launching DDT==
 
==Launching DDT==
Simply type  
+
 
 +
DDT is a GUI application, so one has to set up the environment properly to run X-windows (graphical) applications on Graham:
 +
 
 +
** Microsoft Windows: use free mobaxterm app
 +
** Linux / Unix: nothing to install
 +
** Mac: install a free app XQuartz
 +
 
 +
In all cases, add "-Y" to all your ssh commands, for X11 tunelling.
 +
 
 +
DDT is an interactive GUI application, so before using it one has to first allocate a compute node (or nodes) with salloc, e.g.
 +
 
 +
salloc -A account_name --x11 --time=0-3:00 --mem-per-cpu=4G --ntasks=4  (for CPU codes)
 +
salloc -A account_name --x11 --time=0-3:00 --mem-per-cpu=4G --ntasks=1 --gres=gpu:1  (for GPU codes)
 +
 
 +
Once the resource becomes available, load the corresponding DDT module:
 +
 
 +
module load allinea-cpu  , or
 +
module load allinea-gpu
 +
 
 +
For MPI codes, one also has to execute the following command:
 +
 
 +
export OMPI_MCA_pml=ob1
 +
 
 +
or else the Message queue display feature will not work.
 +
 
 +
To debug a code, simply type  
  
 
  ddt program [optional arguments]
 
  ddt program [optional arguments]
Line 78: Line 100:
  
 
*Process Control And Process Groups:
 
*Process Control And Process Groups:
 
 
**Ability to group many process together, with a one-click access to the whole group
 
**Ability to group many process together, with a one-click access to the whole group
**Three predefined groups: All, Root, Workers.
+
**Three predefined groups: All, Root, Workers. (Newest DDT versions only have one group - All).
 
**Groups can be created, deleted, or modified by the user at any time
 
**Groups can be created, deleted, or modified by the user at any time
  
Line 86: Line 107:
  
 
*Starting, Stopping And Restarting A Program
 
*Starting, Stopping And Restarting A Program
 
 
**Session Control dialog
 
**Session Control dialog
 
*Stepping Through A Program
 
*Stepping Through A Program
Line 102: Line 122:
 
*Synchronizing Processes in a Group
 
*Synchronizing Processes in a Group
 
**"Run to here" command (right mouse click)
 
**"Run to here" command (right mouse click)
*Can work with individual processes (by changing “Focus”)
+
*Can work with individual processes or threads (by changing “Focus”)
 
*Setting A Watch (for the current process)
 
*Setting A Watch (for the current process)
 
*Working with stacks
 
*Working with stacks
Line 170: Line 190:
 
==Message queues==
 
==Message queues==
  
*View -> Message Queues in the control panel.
+
*View -> Message Queues (older versions), or Tools -> Message Queues in newer versions, in the control panel.
 
*Produces both a graphical view and a table for all active communications.
 
*Produces both a graphical view and a table for all active communications.
 
*Helps to debug such MPI problems as deadlocks etc.
 
*Helps to debug such MPI problems as deadlocks etc.
Line 195: Line 215:
 
**Useful for checking function arguments
 
**Useful for checking function arguments
 
*Detecting memory leaks
 
*Detecting memory leaks
**"View->Current Memory Usage" window
+
**"View->Current Memory Usage" window (Tools->Current Memory Usage in newer versions)
 
**Shows current memory usage across processes in the group
 
**Shows current memory usage across processes in the group
 
**Click on a color bar to get allocation details
 
**Click on a color bar to get allocation details
Line 206: Line 226:
  
 
*Memory Statistics
 
*Memory Statistics
**Menu option "View->Overall Memory Stats"
+
**Menu option "View->Overall Memory Stats" ("Tools->Overall Memory Stats" in newer versions).
  
 
[[File:ddt_wiki18.png]]
 
[[File:ddt_wiki18.png]]
Line 223: Line 243:
 
**Setting thread-specific breakpoints
 
**Setting thread-specific breakpoints
 
**Compare expressions across threads, etc
 
**Compare expressions across threads, etc
*But:
 
**Only one group – all threads.
 
 
=Using DDT=
 
 
==Running DDT in SHARCNET==
 
 
*The environment has to be set properly to run X-windows (graphical) applications on our clusters.
 
**Microsoft Windows: X-win (commercial program),  cygwin (not easy to set up) or Xming.
 
**Linux / Unix: add “ForwardX11 yes” – in    
 
 
/etc/ssh/ssh_config, or
 
~/.ssh/config
 
 
**To test, ssh to a cluster and type xterm
 
 
*On SHARCNET clusters, DDT is configured to use the test queue
 
**Almost immediate allocation
 
**One hour limit per debugging session
 
**Other jobs may be suspended
 
  
==OpenMP specifics==
+
==Debugging GPU programs==
  
*Currently the GUI is optimized for MPI and serial codes, but not for OpenMP.
+
*Compilation instructions were [[#Preparing your program|given earlier]].
*To make life easier, I’ve created two DDT configuration files, config.ddt (it is created automatically in ~/.ddt when you run DDT for the first time).
+
*Can be a mixed MPI/CUDA code, but can only use one CPU and one GPU per node, up to 8 nodes.
**To debug an OpenMP code, first do
+
  
cp ~syam/OMP/config.ddt ~/.ddt
+
[[File:ddt_wiki21.jpg]]
  
**To switch back to MPI, exit DDT, and run
 
  
cp ~syam/MPI/config.ddt ~/.ddt
+
[[Category:Tutorials]]
 +
<!--checked2016-->

Latest revision as of 15:42, 9 January 2019

Overview of DDT

Introduction

The Distributed Debugging Tool (DDT) is a powerful commercial debugger with a graphical user interface (GUI). It’s main intended use is for debugging parallel MPI codes, though it can also be used with serial, threaded (OpenMP / pthreads), and GPU (CUDA; also mixed MPI and CUDA) programs. The product was developed by Allinea (U.K.). It is installed on Graham cluster. The DDT software page can be found here.

DDT supports C, C++, and Fortran 77 / 90 / 95 / 2003. Detailed documentation (the User Guide) is available as a PDF file on Graham, in

/cvmfs/restricted.computecanada.ca/easybuild/software/2017/Core/allinea/*/doc/

Preparing your program

The code has to be compiled with the switch -g, which tells the compiler to generate symbolic information required by any debugger. Normally, all optimizations have to be turned off. For example,

f77 -g -o program program.f
mpicc -g -o code code.c

For CUDA code, you can first compile the CUDA part (*.cu files) using nvcc,

nvcc -g -G -c cuda_code.cu

and then link with the non-CUDA part of the code, using cc for serial

cc -g main.c cuda_code.o -lcudart

or mpicc for mixed MPI/CUDA code:

mpicc -g main.c cuda_code.o -lcudart

Launching DDT

DDT is a GUI application, so one has to set up the environment properly to run X-windows (graphical) applications on Graham:

    • Microsoft Windows: use free mobaxterm app
    • Linux / Unix: nothing to install
    • Mac: install a free app XQuartz

In all cases, add "-Y" to all your ssh commands, for X11 tunelling.

DDT is an interactive GUI application, so before using it one has to first allocate a compute node (or nodes) with salloc, e.g.

salloc -A account_name --x11 --time=0-3:00 --mem-per-cpu=4G --ntasks=4  (for CPU codes)
salloc -A account_name --x11 --time=0-3:00 --mem-per-cpu=4G --ntasks=1 --gres=gpu:1  (for GPU codes)

Once the resource becomes available, load the corresponding DDT module:

module load allinea-cpu  , or
module load allinea-gpu

For MPI codes, one also has to execute the following command:

export OMPI_MCA_pml=ob1

or else the Message queue display feature will not work.

To debug a code, simply type

ddt program [optional arguments]

After a couple of seconds, you will be presented with the following window:

Ddt wiki1.png

User interface

DDT uses a tabbed-document interface. Each component of DDT is a dockable window, which may be dragged around by a handle. Components can also be double-clicked, or dragged outside of DDT, to form a new window. Some of the user-modified parameters and windows are saved by right-clicking and selecting a save option in the corresponding window (Groups; Evaluations; Breakpoints etc).

DDT also has the ability to load and save all these options concurrently to minimize the inconvenience in restarting sessions. Saving the session stores such things as process groups, the contents of the Evaluate window and more. When DDT begins a session, source code is automatically found from the information compiled in the executable.

The "Find" and "Find In Files" dialogs are found from the "Search" menu. The "Find" dialog will find occurrences of an expression in the currently visible source file. The "Find In Files" dialog searches all source and header files associated with your program and lists the matches in a result box. Click on a match to display the file in the main Source Code window and highlight the matching line.

DDT has a "Goto line" function which enables the user to go directly to a line of code (in the "Search" menu, or Ctrl-G).

Controlling Program Execution

  • Process Control And Process Groups:
    • Ability to group many process together, with a one-click access to the whole group
    • Three predefined groups: All, Root, Workers. (Newest DDT versions only have one group - All).
    • Groups can be created, deleted, or modified by the user at any time

Ddt wiki2.png

  • Starting, Stopping And Restarting A Program
    • Session Control dialog
  • Stepping Through A Program
    • Play/Continue, Pause, Step Into, Step Over, Step Out
  • Setting Breakpoints
    • All breakpoints are listed under the breakpoints tab.
    • You can suspend, jump to, delete, load or save breakpoints.
    • Breakpoints can easily be made conditional:

Ddt wiki3.png


Ddt wiki4.png

  • Synchronizing Processes in a Group
    • "Run to here" command (right mouse click)
  • Can work with individual processes or threads (by changing “Focus”)
  • Setting A Watch (for the current process)
  • Working with stacks
    • Moving (Down / Up / Bottom Stack Frame)
    • Aligning of stacks for parallel codes (Ctrl-A)
  • Viewing Stacks in Parallel
    • Stacks tab.
    • Shows a tree of functions, merged from every process in the group
    • Click on any branch to see its location in the Source Code viewer
    • Hover the mouse to see the list of the process ranks at this location in the code
    • Can automatically gather the processes at a function together in a new group

Ddt wiki5.png

  • Browsing Source Code
    • Highlights the current source line
    • Different color coding for synchronous and asynchronous state in the group
  • Simultaneously Viewing Multiple Files
    • Right click to split the source window into two or more, with each one having its own set of tabs.
  • Signal Handling: will stop on the following signals:
    • SIGSEGV (segmentation fault)
    • SIGFPE (Floating Point Exception)
    • SIGPIPE (Broken Pipe)
    • SIGILL (Illegal Instruction)
  • To enable FPE handling, extra compiler options are required:
    • Intel: -fp0

Variables and data

  • Current Line tab
    • Viewing all the variable for the current line(s) (click and drag for multiple lines)
  • Locals tab
    • Shows all local variables for the process

Ddt wiki6.png

  • Evaluate window
    • Can be used to view values for arbitrary expressions and global variables
  • Support for Fortran modules
    • Fortran Modules tab in the Project Navigator window
  • Changing Data Values
    • Right-click and select "Edit value" (in Evaluate window)
  • Examining Pointers
    • Drag a pointer into the Evaluate window
    • Can be viewed as Vector, Reference, or Dereference (right-click to choose).
  • Multi-Dimensional Arrays
    • Can be viewed in the Variable View
    • Multi-Dimensional Array viewer – visualization of a 2-D slice of an array using OpenGL

Ddt wiki7.png


Ddt wiki8.png

  • Cross-Process Comparison
    • Right-click on a variable name, or
    • From View menu: Cross-Process/Thread Comparison (type in any valid expression)
    • Three modes – Compare, Statistics, and Visualize

Ddt wiki9.png


Ddt wiki10.png


Ddt wiki13.png

Message queues

  • View -> Message Queues (older versions), or Tools -> Message Queues in newer versions, in the control panel.
  • Produces both a graphical view and a table for all active communications.
  • Helps to debug such MPI problems as deadlocks etc.

Ddt wiki11.png


Ddt wiki12.png

Memory Debugging

  • Can intercept memory allocation and de-allocation calls and perform lots of complex heap- and bounds- checking.
  • Off by default, can be turned on before starting a debugging session (in Advanced settings).
  • Different levels of memory debugging – from minimal to high.

Ddt wiki14.png

  • On error, stops with a message like

Ddt wiki15.png

  • Check Validity
    • In Evaluate window, right-click to "Check pointer is valid"
    • Useful for checking function arguments
  • Detecting memory leaks
    • "View->Current Memory Usage" window (Tools->Current Memory Usage in newer versions)
    • Shows current memory usage across processes in the group
    • Click on a color bar to get allocation details
    • For more details, choose "Table View"

Ddt wiki16.png


Ddt wiki17.png

  • Memory Statistics
    • Menu option "View->Overall Memory Stats" ("Tools->Overall Memory Stats" in newer versions).

Ddt wiki18.png


Ddt wiki19.png


Ddt wiki20.png

Debugging OpenMP programs

  • The current version of DDT has almost the same OpenMP functionality as for MPI:
    • Single click access to threads
    • Viewing stacks in parallel
    • Setting thread-specific breakpoints
    • Compare expressions across threads, etc

Debugging GPU programs

  • Compilation instructions were given earlier.
  • Can be a mixed MPI/CUDA code, but can only use one CPU and one GPU per node, up to 8 nodes.

Ddt wiki21.jpg