|Target usage: GPU jobs|
|System information: see monk system page in web portal|
|System status: see monk status page|
|Real time system data: see Ganglia monitoring page|
|Full list of SHARCNET systems|
Monk is SHARCNET's GPGPU cluster.
- Number of Cores: 432
- Number of Nodes: 54
- Interconnect: QDR InfiniBand
- Cores per node: 8
- GPUs per node: 2 Tesla M2070
  - Fermi, 448 cores, 1.15 GHz, 6 GB memory, 144 GB/s memory bandwidth, compute capability 2.0
- Memory per node: 48 GB
- CPU: Intel E5607 4 cores @ 2.26 GHz
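As a quick cross-check of the figures in the list above, the whole-cluster totals can be derived from the per-node values. This is only an arithmetic sketch; the variable names are made up for illustration, and the totals count every node, so the resources available to batch jobs may be lower because one node is reserved for development.

```shell
# Per-node values copied from the specification list above.
NODES=54
CORES_PER_NODE=8
GPUS_PER_NODE=2
CUDA_CORES_PER_GPU=448
MEM_PER_NODE_GB=48

# Whole-cluster totals (illustrative variable names).
TOTAL_CPU_CORES=$((NODES * CORES_PER_NODE))
TOTAL_GPUS=$((NODES * GPUS_PER_NODE))
TOTAL_CUDA_CORES=$((TOTAL_GPUS * CUDA_CORES_PER_GPU))
TOTAL_MEM_GB=$((NODES * MEM_PER_NODE_GB))

echo "CPU cores:   $TOTAL_CPU_CORES"
echo "GPUs:        $TOTAL_GPUS"
echo "CUDA cores:  $TOTAL_CUDA_CORES"
echo "Memory (GB): $TOTAL_MEM_GB"
```

The CPU core total agrees with the "Number of Cores: 432" entry in the list.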
Optimizing for Fermi architecture
Monk uses Fermi architecture GPUs with compute capability 2.0. To ensure your code is optimized correctly, consult the following documentation:
The key thing to remember is to always compile with the -arch=sm_20 flag when optimized code is needed.
For instructions on submitting GPU jobs, please see SHARCNET's GPU Accelerated Computing documentation.
Please note that interactive GPU jobs do not work. The cluster is optimized for batch usage; please use the development node described below for interactive work.
Monk has a single GPU-equipped node dedicated to development work. This node does not run user jobs and is similar to a login node, except that it is accessible only from within monk. You may log into this node via ssh to work interactively. The GPUs are set to shared compute mode, which means that multiple users can access the GPUs at the same time.
Once you have logged into monk, you can access the development node with ssh mon-devel1 or ssh monk-devel1 (this is actually node 54 at present). Once on the node, you can run your executable directly from the command line; there is no need to use sqsub. If your executable is called test.x, you would change into your working directory and execute:
./test.x
If the job is somewhat longer and you don't want to keep a terminal open while it runs, you can run it in the background in such a way that it will not terminate when you end your terminal session. To do this, run:
nohup ./test.x > test_output.out &
If you then log out, your job will continue running to completion, so you can log in later when it is done and examine the output in the test_output.out file.
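When you come back later it helps to know whether the background job is still running. One possible approach, sketched below, is to record the process ID at launch and test it afterwards. The helper names run_detached and is_running and the .pid file are hypothetical conveniences, not part of any SHARCNET tooling.

```shell
# Start a command so that it survives logout, and remember its process ID.
# usage: run_detached OUTPUT_FILE COMMAND [ARGS...]
run_detached() {
    rd_out="$1"
    shift
    # nohup makes the command ignore the hangup signal sent at logout;
    # stdout and stderr both go to the output file.
    nohup "$@" > "$rd_out" 2>&1 &
    # $! is the process ID of the most recent background command.
    echo "$!" > "$rd_out.pid"
}

# Succeeds (exit status 0) while the recorded process can still be signalled.
# usage: is_running OUTPUT_FILE
is_running() {
    # Signal 0 sends nothing; it only checks that the process exists.
    kill -0 "$(cat "$1.pid")" 2>/dev/null
}
```

On the development node this could be used as run_detached test_output.out ./test.x, followed in a later session by is_running test_output.out && echo "still running". Process IDs can eventually be reused by the system, so treat the check as a convenience rather than a guarantee.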
Keep in mind that the current CPU time limit for processes running on the development node is 12 hours, though most processes on this node should be much shorter than that, as it is meant for development and not production runs.
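For reference, the 12 hour limit expressed in seconds can be compared with what your shell reports. This sketch assumes the limit is visible through the shell's ulimit builtin; the page does not say how the limit is enforced, so the reported value may differ or read "unlimited".

```shell
# Documented CPU time limit on the development node, converted to seconds.
LIMIT_HOURS=12
LIMIT_SECONDS=$((LIMIT_HOURS * 3600))

# CPU time limit of the current shell, in seconds or "unlimited".
# Assumption: the site limit shows up here; it may be enforced another way.
CURRENT_LIMIT=$(ulimit -t)

echo "Documented limit:            $LIMIT_SECONDS seconds"
echo "This shell's CPU time limit: $CURRENT_LIMIT"
```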
Users must be aware of what other people are doing on the node and avoid over-requesting resources, primarily memory, as it will impact all users on the system. Use the free and uptime commands to see how busy the development node is.
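A quick check before starting work might look like the sketch below. It assumes a Linux node where /proc/loadavg is readable. The threshold of 8 (the number of cores per node) is only an illustrative rule of thumb, and nvidia-smi is included on the assumption that the NVIDIA driver tools are installed on the node.

```shell
CORES=8   # cores per node, from the specification list

# 1-minute load average (first field of /proc/loadavg).
LOAD1=$(cut -d ' ' -f 1 /proc/loadavg)

# Load averages and logged-in users, if uptime is available.
if command -v uptime >/dev/null 2>&1; then uptime; fi

# Memory usage in megabytes, if free is available.
if command -v free >/dev/null 2>&1; then free -m; fi

# GPU memory and utilization, if the NVIDIA tool is present (assumption).
if command -v nvidia-smi >/dev/null 2>&1; then nvidia-smi; fi

# awk does the comparison because the load average is not an integer.
if awk -v load="$LOAD1" -v cores="$CORES" 'BEGIN { exit !(load >= cores) }'; then
    echo "Load $LOAD1 is at or above $CORES cores: the node looks busy."
else
    echo "Load $LOAD1 is below $CORES cores."
fi
```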
To make your build match the architecture of monk, always run nvcc with the -arch=sm_20 flag, which ensures the executable targets the Compute Capability 2.0 GPU cards installed in monk. Without this flag, a more generic, less efficient executable is generated.
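A compile line along these lines would apply the flag. The source file name test.cu is hypothetical; test.x is the executable name used earlier on this page. The sketch prints the command and runs it only if nvcc and the source file are actually present. Note that CUDA toolkits from version 9 onward no longer accept sm_20, so this applies to the older toolkits that support Fermi.

```shell
SRC=test.cu                # hypothetical CUDA source file
EXE=test.x                 # executable name used earlier on this page
NVCC_FLAGS="-arch=sm_20"   # target compute capability 2.0 (Fermi)

# Show the command that would be run.
echo "nvcc $NVCC_FLAGS -o $EXE $SRC"

# Compile only when the compiler and the source file are available.
if command -v nvcc >/dev/null 2>&1 && [ -f "$SRC" ]; then
    nvcc $NVCC_FLAGS -o "$EXE" "$SRC"
fi
```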