

Overview

Hyper-Q is a new hardware/software feature of NVIDIA GPUs. It is available in GPUs with CUDA compute capability 3.5 and higher. At SHARCNET, this feature is available on the mosaic cluster (K20 GPUs), the copper cluster (K80 GPUs), and on some vis stations. It is also available on the P100 GPUs in the new Compute Canada clusters, cedar and graham.
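
If you are not sure whether a particular GPU supports Hyper-Q, you can check its compute capability. The following is a minimal sketch; it assumes a reasonably recent NVIDIA driver, since older versions of nvidia-smi do not support the compute_cap query field (the deviceQuery utility from the CUDA samples reports the same information).

# Print the name and CUDA compute capability of each visible GPU.
# Hyper-Q is available when the compute capability is 3.5 or higher.
nvidia-smi --query-gpu=name,compute_cap --format=csv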

According to NVIDIA,

Hyper-Q enables multiple CPU cores to launch work on a single GPU simultaneously, thereby dramatically increasing GPU utilization and significantly reducing CPU idle times. Hyper-Q increases the total number of connections (work queues) between the host and the GK110 GPU by allowing 32 simultaneous, hardware-managed connections (compared to the single connection available with Fermi). Hyper-Q is a flexible solution that allows separate connections from multiple CUDA streams, from multiple Message Passing Interface (MPI) processes, or even from multiple threads within a process. Applications that previously encountered false serialization across tasks, thereby limiting achieved GPU utilization, can see a dramatic performance increase without changing any existing code.

In our tests, Hyper-Q increases the total GPU flop rate even when the GPU is shared by unrelated CPU processes ("GPU farming"). This makes Hyper-Q particularly useful for CUDA codes with relatively small problem sizes, which on their own cannot efficiently saturate modern GPUs with thousands of cores, such as the K20.

Hyper-Q is not enabled by default, but enabling it (by starting the CUDA Multi-Process Service, MPS) is straightforward. If you use the GPU interactively, execute the following commands before running your CUDA code(s):

# Make only the first GPU visible and tell the MPS server where to keep
# its named pipes and log files:
export CUDA_VISIBLE_DEVICES=0
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-log
# Start the MPS control daemon in background (daemon) mode:
nvidia-cuda-mps-control -d

If you are using a scheduler, submit a job script which contains the above lines and then runs your code, as in the sketch below.
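
For example, a job script could look like the following. This is only a minimal sketch: it assumes the Slurm scheduler (used on cedar and graham) and a hypothetical executable name ./my_cuda_code; adjust the resource requests for your cluster and code.

#!/bin/bash
#SBATCH --gres=gpu:1           # request one GPU
#SBATCH --cpus-per-task=4      # CPU cores that will share the GPU
#SBATCH --time=0-01:00
#SBATCH --mem=4G

# Enable Hyper-Q by starting the MPS server for this job:
export CUDA_VISIBLE_DEVICES=0
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-log
nvidia-cuda-mps-control -d

# Run the CUDA code (hypothetical executable name):
./my_cuda_code

# Shut down the MPS daemon cleanly before the job ends:
echo quit | nvidia-cuda-mps-control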

You then benefit from Hyper-Q whenever more than one CPU process or thread accesses the GPU. This happens when you run an MPI/CUDA or OpenMP/CUDA code, or multiple instances of a serial CUDA code (GPU farming).
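
As an illustration of GPU farming, the single executable line in the job script above could be replaced by something like the sketch below, which launches several independent copies of a serial CUDA code on the same GPU (the executable name, its argument, and the number of copies are assumptions):

# Launch 4 independent instances of a serial CUDA code in the background;
# with MPS running, their kernels can execute concurrently through Hyper-Q.
for i in $(seq 1 4); do
    ./my_cuda_code --seed $i &    # hypothetical per-instance argument
done
wait    # keep the job script alive until all instances finish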

Many additional details on Hyper-Q can be found in this document: Multi Process Service (MPS) - NVIDIA Documentation.