
Deep learning algorithms often do not require high numerical precision to produce satisfactory results, and can therefore benefit from the performance gains that reduced precision provides. To take advantage of this, recent generations of NVIDIA GPUs have offered increasing support for reduced-precision operations on FP16, INT8, INT4, and BOOL data types. This seminar will describe how to use such operations in CUDA code.
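As a flavor of what the seminar covers, the sketch below shows one common pattern: packing FP16 values into `__half2` pairs so a single hardware instruction performs two half-precision additions at once. The kernel name and sizes are illustrative; the intrinsics (`__hadd2`, from `cuda_fp16.h`) require a GPU of compute capability 5.3 or newer.

```cuda
#include <cuda_fp16.h>

// Illustrative kernel: elementwise FP16 vector addition using half2
// intrinsics. Each thread handles one __half2, i.e. two FP16 values,
// so one __hadd2 instruction performs two additions.
__global__ void add_fp16(const __half2 *a, const __half2 *b,
                         __half2 *c, int n_pairs) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n_pairs) {
        c[i] = __hadd2(a[i], b[i]);
    }
}

// Example launch (host side), assuming n_pairs packed FP16 pairs:
//   int threads = 256;
//   int blocks  = (n_pairs + threads - 1) / threads;
//   add_fp16<<<blocks, threads>>>(d_a, d_b, d_c, n_pairs);
```

Compared with FP32, this halves both memory traffic and register pressure, which is typically where the speedup of reduced precision comes from.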