This webinar is an extension of the GPU programming course taught at the Compute Ontario Summer School. It introduces NVIDIA Nsight (including the NVIDIA Visual Profiler) in detail, using simple example code running on SHARCNET’s GPU clusters. With Nsight, you will be able to determine how efficient your kernel code is and where its potential bottlenecks lie. Optimization techniques such as Instruction-Level Parallelism (ILP) and shuffle instructions will be used to obtain better performance. Relevant hardware background will also be covered to give a better understanding of instruction latency, occupancy, hardware utilization, memory bandwidth, and related concepts.