Learning objectives and review

Lecture 20

Learning objectives

After this class, you should be able to:

Explain stream scheduling in different generations of Nvidia GPUs.
Use the above knowledge to improve performance (i) when overlapping computation and data movement and (ii) when trying to execute kernels concurrently.

Reading assignment

UIUC Lecture 21b.
Read the Nvidia webinar: StreamsAndConcurrencyWebinar.pdf.
Look up internet resources to learn about timing using CUDA Events.

Exercises and review questions

Exercises and review questions on the current lecture's material

Analyze your assignment 3 code, which used double and triple buffering, for potential bottlenecks based on Fermi's queues. Make changes in your code to improve performance. Report the old and new performance results on Blackboard.

Preparation for the next lecture

Modify one of your earlier timing codes to use CUDA Events to perform timing. Is the time obtained close to your previous timing?
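
As a starting point, here is a minimal sketch of timing a single kernel launch with CUDA Events. The kernel scale and the helper time_scale_ms are hypothetical names made up for this illustration; they are not taken from any assignment code. Error checking is mostly omitted for brevity, and your own timing code will likely need to be adapted differently.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, present only so that there is something to time.
__global__ void scale(float *x, int n, float a) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

// Hypothetical helper: returns the GPU time in milliseconds for one
// launch of scale() on n elements, or -1 if the allocation fails.
float time_scale_ms(int n) {
    float *d_x = nullptr;
    if (cudaMalloc(&d_x, n * sizeof(float)) != cudaSuccess) return -1.0f;
    cudaMemset(d_x, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    // Record events in the same stream (here the default stream, 0)
    // immediately before and after the work to be timed.
    cudaEventRecord(start, 0);
    scale<<<blocks, threads>>>(d_x, n, 2.0f);
    cudaEventRecord(stop, 0);

    // Block the host until the stop event has actually been reached.
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);  // elapsed time in milliseconds

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_x);
    return ms;
}

int main() {
    float ms = time_scale_ms(1 << 20);
    printf("kernel time: %.3f ms\n", ms);
    return 0;
}
```

Because the events are recorded on the GPU, the measured interval excludes most host-side overhead, which is one reason the result may differ from a host timer placed around the same launch.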

Last modified: 26 Mar 2013