Lecture 7
Learning objectives
After this class, you should be able to:
- Describe the following CUDA features: cudaStreamCreate, cudaStreamDestroy, cudaMemcpyAsync, and calling kernels with a stream parameter.
- Use the above features to hide data transfer overhead through multiple buffering.
- Write code that uses memory coalescing to reduce the data transfer overhead in accessing DRAM.
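As a point of reference for the objectives above, here is a minimal sketch of double buffering with two streams. It is not the code shown in class: the kernel `scale`, the function `run_double_buffered`, and the chunk sizes are hypothetical names and values chosen for illustration, and error checking is omitted for brevity. The kernel's indexing also illustrates coalescing, since thread i accesses element i.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: scales each element in place. Thread i touches
// element i, so consecutive threads of a warp access consecutive addresses,
// which lets the hardware coalesce the DRAM accesses.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Processes `chunks` chunks of `chunkSize` floats using two device buffers,
// each tied to its own stream. Returns the number of incorrect results.
int run_double_buffered(int chunks, int chunkSize) {
    const int kBuffers = 2;  // double buffering (assumed for this sketch)
    const size_t chunkBytes = (size_t)chunkSize * sizeof(float);
    const size_t total = (size_t)chunks * chunkSize;

    // Pinned (page-locked) host memory, so the async copies can overlap
    // with kernel execution.
    float *host = NULL;
    cudaMallocHost((void **)&host, chunks * chunkBytes);
    for (size_t i = 0; i < total; ++i) host[i] = 1.0f;

    cudaStream_t streams[kBuffers];
    float *dev[kBuffers];
    for (int b = 0; b < kBuffers; ++b) {
        cudaStreamCreate(&streams[b]);
        cudaMalloc((void **)&dev[b], chunkBytes);
    }

    const int threads = 256;
    const int blocks = (chunkSize + threads - 1) / threads;
    for (int c = 0; c < chunks; ++c) {
        int b = c % kBuffers;  // alternate between the two buffers
        float *h = host + (size_t)c * chunkSize;
        // Operations in one stream run in order; operations in different
        // streams may overlap, so one chunk's copy can hide behind
        // another chunk's kernel.
        cudaMemcpyAsync(dev[b], h, chunkBytes, cudaMemcpyHostToDevice, streams[b]);
        scale<<<blocks, threads, 0, streams[b]>>>(dev[b], chunkSize, 2.0f);
        cudaMemcpyAsync(h, dev[b], chunkBytes, cudaMemcpyDeviceToHost, streams[b]);
    }
    cudaDeviceSynchronize();  // wait for all streams to finish

    int errors = 0;
    for (size_t i = 0; i < total; ++i)
        if (host[i] != 2.0f) ++errors;

    for (int b = 0; b < kBuffers; ++b) {
        cudaFree(dev[b]);
        cudaStreamDestroy(streams[b]);
    }
    cudaFreeHost(host);
    return errors;
}

int main() {
    int errors = run_double_buffered(8, 1 << 20);
    printf("%s\n", errors == 0 ? "all results correct" : "mismatch found");
    return errors != 0;
}
```

Because each device buffer is only ever used from its own stream, a buffer is not overwritten by a later chunk until the earlier chunk's copy back to the host has completed. How much overlap is actually achieved depends on the device's copy engines and should be measured rather than assumed.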
Reading assignment
- Read Lecture 7, the UIUC Lecture 6-7 slides (through slide 15), and the UIUC Lecture 7 slides.
- Chapter 6 of the text.
Exercises and review questions
- Exercises and review questions on the current lecture's material
  - Modify the code shown in class so that it uses triple buffering to reduce the data transfer overhead. Post your performance results on the Blackboard discussion forum, compared against a version that does not hide data transfer.
  - Compare the performance of copying matrix data from DRAM to shared memory using the two options discussed in class. Present your results on the Blackboard discussion forum.
- Preparation for the next lecture
  - None.
Last modified: 29 Jan 2013