Lecture 23
Learning objectives
After this class, you should be able to:
- Given an algorithm or code, identify where thread divergence may cause a performance penalty, and change the algorithm or code to reduce its performance impact.
- Describe how the arithmetic on gpu.cs.fsu.edu differs from the IEEE standard, and compare the level of support with that on the PS3 and with SSE.
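As an illustration of the first objective, the following is a minimal sketch of a divergent kernel and one possible restructuring. The kernel names, the array size, and the launch configuration are hypothetical choices made for this example, not taken from the course material. Note that the restructured kernel applies the two operations to different elements than the divergent one does; in a real algorithm the data layout would have to be rearranged to match, and whether that is worthwhile depends on the application.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Divergent version: even and odd threads of the same warp take different
// branches, so the warp may have to execute both paths one after the other.
__global__ void scale_divergent(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if (threadIdx.x % 2 == 0) {
        data[i] = data[i] * 2.0f;
    } else {
        data[i] = data[i] + 1.0f;
    }
}

// Restructured version: the condition has the same value for all threads of
// a warp (warpSize consecutive threads), so no warp executes both paths.
// Assumption: the algorithm tolerates assigning the work per warp instead of
// per thread parity.
__global__ void scale_warp_uniform(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if ((threadIdx.x / warpSize) % 2 == 0) {
        data[i] = data[i] * 2.0f;
    } else {
        data[i] = data[i] + 1.0f;
    }
}

int main() {
    const int n = 1024;            // hypothetical problem size
    const int threads = 256;       // hypothetical block size
    const size_t bytes = n * sizeof(float);
    static float host[n];
    float *dev = nullptr;
    cudaMalloc(&dev, bytes);

    // Run the divergent kernel on an array filled with 3.0f.
    for (int i = 0; i < n; ++i) host[i] = 3.0f;
    cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);
    scale_divergent<<<n / threads, threads>>>(dev, n);
    cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);
    printf("divergent:    data[0]=%g data[1]=%g\n", host[0], host[1]);

    // Run the warp-uniform kernel on a fresh copy of the same input.
    for (int i = 0; i < n; ++i) host[i] = 3.0f;
    cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);
    scale_warp_uniform<<<n / threads, threads>>>(dev, n);
    cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);
    printf("warp-uniform: data[0]=%g data[32]=%g\n", host[0], host[32]);

    cudaFree(dev);
    return 0;
}
```

Compiling with `nvcc -ptx` produces the PTX for both kernels. For branches as short as these, the compiler may replace the branch with predicated or select instructions, so the generated code is worth inspecting before drawing conclusions about the cost of divergence.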
Reading assignment
- GPU-6 on Blackboard, under the "course library" tab.
Exercises and review questions
- Exercises and review questions on the current lecture's material
- Write CUDA code with thread divergence, and check the generated PTX code to see whether predicated execution is used.
- Compare the results of some mathematical operations using fast math on CUDA, on the PS3, and on a conventional processor, using both single and double precision on the conventional processor. Post your answers on the discussion board.
- Preparation for the next lecture
- None. Work on optimizing the code for your group project.
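For the fast-math comparison exercise above, one possible starting point for the CUDA and conventional-processor parts is sketched below. It evaluates the sine function on the GPU with both the standard `sinf` and the fast intrinsic `__sinf`, and on the host in single and double precision. The kernel name and the test inputs are hypothetical, and the PS3 part of the exercise is not covered. The sketch assumes compilation without `-use_fast_math`, since that flag makes `sinf` compile to the fast intrinsic and would hide the difference.

```cuda
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

// Evaluate sin(x) on the device with the standard function and with the
// fast intrinsic, so the two results can be compared on the host.
__global__ void eval_sin(const float *x, float *accurate, float *fast, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        accurate[i] = sinf(x[i]);   // standard single-precision sine
        fast[i] = __sinf(x[i]);     // fast-math intrinsic, reduced accuracy
    }
}

int main() {
    const int n = 4;
    // Hypothetical inputs; larger arguments tend to expose larger errors.
    float x[n] = {0.5f, 3.0f, 100.0f, 10000.0f};
    float accurate[n], fast[n];
    const size_t bytes = n * sizeof(float);

    float *d_x = nullptr, *d_accurate = nullptr, *d_fast = nullptr;
    cudaMalloc(&d_x, bytes);
    cudaMalloc(&d_accurate, bytes);
    cudaMalloc(&d_fast, bytes);

    cudaMemcpy(d_x, x, bytes, cudaMemcpyHostToDevice);
    eval_sin<<<1, n>>>(d_x, d_accurate, d_fast, n);
    cudaMemcpy(accurate, d_accurate, bytes, cudaMemcpyDeviceToHost);
    cudaMemcpy(fast, d_fast, bytes, cudaMemcpyDeviceToHost);

    for (int i = 0; i < n; ++i) {
        float host_single = sinf(x[i]);          // host, single precision
        double host_double = sin((double)x[i]);  // host, double precision
        printf("x=%g  gpu sinf=%.9g  gpu __sinf=%.9g  cpu float=%.9g  cpu double=%.17g\n",
               x[i], accurate[i], fast[i], host_single, host_double);
    }

    cudaFree(d_x);
    cudaFree(d_accurate);
    cudaFree(d_fast);
    return 0;
}
```

The same pattern can be repeated for other operations, for example division with `__fdividef` or exponentiation with `__expf`, to see where the fast versions differ most from the host results.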
Last modified: 21 Apr 2010