Lecture 2
Learning objectives
After this class, you should be able to:
- Describe the following architectural features: pipelining, superscalar execution, out-of-order execution, and speculative execution.
- Describe the difference between the following types of parallelism: instruction-level parallelism, data-level parallelism, and thread-level parallelism.
- Describe the four categories in Flynn's classification.
- Explain the difference between shared memory and shared network computers.
- Explain the cache coherence problem and two ways of handling it.
- Explain the differences in the approaches conventional multicore processors, GPUs, and the Cell processor use to deal with high data transfer costs.
Reading assignment
- Review material on computer architecture on topics related to those mentioned under the learning objectives.
- Read Chapter 1 of Kirk and Hwu's GPU book (GPU-1 on Blackboard -- course library).
- Review pthreads material from your Operating Systems course.
Exercises and review questions
- Exercises and review questions on current lecture's material
- Write two programs that perform the same number of floating point operations, one that is easy to pipeline and one that is not. Compare their performance in terms of the number of floating point operations per second.
- Write two programs having the same number of cache misses, but one where the cache access pattern is easy to predict and another where it is not. Compare the time taken by the two codes.
- Write a simple matrix multiplication code. What is the performance in terms of Gflop/s?
- Write a simple matrix vector multiplication code. What is the performance in terms of Gflop/s?
- Preparation for the next lecture
- Write a program that will create four pthreads, where each thread outputs "Hello World from n" and the value of n is distinct for each thread.
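As a starting point for the pthreads preparation exercise, here is a minimal sketch, not a reference solution. The names hello_arg, hello, and run_hello_threads are hypothetical and chosen only for illustration. It assumes that n is simply the loop index 0 to 3, and it gives each thread its own argument slot so that no two threads read the same variable.

```c
#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 4  /* the exercise asks for four threads */

/* Hypothetical per-thread argument: one slot per thread, so the
   threads never share (or race on) the same id variable. */
typedef struct {
    int id;       /* the value of n printed by this thread */
    int greeted;  /* set to 1 by the thread after it has printed */
} hello_arg;

/* Thread entry point: prints the greeting with this thread's id. */
static void *hello(void *p)
{
    hello_arg *arg = (hello_arg *)p;
    printf("Hello World from %d\n", arg->id);
    arg->greeted = 1;
    return NULL;
}

/* Creates NUM_THREADS threads and waits for all of them to finish.
   Returns the number of threads that printed, or -1 on error. */
int run_hello_threads(void)
{
    pthread_t threads[NUM_THREADS];
    hello_arg args[NUM_THREADS];
    int count = 0;

    for (int i = 0; i < NUM_THREADS; i++) {
        args[i].id = i;        /* assumption: n is the loop index */
        args[i].greeted = 0;
        if (pthread_create(&threads[i], NULL, hello, &args[i]) != 0) {
            /* Wait for the threads already started before giving up,
               since they still point into args[]. */
            for (int j = 0; j < i; j++)
                pthread_join(threads[j], NULL);
            return -1;
        }
    }

    /* pthread_join also makes each thread's write to greeted visible. */
    for (int i = 0; i < NUM_THREADS; i++) {
        if (pthread_join(threads[i], NULL) != 0)
            return -1;
        count += args[i].greeted;
    }
    return count;
}
```

To try it, call run_hello_threads() from main and compile with the -pthread flag (for example, with gcc or clang). The four lines may appear in any order, because the threads are scheduled independently. Passing the address of the loop variable i to every thread, instead of a separate slot per thread, is a common mistake that can make several threads print the same value.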
Last modified: 11 Jan 2010