Lecture 7
Learning objectives
After this class, you should be able to:
- Program the PS3, using the following functions: spe_image_open, spe_context_create, spe_program_load, spe_context_run, spe_out_mbox_status, spe_out_mbox_read, spe_in_mbox_status, spe_in_mbox_write, spe_ls_area_get, spe_context_destroy, spe_image_close, mfc_get, mfc_put, mfc_write_tag_mask, mfc_read_tag_status_all, spu_write_out_mbox, and spu_read_in_mbox.
- Give the DMA alignment and size restriction in mfc_get and mfc_put calls.
- Give peak floating point performance of the SPEs, the maximum memory bandwidth, the total EIB bandwidth, and the bandwidth to SPEs.
- Explain the need for the volatile qualifier on data variables used in DMA transfers.
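As a study aid for the DMA objective, here is a minimal sketch in portable C (it compiles on any host, not only on an SPE) of the size and alignment rules as they are commonly stated for MFC transfers: a transfer is 1, 2, 4, or 8 bytes, or a multiple of 16 bytes up to 16 KB; small transfers must be naturally aligned with the same low four address bits on both sides, and larger ones must be 16-byte aligned at both ends. The names dma_transfer_ok, dma_commands_needed, and DMA_MAX_BYTES are hypothetical helpers introduced for illustration; they are not part of libspe2 or the SPE extensions. Check the rules against the course references before relying on them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_MAX_BYTES 16384u   /* assumed limit for one transfer: 16 KB */

/* Hypothetical helper: returns 1 if one transfer of `size` bytes between
   local-store address `ls` and effective address `ea` follows the rules
   described above, and 0 otherwise. Zero-length transfers are rejected
   here for simplicity. */
static int dma_transfer_ok(uint32_t ls, uint64_t ea, size_t size)
{
    if (size == 1 || size == 2 || size == 4 || size == 8) {
        /* Small transfer: naturally aligned, and both addresses must have
           the same offset within a 16-byte quadword. */
        return ls % size == 0 && ea % size == 0 && (ls & 15u) == (ea & 15u);
    }
    /* Larger transfer: multiple of 16 bytes, at most 16 KB,
       with both addresses 16-byte aligned. */
    return size >= 16 && size % 16 == 0 && size <= DMA_MAX_BYTES
        && (ls & 15u) == 0 && (ea & 15u) == 0;
}

/* Hypothetical helper: number of transfers needed to move `total` bytes
   when each transfer carries at most DMA_MAX_BYTES. */
static size_t dma_commands_needed(size_t total)
{
    assert(total % 16 == 0);   /* this sketch only handles multiples of 16 */
    return (total + DMA_MAX_BYTES - 1) / DMA_MAX_BYTES;
}
```

For example, dma_transfer_ok(0x1000, 0x20000, 16384) accepts a full 16 KB transfer, while a 24-byte request is rejected because 24 is not a multiple of 16. Meeting these rules is only the minimum requirement; aligning both addresses to 128 bytes is generally recommended for best transfer performance.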
Reading assignment
- Read Section 4.1.2 of the Cell Redbook (on Blackboard -- course library).
- References
- Libspe2 (on Blackboard -- course library).
- SPE Extensions (on Blackboard -- course library).
Exercises and review questions
- Exercises and review questions on current lecture's material
- In Example 3 of Lecture 7, each SPE reads a portion of an array from main memory and computes the square of each element. If the data size is very large, it will not fit in the SPE's local store. In that case, you will need to bring pieces of the data to the SPE and compute their squares. Write code that does this, with and without overlap of computation and DMA transfers. What is the difference in performance? Report your results on the discussion board, under the Lecture 7 thread.
- Preparation for the next lecture
- Time some piece of code on an SPE using the spu_read_decrementer and spu_write_decrementer functions. Report your results on the discussion board, under the Lecture 7 thread.
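For the timing exercise, the arithmetic on the decrementer readings can be sketched in portable C as follows. The SPU decrementer is a 32-bit counter that counts down, so the elapsed tick count is the earlier reading minus the later one. The helpers decrementer_ticks and ticks_to_seconds are hypothetical names introduced for illustration, and the timebase frequency is an assumption that the caller must supply (on Linux it is typically listed as "timebase" in /proc/cpuinfo). On the SPE, the two readings would come from spu_read_decrementer, taken before and after the timed code, once the counter has been loaded with spu_write_decrementer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: ticks elapsed between two decrementer readings.
   The counter runs DOWN, so the earlier reading (start) is normally the
   larger one. Unsigned subtraction also gives the right answer if the
   counter wrapped around once between the two readings. */
static uint32_t decrementer_ticks(uint32_t start, uint32_t end)
{
    return start - end;
}

/* Hypothetical helper: convert ticks to seconds. timebase_hz is supplied
   by the caller and is an assumption about the machine being timed. */
static double ticks_to_seconds(uint32_t ticks, double timebase_hz)
{
    assert(timebase_hz > 0.0);
    return (double)ticks / timebase_hz;
}
```

Loading a large initial value into the decrementer before the first reading keeps the counter from wrapping during short measurements, which makes the results easier to check by hand.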
Last modified: 1 Feb 2010