Due date: Tuesday, 18 Sep 2001
How to turn in: Turn in a hardcopy to me in class, at 2 pm.
Note: You must justify all your answers.
(Was analysis of matrix-vector multiplication with row-wise striped distribution.)
Due date: Thursday, 25 Oct 2001
How to turn in: Email me a tar file containing your code and makefile, at ashoks@modi4.ncsa.uiuc.edu, by 3 pm on the due date.
Note: (i) Your code should run on the Origin 2000 at NCSA,
and you must have a makefile. (ii) The tar file should not contain any
executable or object files. (iii) The tar file should be called
hw3.tar. Untarring it (tar xvf hw3.tar) should create a directory
called hw3 with the required files in it. (iv) Typing make in the hw3
directory should create the executable hw3.
The program: You will write a program in C to multiply two
matrices (C = A*B) in parallel using MPI, with a block-checkerboard
distribution of the matrices across the processes. You may use as many
MPI features as you wish. You should write a file called f.c which
defines a function called double f(char matrix, int i, int j). This
function gives the value of the (i,j)th entry of the A matrix when its
first argument is 'a', and the corresponding element of the B matrix
when the first argument is 'b'. In your code, let A be defined by
A(i,j) = 0.5*i+j and B by B(i,j) = i+0.5*j. (The first element is
indexed (1,1), rather than (0,0).) The file f.c should not contain
anything else. I will replace f.c with a few other function
definitions in my tests.
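For illustration, a minimal sketch of what f.c could look like for the example matrices above. The choice to return the B entry for any first argument other than 'a' is an assumption; the assignment only specifies the behavior for 'a' and 'b'.

```c
/* f.c: entry function for the example matrices.
   i and j are 1-based, as the assignment specifies. */
double f(char matrix, int i, int j)
{
    if (matrix == 'a')
        return 0.5 * i + j;   /* A(i,j) = 0.5*i + j */
    return i + 0.5 * j;       /* B(i,j) = i + 0.5*j (assumed default) */
}
```

Because the rest of the program only calls f, swapping in a different f.c changes the input matrices without touching the multiplication code.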
Your code should take a command line argument N. A, B, and
C will then be N x N matrices. Your program should output
C in row major order to stdout, with each row separated
by a '\n'. You may assume that the number of processors is a perfect
square, and that the square root of the number of processors divides
N. Your code should perform the multiplication using the
systolic algorithm we discussed in class, and not by directly applying
a formula to the example application. Your code should work even when
I change f. Furthermore, you should actually implement
the matrix multiplication algorithm yourself, rather than using a
library created by someone else!
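The block schedule of a systolic multiplication can be checked serially before any MPI code is written. The sketch below assumes the algorithm from class resembles Cannon's algorithm, which may not be the exact variant intended. It simulates a q x q process grid on one processor: at each step, grid position (r,c) multiplies the A block and B block it would hold after the initial alignment and the shifts so far. The names block_mac and cannon_schedule are hypothetical, and no message passing is shown.

```c
/* Multiply-accumulate one b x b block: C[r,c] += A[r,k] * B[k,c].
   All matrices are n x n, row-major, 0-based; r, k, c are block indices. */
static void block_mac(const double *A, const double *B, double *C,
                      int n, int b, int r, int k, int c)
{
    for (int i = r * b; i < (r + 1) * b; i++)
        for (int kk = k * b; kk < (k + 1) * b; kk++)
            for (int j = c * b; j < (c + 1) * b; j++)
                C[i * n + j] += A[i * n + kk] * B[kk * n + j];
}

/* Serial simulation of a Cannon-style schedule on a q x q grid.
   Assumes q divides n and that C has been zeroed by the caller. */
void cannon_schedule(const double *A, const double *B, double *C,
                     int n, int q)
{
    int b = n / q;  /* block size owned by each grid position */
    for (int step = 0; step < q; step++)
        for (int r = 0; r < q; r++)
            for (int c = 0; c < q; c++) {
                /* After the initial skew and `step` shifts, position
                   (r,c) holds A block (r,k) and B block (k,c). */
                int k = (r + c + step) % q;
                block_mac(A, B, C, n, b, r, k, c);
            }
}
```

In an MPI version, each grid position would be a separate process holding only its own blocks, and the index k would be realized by shifting A blocks along rows and B blocks along columns rather than by indexing into global arrays.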
Due date: Tuesday, 6 Nov 2001
How to turn in: Email me a C file containing
your code (called hw4.c) at ashoks@modi4.ncsa.uiuc.edu, by 3
pm on the due date. You should also plot two speed-up curves, (i)
with automatic parallelization and (ii) with OpenMP
parallelization. Please turn in hardcopies of these figures at the
beginning of class on 13 Nov 2001.
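Each point on a speed-up curve is the ratio of the one-processor run time to the run time on p processors. A small helper for computing those points from measured timings might look like the sketch below. The function name speedup is hypothetical, and whether the baseline is the serial code or the parallel code run on one processor is an assumption to settle before plotting.

```c
/* Speed-up for one data point: baseline (one-processor) time divided
   by the time measured on p processors. Both times must be in the
   same units and t_parallel must be positive. */
double speedup(double t_baseline, double t_parallel)
{
    return t_baseline / t_parallel;
}
```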
Note: (i) Your code should run on the Origin 2000 at NCSA.
The program: You will write a program in C to multiply two
matrices (C = A*B) and parallelize it (i) using automatic
parallelization, and (ii) using OpenMP directives. The code you turn
in should be the one with OpenMP directives.
You should hardcode the definition of the A and B matrices. Let
A be defined by A(i,j) = 0.5*i+j and B by B(i,j) = i+0.5*j.
(The first element is indexed (1,1), rather than (0,0).) Unlike with
the previous homework, you will not have a separate file called f.c.
Just write the definition of A and B inside your initialization loop.
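Since C arrays are 0-based while the definitions are 1-based, the initialization loop has to shift the indices by one. A minimal sketch, assuming flat row-major arrays and a hypothetical function name init_matrices:

```c
/* Fill n x n row-major arrays a and b. Loop indices are 0-based,
   so (i+1, j+1) is the 1-based position used in the definitions. */
void init_matrices(double *a, double *b, int n)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            a[i * n + j] = 0.5 * (i + 1) + (j + 1);  /* A = 0.5*i + j */
            b[i * n + j] = (i + 1) + 0.5 * (j + 1);  /* B = i + 0.5*j */
        }
}
```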
Your code should take a command line argument N. A, B, and
C will then be N x N matrices. Your program should output
C in row major order to stdout, with each row separated
by a '\n'. (Unlike with the previous homework, you should not assume
that the number of threads is a perfect square, or that the square
root of that number divides N.) You will be graded
on the performance of your algorithm too. Note that you may need to
use a one-dimensional implementation of the matrices to get good
performance, especially with automatic parallelization.
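As a rough illustration of the one-dimensional layout, the sketch below stores each matrix in a flat row-major array and parallelizes the outer loop with a single OpenMP directive. It assumes a C99 compiler and is not tuned. The function names, the loop order, and the space used to separate entries within a row are all assumptions rather than requirements of the assignment.

```c
#include <stdio.h>

/* c = a * b for n x n row-major matrices stored in flat arrays.
   Each thread handles whole rows of c, so no two threads write the
   same element. The pragma is ignored when OpenMP is not enabled. */
void matmul_flat(const double *a, const double *b, double *c, int n)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++)
            c[i * n + j] = 0.0;
        for (int k = 0; k < n; k++) {
            double aik = a[i * n + k];
            for (int j = 0; j < n; j++)   /* unit-stride access to b and c */
                c[i * n + j] += aik * b[k * n + j];
        }
    }
}

/* Print c in row-major order, one row per line. */
void print_matrix(const double *c, int n)
{
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++)
            printf("%g ", c[i * n + j]);
        printf("\n");
    }
}
```

Because the work is divided by rows, this sketch does not need the number of threads to be a perfect square or to divide N.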