Recently, I have implemented 3 different ways of multi-threaded matrix multiplication. There are 3 ways of thinking when writing a parallel program: –
- Input Decomposition
- Output Decomposition
- Intermediate Decomposition
We want to create matrix multiplication (3 x 3) program in multi-threaded way.
Input: Matrix A, B and each one of them is 3×3 size.
Output: Matrix C which is the resultant of matrix A * B
There are several way when multiply 2 matrices, one of them is Block Matrix, on which you divide the matrix to sub-matrices (under some constraints), then multiplying the sub-matrices and finally sum them up in matrix C.
Here, the matrices are 3×3, so, we can divide them to A = [A1, A2, A3] transpose, B = [B1, B2, B3], where A1, A2, A3 are of size 3×1 and B1, B2, B3 are of size 1×3. Each thread calculates A1*B1, A2*B2, A3*B3. The result of these multiplications is 3×3 matrix.
The only thing remaining now is add the 3 3×3 matrices together to get a matrix C.
We know that the elements of each cell in the matrix C rows depends on a row of matrix A and the matrix B columns. So, we make each thread calculates the row values of matrix C. And the input of each thread is a row from matrix A and the whole matrix B.
It is similar to Input Decomposition, the only difference here is that every thread will return a unique matrix C (intermediate matrix). In other words, every thread will return an intermediate matrix with size 3×3. So, we will have 3 temp matrix of size 3×3, then the matrix C = (thread_1_matrix_C + thread_2_matrix_C, thread_3_matrix_C)
Here https://github.com/AhmedHani/CS617-HPC/tree/master/Assignments/parallel_matrix_multiplication/multi-threaded_matrix_multiplication is the source code that implements the 3 techniques
The input is: A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]], B = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
The output: C = A * B
– Algorithms and Parallel Computing by Fayez Gebali
– Introduction to Parallel Computing, Second Edition by Ananth Grama, Anshul Gupta, George Karypis, Vipin Kumar