In: Computer Science
Strassen’s algorithm for matrix multiplication relies on using 7 multiplications (instead of 8 as in the straightforward algorithm) to multiply a pair of 2 by 2 matrices. Explain why it is not possible to further reduce this number, 7, to anything less than 4, in multiplying a pair of 2 by 2 matrices.
The multiplication of 4 by 4 matrices A and B is equivalent to the multiplication of a pair of 2 by 2 matrices whose elements are each 2 by 2 matrices.
In a four by four matrix the indices on row or column range from 1 to 4 In a two by two matrix of two by two matrices there are also four indices on row or column but they are called 11, 12, 21 and 22, the first index here representing the index of one 2 by 2 matrix and the second of the other.
Forming a matrix product is taking a dot product of row components of one with column components of the other. It does not matter in the slightest how you name the components; with either kind of index you multiply each row component of the first by the corresponding column component of the second. In one case this looks like
aj1 * b1k + aj2 * b2k + aj3 * b3k + aj4 * b4k.
while with the other names it looks like
ajk,11 * b11,st + ajk.12 *b12,st + ajk,21 * b21,st + ajk,22 * b22,st .
but who cares? The net result is the same. And as a matter of fact, given n by n matrices A and B for which n is factorable into r and s (n=r*s) you can similarly write them as r by r matrices whose components are each s by s matrices in the same general way without changing the definition of their matrix multiplication at all.
Explicitly, in the 4 by 4 case, if we describe A and B as
, and
Then their matrix product is exactly the same as the product of the matrices
and ,
where we have
, , , and .
and the same kind of thing for B.
Now suppose we have matrices matrices whose dimensions are 2k by 2k. We can represent the indices either as integers from 1 to 2k or as k-tuples, each element of which is 1 or 2. (this is very much like using a binary representation of our indices except that we use 1 instead of 0 and 2 instead of 1 and the all 1’s index here corresponds to 1 while the all 0’s in binary corresponds to 0)
void multiply(int A[][N], int B[][N], int C[][N])
{
for (int i = 0; i < N; i++)
{
for (int j = 0; j < N;
j++)
{
C[i][j] =
0;
for (int k = 0;
k < N; k++)
{
C[i][j] += A[i][k]*B[k][j];
}
}
}
}