Thursday, September 19, 2024
HomeMatlabWhat Is Quick Matrix Multiplication? – Nick Higham

# What Is Quick Matrix Multiplication? – Nick Higham

The definition of matrix multiplication says that for $ntimes n$ matrices $A$ and $B$, the product $C = AB$ is given by $c_{ij} = sum_{k=1}^n a_{ik}b_{kj}$. Every component of $C$ is an inside product of a row of $A$ and a column of $B$, so if this components is used then the price of forming $C$ is $n^2(2n-1)$ additions and multiplications, that’s, $O(n^3)$ operations. For over a century after the event of matrix algebra within the 1850s by Cayley, Sylvester and others, all strategies for matrix multiplication had been primarily based on this components and required $O(n^3)$ operations.

In 1969 Volker Strassen confirmed that when $n=2$ the product may be computed from the formulation

notag begin{gathered} p_1 =(a_{11}+a_{22})(b_{11}+b_{22}), p_2=(a_{21}+a_{22})b_{11}, qquad p_3=a_{11}(b_{12}-b_{22}), p_4=a_{22}(b_{21}-b_{11}), qquad p_5=(a_{11}+a_{12})b_{22}, p_6=(a_{21}-a_{11})(b_{11}+b_{12}), qquad p_7=(a_{12}-a_{22})(b_{21}+b_{22}), noalign{smallskip} C = begin{bmatrix} p_1+p_4-p_5+p_7 & p_3+p_5 p_2+p_4 & p_1+p_3-p_2+p_6 end{bmatrix}. end{gathered}

The analysis requires $7$ multiplications and $18$ additions as an alternative of $8$ multiplications and $4$ additions for the standard formulation.

At first sight, Strassen’s formulation might seem merely to be a curiosity. Nevertheless, the formulation don’t depend on commutativity so are legitimate when the $a_{ij}$ and $b_{ij}$ are matrices, by which case for big dimensions the saving of 1 multiplication tremendously outweighs the additional $14$ additions. Assuming $n$ is an influence of $2$, we are able to partition $A$ and $B$ into 4 blocks of dimension $n/2$, apply Strassen’s formulation for the multiplication, after which apply the identical formulation recursively on the half-sized matrix merchandise.

Allow us to look at the variety of multiplications for the recursive Strassen algorithm. Denote by $M(k)$ the variety of scalar multiplications required to multiply two $2^k times 2^k$ matrices. We’ve got $M(k) = 7M(k-1)$, so

$M(k) = 7M(k-1) = 7^2M(k-2) = cdots = 7^k M(0) = 7^k.$

However $7^k = 2^{log_2{7^k}} = (2^k)^{log_2 7} = n^{log_2 7} = n^{2.807dots}$. The variety of additions may be proven to be of the identical order of magnitude, so the algorithm requires $O(n^{log_27})=O(n^{2.807dots})$ operations.

Strassen’s work sparked curiosity to find matrix multiplication algorithms of even decrease complexity. Since there are $O(n^2)$ components of knowledge, which should every take part in at the very least one operation, the exponent of $n$ within the operation depend have to be at the very least $2$.

The present document higher certain on the exponent is $2.37286$, proved by Alman and Vassilevska Williams (2021) which improved on the earlier document of $2.37287$, proved by Le Gall (2014) The next determine plots the very best higher certain for the exponent for matrix multiplication over time.

Within the strategies that obtain exponents decrease than 2.775, numerous intricate methods are used, primarily based on representing matrix multiplication by way of bilinear or trilinear varieties and their illustration as tensors having low rank. Laderman, Pan, and Sha (1993) clarify that for these strategies “very giant overhead constants are hidden within the `$O$‘ notation”, and that the strategies “enhance on Strassen’s (and even the classical) algorithm just for immense numbers $N$.”

Strassen’s methodology, when rigorously applied, may be sooner than standard matrix multiplication for cheap dimensions. In apply, one doesn’t recur all the way down to $1times 1$ matrices, however moderately makes use of standard multiplication as soon as $n_0times n_0$ matrices are reached, the place the parameter $n_0$ is tuned for the very best efficiency.

Strassen’s methodology has the downside that it satisfies a weaker type of rounding error certain that standard multiplication. For standard multiplication $C = AB$ of $Ainmathbb{R}^{mtimes n}$ and $Binmathbb{R}^{ntimes p}$ we now have the componentwise certain

$notag |C - widehat{C}| le gamma^{}_n |A| @ |B|, qquad(1)$

the place $gamma^{}_n = nu/(1-nu)$ and $u$ is the unit roundoff. For Strassen’s methodology we now have solely a normwise error certain. The next outcome makes use of the norm $|A| = max_{i,j} |a_{ij}|$, which isn’t a constant norm.

Theorem 1 (Brent). Let $A,B in mathbb{R}^{ntimes n}$, the place $n=2^k$. Suppose that $C=AB$ is computed by Strassen’s methodology and that $n_0=2^r$ is the brink at which standard multiplication is used. The computed product $widehat{C}$ satisfies

$notag |C - widehat{C}| le left[ Bigl( displaystylefrac{n}{n_0} Bigr)^{log_2{12}} (n_0^2+5n_0) - 5n right] u |A|, |B| + O(u^2). qquad(2)$

With full recursion ($n_0=1$) the fixed in (2) is $6 n^{log_2 12}-5n approx 6 n^{3.585}-5n$, whereas with only one stage of recursion ($n_0 = n/2$) it’s $3n^2+25n$. These evaluate with $n^2u + O(u^2)$ for standard multiplication (obtained by taking norms in (1)). So the fixed for Strassen’s methodology grows at a sooner price than that for standard multiplication it doesn’t matter what the selection of $n_0$.

The truth that Strassen’s methodology doesn’t fulfill a componentwise error certain is a big weak point of the strategy. Certainly Strassen’s methodology can not even precisely multiply by the identification matrix. The product

$notag C = begin{bmatrix} 1 & 0 0 & 1 end{bmatrix} begin{bmatrix} 1 & epsilon epsilon & epsilon^2 end{bmatrix}, quad 0 < epsilon ll 1$

is evaluated precisely in floating-point arithmetic by standard multiplication, however Strassen’s methodology computes

$c_{22} = 2(1+epsilon^2) + (epsilon-epsilon^2) - 1 - (1+epsilon).$

As a result of $c_{22}$ includes subterms of order unity, the error $c_{22} - widehat c_{22}$ will likely be of order $u$. Thus the relative error $|c_{22} - widehat c_{22}| / |c_{22}| = O(u/epsilon^2) gg O(u)$,

One other weak point of Strassen’s methodology is that whereas the scaling $AB to (AD) (D^{-1}B)$, the place $D$ is diagonal, leaves (1) unchanged, it may well alter (2) by an arbitrary quantity. Dumitrescu (1998) suggests computing $D_1(D_1^{-1}Acdot B D_2^{-1})D_2$, the place the diagonal matrices $D_1$ and $D_2$ are chosen to equilibrate the rows of $A$ and the columns of $B$ within the $infty$-norm; he reveals that this scaling can enhance the accuracy of the outcome. Additional investigations alongside these strains are made by Ballard et al. (2016).

Ought to one use Strassen’s methodology in apply, assuming that an implementation is obtainable that’s sooner than standard multiplication? Not if one wants a componentwise error certain, which ensures correct merchandise of matrices with nonnegative entries and ensures that the column scaling of $A$ and row scaling of $B$ has no impact on the error. But when a normwise error certain with a sooner rising fixed than for standard multiplication is suitable then the strategy is price contemplating.

## Notes

For current work on high-performance implementation of Strassen’s methodology see Huang et al. (2016, 2020).

Theorem 1 is from an unpublished technical report of Brent (1970). A proof may be present in Higham (2002, §23.2.2).

For extra on quick matrix multiplication see Bini (2014) and Higham (2002, Chapter 23).

## References

It is a minimal set of references, which comprise additional helpful references inside.

This text is a part of the “What Is” sequence, out there from https://nhigham.com/class/what-is and in PDF kind from the GitHub repository https://github.com/higham/what-is.

RELATED ARTICLES