Sunday, October 2, 2022

What Is Fast Matrix Multiplication? – Nick Higham
The definition of matrix multiplication says that for $n\times n$ matrices $A$ and $B$, the product $C = AB$ is given by $c_{ij} = \sum_{k=1}^n a_{ik}b_{kj}$. Every element of $C$ is an inner product of a row of $A$ and a column of $B$, so if this formula is used then the cost of forming $C$ is $n^2(2n-1)$ additions and multiplications, that is, $O(n^3)$ operations. For over a century after the development of matrix algebra in the 1850s by Cayley, Sylvester and others, all methods for matrix multiplication were based on this formula and required $O(n^3)$ operations.
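The formula translates directly into code. A minimal Python sketch (the function name is mine), using plain lists rather than an optimized BLAS routine:

```python
def matmul_conventional(A, B):
    """Multiply n-by-n matrices A and B by the definition
    c_ij = sum_k a_ik * b_kj."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Inner product of row i of A with column j of B:
            # n multiplications and n - 1 additions per entry.
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C
```

The three nested loops make the $O(n^3)$ cost explicit: $n^2$ entries, each costing $2n-1$ operations.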

In 1969 Volker Strassen showed that when $n=2$ the product can be computed from the formulas

\[\begin{gathered}
p_1 = (a_{11}+a_{22})(b_{11}+b_{22}), \qquad p_2 = (a_{21}+a_{22})b_{11}, \qquad p_3 = a_{11}(b_{12}-b_{22}),\\
p_4 = a_{22}(b_{21}-b_{11}), \qquad p_5 = (a_{11}+a_{12})b_{22}, \qquad p_6 = (a_{21}-a_{11})(b_{11}+b_{12}),\\
p_7 = (a_{12}-a_{22})(b_{21}+b_{22}),\\
C = \begin{bmatrix} p_1+p_4-p_5+p_7 & p_3+p_5\\ p_2+p_4 & p_1+p_3-p_2+p_6 \end{bmatrix}.
\end{gathered}\]

The evaluation requires 7 multiplications and 18 additions instead of 8 multiplications and 4 additions for the usual formula.
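The seven products are easy to check numerically. A small Python sketch of the $2\times 2$ formulas (the function name is mine):

```python
def strassen_2x2(A, B):
    """Multiply 2-by-2 matrices via Strassen's seven products p1,...,p7."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    p1 = (a11 + a22) * (b11 + b22)
    p2 = (a21 + a22) * b11
    p3 = a11 * (b12 - b22)
    p4 = a22 * (b21 - b11)
    p5 = (a11 + a12) * b22
    p6 = (a21 - a11) * (b11 + b12)
    p7 = (a12 - a22) * (b21 + b22)
    # Reassemble C from the seven products: 7 multiplications, 18 additions.
    return [[p1 + p4 - p5 + p7, p3 + p5],
            [p2 + p4, p1 + p3 - p2 + p6]]
```

The result agrees with the conventional product for any $2\times 2$ inputs.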

At first sight, Strassen's formulas may appear to be merely a curiosity. However, the formulas do not rely on commutativity, so they remain valid when the $a_{ij}$ and $b_{ij}$ are matrices, in which case for large dimensions the saving of one multiplication greatly outweighs the extra 14 additions. Assuming $n$ is a power of 2, we can partition $A$ and $B$ into four blocks of size $n/2$, apply Strassen's formulas for the multiplication, and then apply the same formulas recursively on the half-sized matrix products.

Let us examine the number of multiplications for the recursive Strassen algorithm. Denote by $M(k)$ the number of scalar multiplications required to multiply two $2^k \times 2^k$ matrices. We have $M(k) = 7M(k-1)$, so

\[ M(k) = 7M(k-1) = 7^2M(k-2) = \cdots = 7^k M(0) = 7^k. \]

But $7^k = 2^{\log_2 7^k} = (2^k)^{\log_2 7} = n^{\log_2 7} = n^{2.807\dots}$. The number of additions can be shown to be of the same order of magnitude, so the algorithm requires $O(n^{\log_2 7}) = O(n^{2.807\dots})$ operations.
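The recurrence and the resulting exponent can be checked in a few lines of Python (the function name is mine):

```python
import math

def strassen_mult_count(k):
    """M(k): scalar multiplications to multiply 2^k-by-2^k matrices by
    Strassen's method with full recursion; M(k) = 7 M(k-1), M(0) = 1."""
    return 1 if k == 0 else 7 * strassen_mult_count(k - 1)

# For n = 2^k, the count 7^k equals n^{log2 7}, i.e. roughly n^2.807.
k = 10
n = 2 ** k
assert strassen_mult_count(k) == 7 ** k
assert math.isclose(7 ** k, n ** math.log2(7))
```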

Strassen's work sparked interest in finding matrix multiplication algorithms of even lower complexity. Since there are $O(n^2)$ elements of data, each of which must take part in at least one operation, the exponent of $n$ in the operation count must be at least 2.

The current record upper bound on the exponent is 2.37286, proved by Alman and Vassilevska Williams (2021), which improved on the previous record of 2.37287, proved by Le Gall (2014). The following figure plots the best upper bound for the exponent of matrix multiplication over time.


In the methods that achieve exponents lower than 2.775, various intricate techniques are used, based on representing matrix multiplication in terms of bilinear or trilinear forms and their representation as tensors having low rank. Laderman, Pan, and Sha (1993) explain that for these methods "very large overhead constants are hidden in the `O' notation", and that the methods "improve on Strassen's (and even the classical) algorithm only for immense numbers $N$."

Strassen's method, when carefully implemented, can be faster than conventional matrix multiplication for reasonable dimensions. In practice, one does not recur down to $1\times 1$ matrices, but rather uses conventional multiplication once $n_0\times n_0$ matrices are reached, where the parameter $n_0$ is tuned for the best performance.
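The recursive method with a threshold might be sketched as follows in Python with NumPy (the function name and default threshold are my own choices; a real implementation would also handle padding and cache blocking):

```python
import numpy as np

def strassen(A, B, n0=64):
    """Strassen's method for n-by-n matrices with n a power of 2,
    switching to conventional multiplication at the threshold n0."""
    n = A.shape[0]
    if n <= n0:
        return A @ B  # conventional multiplication at the base level
    m = n // 2
    A11, A12, A21, A22 = A[:m, :m], A[:m, m:], A[m:, :m], A[m:, m:]
    B11, B12, B21, B22 = B[:m, :m], B[:m, m:], B[m:, :m], B[m:, m:]
    # Seven half-sized products, each computed recursively.
    P1 = strassen(A11 + A22, B11 + B22, n0)
    P2 = strassen(A21 + A22, B11, n0)
    P3 = strassen(A11, B12 - B22, n0)
    P4 = strassen(A22, B21 - B11, n0)
    P5 = strassen(A11 + A12, B22, n0)
    P6 = strassen(A21 - A11, B11 + B12, n0)
    P7 = strassen(A12 - A22, B21 + B22, n0)
    return np.block([[P1 + P4 - P5 + P7, P3 + P5],
                     [P2 + P4, P1 + P3 - P2 + P6]])
```

Timing this against `A @ B` for various `n0` illustrates why the threshold must be tuned: for small blocks the 18 extra additions dominate.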

Strassen's method has the drawback that it satisfies a weaker form of rounding error bound than conventional multiplication. For conventional multiplication $C = AB$ of $A\in\mathbb{R}^{m\times n}$ and $B\in\mathbb{R}^{n\times p}$ we have the componentwise bound

\[ |C - \widehat{C}| \le \gamma_n |A|\, |B|, \qquad(1) \]

where $\gamma_n = nu/(1-nu)$ and $u$ is the unit roundoff. For Strassen's method we have only a normwise error bound. The following result uses the norm $\|A\| = \max_{i,j} |a_{ij}|$, which is not a consistent norm.

Theorem 1 (Brent). Let $A,B \in \mathbb{R}^{n\times n}$, where $n=2^k$. Suppose that $C=AB$ is computed by Strassen's method and that $n_0=2^r$ is the threshold at which conventional multiplication is used. The computed product $\widehat{C}$ satisfies

\[ \|C - \widehat{C}\| \le \left[ \Bigl( \frac{n}{n_0} \Bigr)^{\log_2 12} (n_0^2+5n_0) - 5n \right] u\, \|A\|\, \|B\| + O(u^2). \qquad(2) \]

With full recursion ($n_0=1$) the constant in (2) is $6 n^{\log_2 12}-5n \approx 6 n^{3.585}-5n$, whereas with just one level of recursion ($n_0 = n/2$) it is $3n^2+25n$. These compare with $n^2 u + O(u^2)$ for conventional multiplication (obtained by taking norms in (1)). So the constant for Strassen's method grows at a faster rate than that for conventional multiplication whatever the choice of $n_0$.
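The constants are easy to compare numerically. A small Python sketch (the function name is mine) evaluating the constant in the bound for a sample $n$:

```python
import math

def strassen_error_constant(n, n0):
    """Constant in the normwise bound (2):
    (n/n0)^{log2 12} * (n0^2 + 5 n0) - 5 n."""
    return (n / n0) ** math.log2(12) * (n0**2 + 5 * n0) - 5 * n

n = 1024
full = strassen_error_constant(n, 1)            # 6 n^{log2 12} - 5 n
one_level = strassen_error_constant(n, n // 2)  # 3 n^2 + 25 n
conventional = n**2                             # constant from bound (1)
assert math.isclose(one_level, 3 * n**2 + 25 * n)
assert full > one_level > conventional
```

The ordering `full > one_level > conventional` reflects the text: the fewer levels of recursion, the closer the constant is to that of conventional multiplication.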

The fact that Strassen's method does not satisfy a componentwise error bound is a significant weakness of the method. Indeed Strassen's method cannot even accurately multiply by the identity matrix. The product

\[ C = \begin{bmatrix} 1 & 0\\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & \epsilon\\ \epsilon & \epsilon^2 \end{bmatrix}, \quad 0 < \epsilon \ll 1 \]

is evaluated exactly in floating-point arithmetic by conventional multiplication, but Strassen's method computes

\[ c_{22} = 2(1+\epsilon^2) + (\epsilon-\epsilon^2) - 1 - (1+\epsilon). \]

Because $c_{22}$ involves subterms of order unity, the error $c_{22} - \widehat{c}_{22}$ will be of order $u$. Thus the relative error $|c_{22} - \widehat{c}_{22}| / |c_{22}| = O(u/\epsilon^2) \gg O(u)$, so $\widehat{c}_{22}$ has no relative accuracy for small $\epsilon$.
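The loss of relative accuracy is easy to reproduce. A Python sketch of this example (variable names are mine):

```python
# Multiplying the identity by [[1, eps], [eps, eps^2]]: conventional
# multiplication is exact, but Strassen's formula for c22 adds and
# subtracts terms of order 1, so the tiny true value c22 = eps^2 is
# swamped by rounding error of order u ~ 1.1e-16.
eps = 1e-10
a11, a12, a21, a22 = 1.0, 0.0, 0.0, 1.0
b11, b12, b21, b22 = 1.0, eps, eps, eps**2

c22_conventional = a21 * b12 + a22 * b22  # exact: eps^2
p1 = (a11 + a22) * (b11 + b22)            # 2(1 + eps^2)
p2 = (a21 + a22) * b11                    # 1
p3 = a11 * (b12 - b22)                    # eps - eps^2
p6 = (a21 - a11) * (b11 + b12)            # -(1 + eps)
c22_strassen = p1 + p3 - p2 + p6

rel_err = abs(c22_strassen - eps**2) / eps**2  # O(u / eps^2)
assert c22_conventional == eps**2              # conventional result is exact
assert rel_err > 0.1                           # Strassen loses all relative accuracy
```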

Another weakness of Strassen's method is that while the scaling $AB \to (AD)(D^{-1}B)$, where $D$ is diagonal, leaves (1) unchanged, it can alter (2) by an arbitrary amount. Dumitrescu (1998) suggests computing $D_1(D_1^{-1}A\cdot B D_2^{-1})D_2$, where the diagonal matrices $D_1$ and $D_2$ are chosen to equilibrate the rows of $A$ and the columns of $B$ in the $\infty$-norm; he shows that this scaling can improve the accuracy of the result. Further investigations along these lines are made by Ballard et al. (2016).
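The scaling idea can be sketched as follows in Python with NumPy (the function name is mine; this illustrates the idea rather than reproducing Dumitrescu's exact algorithm):

```python
import numpy as np

def scaled_product(A, B, multiply):
    """Compute A B as D1 (D1^{-1} A . B D2^{-1}) D2, where D1 equilibrates
    the rows of A and D2 the columns of B in the infinity-norm.
    `multiply` can be any multiplication routine, e.g. a Strassen code."""
    d1 = np.abs(A).max(axis=1)  # row inf-norms of A
    d2 = np.abs(B).max(axis=0)  # column inf-norms of B
    d1[d1 == 0] = 1.0           # guard against zero rows
    d2[d2 == 0] = 1.0           # guard against zero columns
    C_scaled = multiply(A / d1[:, None], B / d2[None, :])
    return d1[:, None] * C_scaled * d2[None, :]
```

With `multiply` set to a Strassen routine, the inner product is formed on equilibrated matrices, which limits how badly the normwise bound (2) can be affected by bad row or column scaling.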

Should one use Strassen's method in practice, assuming that an implementation is available that is faster than conventional multiplication? Not if one needs a componentwise error bound, which guarantees accurate products of matrices with nonnegative entries and ensures that column scaling of $A$ and row scaling of $B$ have no effect on the error. But if a normwise error bound with a faster growing constant than for conventional multiplication is acceptable, then the method is worth considering.


For recent work on high-performance implementation of Strassen's method see Huang et al. (2016, 2020).

Theorem 1 is from an unpublished technical report of Brent (1970). A proof can be found in Higham (2002, §23.2.2).

For more on fast matrix multiplication see Bini (2014) and Higham (2002, Chapter 23).


This is a minimal set of references; each contains further useful references within.

This article is part of the "What Is" series, available from and in PDF form from the GitHub repository.


