Why do Mathematica and Python's answers differ when dealing with singular matrix equations?


I have been dealing with linear algebra problems of the form A = Bx in Python and comparing this to a colleague's code in MATLAB and Mathematica. We have noticed differences between Python and the others when B is a singular matrix. When using numpy.linalg.solve() I throw a singular matrix error, so I've instead implemented .pinv() (the Moore Penrose pseudo inverse).

I understand that storing the inverse is computationally inefficient and am first of all curious if there's a better way of dealing with singular matrices in Python. However the thrust of my question lies in how Python chooses an answer from an infinite solution space, and why it chooses a different one than MATLAB and Mathematica do.

Here is my toy problem:

B = np.array([[2,4,6],[1,0,3],[0,7,0]])
A = np.array([[12],[4],[7]])
BI = linalg.pinv(B)
x = BI.dot(A)

The answer that Python outputs to me is:

[[ 0.4]
[ 1. ]
[ 1.2]]

While this is certainly a correct answer, it isn't the one I had intended: (1,1,1). Why does Python generate this particular solution? Is there a way to return the space of solutions rather than one possible solution? My colleague's code returned (1, 1, 1) - is there a reason that Python is different from Mathematica and MATLAB?

Show source
| numpy   | python   | matrix   | matlab   | linear-algebra   2016-10-25 01:10 1 Answers

Answers ( 1 )

  1. 2016-10-25 02:10

    In short, your code (and apparently np.linalg.lstsq) uses the Moore-Penrose pseudoinverse, which is implemented in np.linalg.pinv. MATLAB and Mathematica likely use Gaussian elimination to solve the system. We can replicate this latter approach in Python using the LU decomposition:

    B = np.array([[2,4,6],[1,0,3],[0,7,0]])
    y = np.array([[12],[4],[7]])
    P, L, U = scipy.linalg.lu(B)

    This decomposes B as B = P L U, where U is now an upper-diagonal matrix, and P L is invertible. In particular, we find:

    >>> U
    array([[ 2.,  4.,  6.],
           [ 0.,  7.,  0.],
           [ 0.,  0.,  0.]])


    >>> np.linalg.inv(P @ L) @ y
    array([[ 12.],
           [  7.],
           [  0.]])

    The goal is to solve this under-determined, transformed problem, U x = (P L)^{-1} y. The solution set is the same as our original problem. Let a solution be written as x = (x_1, x_2, x_3). Then we immediately see that any solution must have x_2 = 1. Then we must have 2 x_1 + 4 + 6 x_2 = 12. Solving for x_1, we get x_1 = 4 - 3 x_2. And so any solution is of the form (4 - 3 x_2, 1, x_2).

    The easiest way to generate a solution for the above is to simply choose x_2 = 1. Then x_1 = 1, and you recover the solution that MATLAB gives you: (1, 1, 1).

    On the other hand, np.linalg.pinv computes the Moore-Penrose pseudoinverse, which is the unique matrix satisfying the pseudionverse properties for B. The emphasis here is on unique. Therefore, when you say:

    my question lies in how Python chooses an answer from an infinite solution space

    the answer is that all of the choosing is actually done by you when you use the pseudoinverse, because np.linalg.pinv(B) is a unique matrix, and hence np.linalg.pinv(B) @ y is unique.

    To generate the full set of solutions, see the comment above by @ali_m.

◀ Go back