I am playing with a simple numpy example and having a hard time understanding why the associative property of matrix multiplication
ABC = (AB)C = A(BC)
does not hold exactly. I assume the problem is numerical stability, but what exactly is the issue, and how can I address it?
Here is my example with linear regression. I use the sklearn solution because it gives a larger divergence between the two groupings:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    np.random.seed(42)

    num_samples = 100
    M = 1000
    sigma = 0.5

    X = np.random.binomial(2, 0.4, (num_samples, M))
    beta = np.zeros(M)
    beta = 1.0

    y = X.dot(beta) + sigma*np.random.randn(num_samples)

    # standardise y
    y = y - np.mean(y)
    y = y/np.std(y)

    # center and standardise X
    Xc = X - X.mean(axis=0)
    xstd = X.std(axis=0)
    mask = xstd > 1e-12
    Xc = Xc[:, mask]

    lr = LinearRegression()
    lr.fit(Xc, y)
    beta_hat_sklearn = lr.coef_

    beta_hat_sklearn.T @ Xc.T @ Xc @ beta_hat_sklearn / num_samples
    # equivalent for Python < 3.5
    beta_hat_sklearn.T.dot(Xc.T).dot(Xc).dot(beta_hat_sklearn) / num_samples
    # 1.0000000000000009

    beta_hat_sklearn.T @ (Xc.T @ Xc) @ beta_hat_sklearn / num_samples
    # equivalent for Python < 3.5
    beta_hat_sklearn.T.dot(Xc.T.dot(Xc)).dot(beta_hat_sklearn) / num_samples
    # 0.89517439485479278

It might be a macOS-specific bug.
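
For comparison, here is a minimal sketch of the size of discrepancy I would expect from floating-point rounding alone. It is not part of my script above: `grouping_gap` is a hypothetical helper name, and `A` and `v` are random stand-ins for `Xc` and `beta_hat_sklearn`, so it only illustrates the two groupings, not my actual data.

```python
import numpy as np


def grouping_gap(A, v):
    """Relative difference between two groupings of v^T A^T A v."""
    # Python evaluates v @ A.T @ A @ v strictly left to right.
    left_to_right = ((v @ A.T) @ A) @ v
    # The alternative grouping forms the Gram matrix A^T A first.
    gram_first = (v @ (A.T @ A)) @ v
    scale = max(abs(left_to_right), abs(gram_first))
    return abs(left_to_right - gram_first) / scale


# Scalar addition is already not associative in floating point.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 1000))  # stand-in with the same shape as X
v = rng.standard_normal(1000)         # stand-in for the coefficient vector
print(grouping_gap(A, v))             # I would expect a very small number here
```

If rounding were the only cause, I would expect the gap in my regression example to be similarly small, not a difference in the first decimal place as above.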