I have a case in causal mediation analysis and I want to estimate the treatment effect using GMM (Generalized Method of Moments). I used this notebook as a reference: https://github.com/josef-pkt/misc/blob/master/notebooks/ex_gmm_gamma.ipynb and this question: "Issue with using statsmodels.sandbox.regression.gmm.GMM".
The following is my code.

    import numpy as np
    import pandas as pd
    from statsmodels.sandbox.regression.gmm import GMM


    class GMMAB(GMM):

        def __init__(self, *args, **kwds):
            # set appropriate counts for moment conditions and parameters
            kwds.setdefault('k_moms', 6)
            kwds.setdefault('k_params', 6)
            super(GMMAB, self).__init__(*args, **kwds)

        def momcond(self, params):
            c = params
            y, m = self.endog.T  # [y, m]
            x = self.exog  # x
            # inst = self.instrument
            # moments for the mediator equation: m = c[1] + c[0]*x
            g1 = m - c[1] - c[0]*x
            g2 = x*(m - c[1] - c[0]*x)
            # moments for the outcome equation:
            # y = c[2] + c[3]*x + c[4]*m + c[5]*m*x
            g3 = y - c[2] - c[3]*x - c[4]*m - c[5]*m*x
            g4 = x*(y - c[2] - c[3]*x - c[4]*m - c[5]*m*x)
            g5 = m*(y - c[2] - c[3]*x - c[4]*m - c[5]*m*x)
            g6 = m*x*(y - c[2] - c[3]*x - c[4]*m - c[5]*m*x)
            g = np.column_stack((g1, g2, g3, g4, g5, g6))
            return g


    dta = pd.read_csv('mediation_data.csv')
    y = np.array(dta.y)
    m = np.array(dta.m)
    s = np.array(dta[['y', 'm']])
    x = np.array(dta.x)

    model = GMMAB(endog=s, exog=x, instrument=x, k_moms=6, k_params=6)
    beta0 = np.array([0.1, 0.1, 0.1, 0.1, 0.1, 0.1])
    model.fit(beta0, maxiter=2, weights_method='hac', optim_method='nm')

The program always crashes during model fitting, as `y`, `m`, and `x` are all large arrays (more than 100,000 observations each), and I do not know how to improve the code.
Could you please give me some suggestions, such as GMM settings that would let this code run on a large dataset, or more numerically efficient GMM code for this problem?
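
For comparison, the six moment conditions in `momcond` are linear in the parameters and there are as many moments as parameters, so setting the sample moments to zero amounts to two least-squares problems: `m` on `[x, 1]` and `y` on `[1, x, m, m*x]`. A minimal sketch of that direct solution follows. It uses synthetic data in place of `mediation_data.csv`, and the helper name `solve_moments` and the simulated coefficients are assumptions for illustration only.

```python
import numpy as np


def solve_moments(y, m, x):
    """Solve the six sample moment equations directly (hypothetical helper).

    Returns c in the same order as momcond:
    m = c[1] + c[0]*x  and  y = c[2] + c[3]*x + c[4]*m + c[5]*m*x.
    """
    ones = np.ones_like(x)
    # g1, g2 are the normal equations of m regressed on [x, 1]
    Xm = np.column_stack((x, ones))
    c01, *_ = np.linalg.lstsq(Xm, m, rcond=None)
    # g3..g6 are the normal equations of y regressed on [1, x, m, m*x]
    Xy = np.column_stack((ones, x, m, m * x))
    c25, *_ = np.linalg.lstsq(Xy, y, rcond=None)
    return np.concatenate((c01, c25))


# Synthetic stand-in for the real data (assumed coefficients, not real results)
rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(size=n)
m = 0.5 + 0.8 * x + rng.normal(size=n)
y = 1.0 + 0.3 * x + 0.6 * m + 0.2 * m * x + rng.normal(size=n)

c = solve_moments(y, m, x)

# Evaluate the same six moments as momcond at the solution
r_m = m - c[1] - c[0] * x
r_y = y - c[2] - c[3] * x - c[4] * m - c[5] * m * x
g = np.column_stack((r_m, x * r_m, r_y, x * r_y, m * r_y, m * x * r_y))
max_abs_moment = np.abs(g.mean(axis=0)).max()
print(c)
print(max_abs_moment)
```

This only gives point estimates. HAC standard errors would still have to be computed separately, and the shortcut no longer applies if the system becomes overidentified.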