Article Excerpt
The cross-product term in moderated regression may be collinear with its constituent parts, making it difficult to detect main, simple, and interaction effects. The literature shows that mean-centering can reduce the covariance between the linear and the interaction terms, thereby suggesting that it reduces collinearity. We analytically prove that mean-centering changes neither the computational precision of parameters, nor the sampling accuracy of main effects, simple effects, and interaction effects, nor the [R.sup.2]. We also show that the determinants of the cross-product matrix X'X are identical for uncentered and mean-centered data, so the collinearity problem in the moderated regression is unchanged by mean-centering. Many empirical marketing researchers mean-center their moderated regression data hoping that this will improve the precision of estimates from ill-conditioned, collinear data, but unfortunately, this hope is futile. Therefore, researchers using moderated regression models should not mean-center in a specious attempt to mitigate collinearity between the linear and the interaction terms. Of course, researchers may wish to mean-center for interpretive purposes and other reasons.
Key words: moderated regression; mean-centering; collinearity
1. Introduction
Multiple regression models with interactions, also known as moderated models, are widely used in marketing and have been the subject of much scholarly discussion (Sharma et al. 1981, Irwin and McClelland 2001). The interaction (or moderator) effect in a moderated regression model is estimated by including a cross-product term as an additional exogenous variable as in
y = [[alpha]'.sub.1][x.sub.1] + [[alpha]'.sub.2][x.sub.2] + [x'.sub.1][[alpha].sub.3][x.sub.2] + [[alpha].sub.0] + [[alpha]'.sub.c][x.sub.c] + [epsilon], (1)
where [[alpha].sub.i] and [x.sub.i] are [k.sub.i] x 1 column vectors for i = 1, 2, [[alpha].sub.3] is a [k.sub.1] x [k.sub.2] matrix of coefficients that determine the interaction terms, and [x.sub.c] plays the role of other covariates that are not part of the moderated element. The moderator term, [x'.sub.1][[alpha].sub.3][x.sub.2], is likely to covary to some degree with the variable [x.sub.1] (and with the variable [x.sub.2]). This relationship has been interpreted as a form of multicollinearity, and collinearity makes it difficult to distinguish the separate effects of the linear and interaction terms involving [x.sub.1] and [x.sub.2].
In response to this problem, various researchers including Aiken and West (1991), Cronbach (1987), and Jaccard et al. (1990) recommend mean-centering the variables [x.sub.1] and [x.sub.2] as an approach to alleviating collinearity-related concerns. Mean-centering Equation (1) gives the following:
y = [[beta]'.sub.1]([x.sub.1] - [[bar.x].sub.1]) + [[beta]'.sub.2]([x.sub.2] - [[bar.x].sub.2]) + ([x.sub.1] - [[bar.x].sub.1])'[[beta].sub.3]([x.sub.2] - [[bar.x].sub.2]) + [[beta].sub.0] + [[beta]'.sub.c][x.sub.c] + [upsilon]. (2)
In comparison to Equation (1), the linear term [x.sub.1] - [[bar.x].sub.1] in Equation (2) will typically have smaller covariance with the interaction term because the multiplier of [x.sub.1] - [[bar.x].sub.1] in the interaction term, [[beta].sub.3]([x.sub.2] - [[bar.x].sub.2]), is zero on average.
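The covariance argument above can be illustrated numerically. The following is a minimal sketch for the scalar case, assuming two independent, normally distributed predictors with nonzero means; the distributions, sample size, and variable names are illustrative assumptions and do not come from the article.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical predictors with nonzero means (assumed: independent normals).
x1 = rng.normal(loc=5.0, scale=1.0, size=n)
x2 = rng.normal(loc=5.0, scale=1.0, size=n)

# Correlation between the linear term and the raw cross-product term.
r_raw = np.corrcoef(x1, x1 * x2)[0, 1]

# The same correlation after subtracting the sample means.
z1 = x1 - x1.mean()
z2 = x2 - x2.mean()
r_centered = np.corrcoef(z1, z1 * z2)[0, 1]

print(f"corr(x1, x1*x2), uncentered: {r_raw:.3f}")
print(f"corr(z1, z1*z2), centered:   {r_centered:.3f}")
```

Under these assumptions the uncentered correlation is substantial and the centered one is close to zero. This shows only that the pairwise correlation shrinks; it does not by itself show that estimation becomes any easier.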
This practice of mean-centering has become routine in the social sciences. It is common to see statements from marketing researchers such as, "we mean-centered all independent variables that constituted an interaction term to mitigate the potential threat of multicollinearity" (cf. Kopalle and Lehmann 2006). Can such a simple shift in the location of the origin really help us see the pattern between variables? We use a hypothetical example to suggest an answer to this question. Let the true model for the simulated data be y = [x.sub.1] + (1/2)[x.sub.1][x.sub.2] + [epsilon], where [epsilon] ~ N(0, 0.1). In Figure 1(a), we graph the relationship between y and uncentered ([x.sub.1], [x.sub.2]). In Figure 1(b), we see the relationship between y and mean-centered ([x.sub.1], [x.sub.2]). Obviously, the same pattern of data is seen in both graphs, since shifting the origin of the exogenous variables [x.sub.1] and [x.sub.2] does not change the relative position of any of the data points. Intuitive geometric sense tells us that looking for statistical patterns in the centered data will not be easier or harder than in the uncentered data.
[FIGURE 1 OMITTED]
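The geometric intuition can also be checked by fitting both specifications to data simulated from the hypothetical model. The sketch below makes several assumptions that the excerpt does not state: the predictors are drawn uniformly on [0, 10], the sample size is 200, and the 0.1 in N(0, 0.1) is read as a variance. The helper `ols` is a hypothetical name introduced for this illustration.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares: coefficients, standard errors, R-squared."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return coef, se, r2

rng = np.random.default_rng(1)
n = 200
# Assumed predictor distribution (the excerpt does not specify one).
x1 = rng.uniform(0.0, 10.0, size=n)
x2 = rng.uniform(0.0, 10.0, size=n)
# Assumption: the 0.1 in N(0, 0.1) is treated as a variance.
eps = rng.normal(0.0, np.sqrt(0.1), size=n)
y = x1 + 0.5 * x1 * x2 + eps

ones = np.ones(n)
z1, z2 = x1 - x1.mean(), x2 - x2.mean()
X_raw = np.column_stack([ones, x1, x2, x1 * x2])   # uncentered design
X_cen = np.column_stack([ones, z1, z2, z1 * z2])   # mean-centered design

coef_raw, se_raw, r2_raw = ols(X_raw, y)
coef_cen, se_cen, r2_cen = ols(X_cen, y)

print("interaction coefficient:", coef_raw[3], coef_cen[3])
print("interaction std. error: ", se_raw[3], se_cen[3])
print("R-squared:              ", r2_raw, r2_cen)
```

The sketch compares only the interaction term and the overall fit. The reported coefficients on the linear terms differ between the two fits because they are simple effects evaluated at different values of the other variable (zero versus its mean), not because one fit is more precise.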
In this paper, we will demonstrate analytically that the geometric intuition is correct: mean-centering in moderated regression does not help in alleviating collinearity. Although Belsley (1984) has shown that mean-centering does not help in additive models, to our knowledge, this is the first time anyone has analytically demonstrated that mean-centering does not alleviate collinearity problems in multiplicative models. Specifically, we demonstrate that (1) in contrast to Aiken and West's (1991) suggestion, mean-centering does not improve the accuracy of numerical computation of statistical parameters, (2) it does not change the sampling accuracy of main effects, simple effects, and/or interaction effects (point estimates and standard errors are identical with or without mean-centering), and (3) it does not change overall measures of fit such as [R.sup.2] and adjusted-[R.sup.2]. It does not hurt, but it does not help, not one iota.
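One way to see why these quantities cannot change, offered here as a sketch for the scalar case and not necessarily the authors' own proof, is that mean-centering is an invertible linear change of basis. If X has columns (1, x1, x2, x1*x2), the centered design equals X A for a unit upper triangular matrix A built from the sample means, so det(A) = 1 and the determinant of the cross-product matrix is unchanged. The simulated data and all names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
# Hypothetical scalar predictors (k1 = k2 = 1) and response.
x1 = rng.normal(5.0, 1.0, size=n)
x2 = rng.normal(3.0, 2.0, size=n)
y = x1 + 0.5 * x1 * x2 + rng.normal(0.0, 1.0, size=n)

m1, m2 = x1.mean(), x2.mean()
ones = np.ones(n)
X = np.column_stack([ones, x1, x2, x1 * x2])
Xc = np.column_stack([ones, x1 - m1, x2 - m2, (x1 - m1) * (x2 - m2)])

# Centering as a change of basis: Xc = X @ A, with A unit upper triangular.
A = np.array([
    [1.0, -m1, -m2, m1 * m2],
    [0.0, 1.0, 0.0, -m2],
    [0.0, 0.0, 1.0, -m1],
    [0.0, 0.0, 0.0, 1.0],
])

# Log-determinants of the two cross-product matrices.
_, logdet_raw = np.linalg.slogdet(X.T @ X)
_, logdet_cen = np.linalg.slogdet(Xc.T @ Xc)

alpha, *_ = np.linalg.lstsq(X, y, rcond=None)    # uncentered fit
beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)    # mean-centered fit
alpha_from_beta = A @ beta                       # recover uncentered coefficients

print("Xc equals X @ A:", np.allclose(Xc, X @ A))
print("log det X'X:    ", logdet_raw, logdet_cen)
print("alpha:          ", alpha)
print("A @ beta:       ", alpha_from_beta)
```

Because the fitted values satisfy Xc beta = X (A beta), the uncentered coefficients are A beta, and their covariance matrix follows in the same way as A Cov(beta) A'. The interaction coefficient is the last entry of both vectors and is therefore identical in the two parameterizations.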
The rest of the paper is organized as follows. We prove analytically that mean-centering neither improves computational accuracy nor changes the ability to detect relationships between variables in moderated regression. Next, using data from a study of brand extensions, we illustrate the equivalency of the uncentered and the mean-centered models and demonstrate how one set of coefficients and their standard errors can be recovered from the other. Finally, we discuss the reasons why so many marketing scholars mean-center their variables and the conditions under...