A Strategy for Developing Multiple Linear Regression (MLR) Models
--
1. Hypothesize the model and state the LINE assumptions:
· If there are only a few predictors, start with the first-order (main effects) linear model; otherwise, use best subsets and stepwise regression to select a few candidate models.
· LINE assumptions are:
1. L: Model is linear in terms of the parameters;
2. I: Errors are distributed independently;
3. N: Errors are distributed normally;
4. E: Errors have equal variance.
2. Fit the model to the data; that is, obtain the parameter estimates using the least squares estimation (LSE) method.
3. Check the validity of LINE assumptions by performing residual analysis:
· Obtain standardized residuals.
· Check the normality assumption by using:
1. Normal probability plot (NPP) of standardized residuals
2. Histogram of standardized residuals (comment on whether it is bell-shaped or skewed)
3. Shapiro-Wilk test (or Ryan-Joiner test)
· Check the constant (equal) error variance assumption by using:
1. Standardized residuals versus fitted values plot
2. If there are replications: the Bartlett test (or the F test for two groups) when the normality assumption holds; the Levene test otherwise
· Check the independence of errors assumption by using:
1. Standardized residuals versus observation order (run order or time) plot
2. Durbin-Watson (D-W) test
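
Steps 2 and 3 can be sketched numerically as follows. This is a minimal illustration in plain Python, not a replacement for a statistics package. The function names (solve, fit_mlr) and the small data set are hypothetical and made up for the example. The sketch assumes that "standardized residual" means the residual divided by sqrt(MSE * (1 - leverage)), and it computes only the Durbin-Watson statistic, not its critical values.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_mlr(X, y):
    """Least squares fit of y on the columns of X plus an intercept.

    X is a list of rows of predictor values (no intercept column).
    Returns estimates and the residual quantities used in step 3.
    """
    rows = [[1.0] + list(r) for r in X]          # add intercept column
    n, p = len(rows), len(rows[0])
    # Normal equations: (X'X) beta = X'y
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    mse = sum(e * e for e in resid) / (n - p)    # error variance estimate
    # Leverage h_ii = x_i' (X'X)^(-1) x_i
    lev = [sum(v * z for v, z in zip(r, solve(XtX, r))) for r in rows]
    std_resid = [e / (mse * (1.0 - h)) ** 0.5 for e, h in zip(resid, lev)]
    # Durbin-Watson statistic; assumes rows are in observation (time) order
    dw = (sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, n))
          / sum(e * e for e in resid))
    return {"beta": beta, "fitted": fitted, "residuals": resid,
            "leverage": lev, "std_residuals": std_resid, "durbin_watson": dw}


# Made-up data: six observations, two predictors, listed in run order.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5]]
y = [5.1, 4.9, 11.2, 10.8, 17.1, 15.2]
result = fit_mlr(X, y)

print("estimates:", result["beta"])
print("standardized residuals:", result["std_residuals"])
print("Durbin-Watson statistic:", result["durbin_watson"])
```

The standardized residuals can then be plotted against the fitted values and against the observation order, and passed to a normality test. A Durbin-Watson statistic near 2 is consistent with uncorrelated errors, while values toward 0 or 4 suggest positive or negative autocorrelation, respectively.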