Linear models (e.g., analysis of variance, simple linear regression, analysis of covariance, and multiple linear regression) are used throughout the boxed examples in the AIFFD book. In R, all of these models are fit with one function, lm(), that can receive a variety of formula types. The lm() function then fits one of the linear models depending on the types of variables present in the formula. This page briefly describes the use of lm() for this variety of models.
An R formula consists of a left-hand side (LHS; the response or dependent variable) and a right-hand side (RHS; the explanatory, predictor, or independent variables) separated by a tilde. For the purposes of the boxed examples in AIFFD, the LHS will (nearly always) consist of a continuous response variable. The RHS, on the other hand, will consist of a single explanatory variable or some function of several explanatory variables. For our purposes, note that explanatory variables are "added" to the RHS by including a plus sign followed by the variable name, and that an interaction term is written as the two interacting variables separated by a colon (e.g., A:B represents the interaction between A and B). Finally, note that R uses the short-hand notation A*B to indicate that the RHS should include the two main effect terms and an interaction term (i.e., A + B + A:B).
Note: The apparent multiplication of two variables in the RHS of a model formula is short-hand notation for including both variables as main effects along with the interaction between them; i.e., A*B is equivalent to A + B + A:B.
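
The short-hand can be checked directly in R without any data. The following is a minimal sketch; Y, A, and B are placeholder names, and the object names (f_star, f_long, and so on) are chosen here only for illustration.

```r
# Two ways of writing the same right-hand side (Y, A, and B are placeholders)
f_star <- Y ~ A*B
f_long <- Y ~ A + B + A:B

# terms() expands a formula; its "term.labels" attribute lists the RHS terms
labels_star <- attr(terms(f_star), "term.labels")
labels_long <- attr(terms(f_long), "term.labels")

print(labels_star)                          # terms implied by A*B
print(identical(labels_star, labels_long))  # do both formulas expand identically?
```

Both formulas should expand to the same three terms: the two main effects and their interaction.
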
The lm() function is typically given two arguments. The first argument is a model formula as described in the previous section. Different model formulae provide different analyses depending on the variables in the formula. The second argument, the data= argument, tells R in which data frame the variables in the formula can be found. The results of lm() should be saved to an object so that the object can be submitted to a variety of extractor functions that return specific results.
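
As a minimal sketch of such a call, the code below uses the mtcars data frame that ships with R as a stand-in for an AIFFD data set (it is not one of the book's examples); the object name lm1 follows the naming used in the extractor table on this page.

```r
# mtcars is a built-in data frame used here only as a stand-in
# mpg is the continuous response (LHS); wt is a quantitative explanatory variable (RHS)
lm1 <- lm(mpg ~ wt, data = mtcars)

# The saved object has class "lm" and can be passed to extractor functions
class(lm1)
coef(lm1)
```

Saving the fit to lm1 rather than printing it directly is what makes the later calls such as summary(lm1) or anova(lm1) possible.
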
| R Formula | Linear Model | Example |
|---|---|---|
| `Y~X1` | Simple Linear Regression | |
| `Y~G1` | One-way ANOVA | |
| `Y~G1*G2` | Two-way ANOVA (with interaction) | |
| `Y~G1+G2` | Two-way ANOVA (without interaction) | Box 5.5 (last section) |
| `Y~X1*G1` | One-way Indicator Variable Regression (ANCOVA-like model) | |
| `Y~X1+G1` | One-way ANCOVA | |
| `Y~X1*G1*G2` | Two-way Indicator Variable Regression | |
| `Y~X1*X2` | Multiple Linear Regression (with interaction) | |
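
To illustrate how the variable types on the RHS determine which model lm() fits, the sketch below uses the ToothGrowth data frame that ships with R (again a stand-in, not an AIFFD example). Here len plays the role of Y, dose the role of a quantitative X1, and supp the role of a group factor G1; the factor version of dose (fdose) is created only for this illustration to serve as G2.

```r
# ToothGrowth is built in: len (numeric), supp (factor: OJ, VC), dose (numeric)
tg <- ToothGrowth
tg$fdose <- factor(tg$dose)   # hypothetical second grouping factor (G2)

slr    <- lm(len ~ dose,       data = tg)  # Y~X1    : simple linear regression
aov1   <- lm(len ~ supp,       data = tg)  # Y~G1    : one-way ANOVA
aov2   <- lm(len ~ supp*fdose, data = tg)  # Y~G1*G2 : two-way ANOVA with interaction
ivr    <- lm(len ~ dose*supp,  data = tg)  # Y~X1*G1 : indicator variable regression
ancova <- lm(len ~ dose + supp, data = tg) # Y~X1+G1 : one-way ANCOVA

# The coefficient names show which terms each formula produced
names(coef(ivr))
names(coef(ancova))
```

The same function call is used throughout; only the formula and the class of each variable (numeric versus factor) change.
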
A number of functions can be used to extract specific information from an object saved from an lm() call.
| Function Call | Package | Description |
|---|---|---|
| `anova(lm1)` | base | Extracts the ANOVA table using type-I SS. |
| `coef(lm1)` | base | Extracts the values of the parameter coefficients. |
| `confint(lm1)` | base | Extracts confidence intervals for the parameter coefficients. |
| `summary(lm1)` | base | Extracts the parameter coefficient values, SEs, and default t-tests and p-values. Also extracts the coefficient of determination (unadjusted and adjusted), overall F-test and p-value, and rMSE. |
| `predict(lm1)` | base | Extracts predictions using the linear model for each individual in the data frame. Modifications (i.e., other arguments) allow predicting other values. |
| `Anova(lm1,type="III")` | car | Extracts the ANOVA table using type-III SS. See discussion about different SS calculations in the preliminaries vignette. |
| `Anova(lm1,type="II")` | car | Extracts the ANOVA table using type-II SS. See discussion about different SS calculations in the preliminaries vignette. |
| `lsmean(lm1)` | pda | Extracts least-squares means. See discussion about least-squares means in the preliminaries vignette. |
| `fitPlot(lm1)` | NCStats | Constructs a "fitted-line plot" (specifics depend on the model; does not work for all model types). |
| `residualPlot(lm1)` | NCStats | Constructs a residual plot. |
| `hist(lm1$residuals)` | base | Constructs a histogram of the model residuals. |
| `ad.test(lm1$residuals)` | nortest | Performs the Anderson-Darling test of normality on the model residuals. |
| `leveneTest(lm1)` | car | Performs Levene's homogeneity of variance test on the model groups. |
| `outlierTest(lm1)` | car | Performs a test for outliers on the model residuals. |
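
A short sketch of several of these extractor functions follows, again assuming the built-in ToothGrowth data frame as a stand-in for an AIFFD data set. The object names (aov_tab, new_fish, and so on) are chosen here for illustration, and the car call is guarded because that package may not be installed.

```r
# Fit an indicator variable regression to the built-in ToothGrowth data
lm1 <- lm(len ~ dose*supp, data = ToothGrowth)

aov_tab <- anova(lm1)    # ANOVA table using type-I (sequential) SS
cf      <- coef(lm1)     # parameter estimates
ci      <- confint(lm1)  # 95% confidence intervals by default
smry    <- summary(lm1)  # estimates, SEs, t-tests, R-squared, overall F-test

# predict() with newdata= gives predictions for values not in the data frame
new_obs <- data.frame(dose = 1,
                      supp = factor("OJ", levels = levels(ToothGrowth$supp)))
pred <- predict(lm1, newdata = new_obs)

print(aov_tab)
print(ci)
print(pred)

# Type-II SS requires the car package; skipped if car is not available
if (requireNamespace("car", quietly = TRUE)) {
  print(car::Anova(lm1, type = "II"))
}
```

Without the newdata= argument, predict(lm1) returns one fitted value for each row of the original data frame.
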