// library(ascii) // setwd("c://aaaWork//web//fishR//bookex//AIFFD//Box3_10") // Sweave("Box3_10a.Rnw",driver=RweaveAsciidoc) AIFFD Box 3-10 Vignette ======================== :Author: Derek H. Ogle :Email: dogle@northland.edu :Date: 17-June-2009 :Revision: 2 == Required Packages and Setting Options ---- > library(car) # Anova > library(pda) # lsmean > library(NCStats) # fit.plot > options(contrasts=c("contr.sum","contr.poly")) ---- == Preparing Data The working directory is set, the link:box3_8.txt[] data file is read (note that this is the same date used in the link:../Box3_8/Box3_8a.html[Box3.8 vignette]), and the structure of the data frame is observed with, ---- > setwd("c://aaaWork//web//fishR//bookex//AIFFD//Box3_10") > d <- read.table("box3_8.txt", header = TRUE) > str(d) 'data.frame': 22 obs. of 4 variables: $ lake : Factor w/ 22 levels "Bass","Black",..: 21 13 16 1 15 20 14 12 19 10 ... $ region : Factor w/ 2 levels "North","South": 1 1 1 1 1 1 1 1 1 1 ... $ bag_limit: int 1 1 1 2 2 2 2 5 5 5 ... $ catch : num 2.21 2.32 2.74 2.23 2.25 1.4 2.36 1.78 1.64 1.97 ... ---- The structure of the data frame shows that __bag_limit__ is an "integer" type variable. To properly perform the ANOVA this variable must be converted to a factor variable with, ---- > d$fbag_limit <- factor(d$bag_limit) ---- == ANOVA Results I The blocked ANOVA can then be fit using +*[red]#lm()#*+ (for "linear model"). The explanatory variable side (right-hand-side) of the linear model formula uses a notation where the two factor variables appear to be multiplied together. In SAS, this apparent multiplication is how an interaction term is constructed. In R, however, an interaction is noted by separating the two factors by a colon. The apparent multiplication notation is a short-hand method of telling R to include each main effect and the interaction between the two factors. In the example below, the +*[red]#region*fbag_limit#*+ portion of the formula is equivalent to saying +*[red]#region + fbag_limit + region:fbag_limit#*+. The model using this short-hand is fit with, ---- > lm1 <- lm(catch ~ region * fbag_limit, data = d) ---- NOTE: I prefer to include the blocking variable first in the analysis as the general idea is to remove the variability associated with this variable. It does not, however, make a difference to the results in a complete block design as shown here. The type-III SS are obtained with (see discussion about SS in the link:../preliminaries/preliminaries.html[preliminaries vignette]), ---- > Anova(lm1, type = "III") Anova Table (Type III tests) Response: catch Sum Sq Df F value Pr(>F) (Intercept) 129.935 1 660.3029 1.948e-14 region 2.862 1 14.5428 0.001529 fbag_limit 1.935 2 4.9171 0.021647 region:fbag_limit 0.439 2 1.1142 0.352339 Residuals 3.148 16 ---- The least-squares means are computed with +*[red]#lsmean()#*+ from the +*pda*+ package (see discussion about least-squares means in the link:../preliminaries/preliminaries.html[preliminaries vignette]). As there are two factors in this model, R must be told to compute the least-squares means separately by including the factor variable names separately in the +*[red]#factors=#*+ argument as such, ---- > lsmean(lm1, factors = "region") region pred se North North 2.109444 0.1349830 South South 2.844667 0.1376562 > lsmean(lm1, factors = "fbag_limit") fbag_limit pred se 1 1 2.736667 0.1810987 2 2 2.620000 0.1694023 5 5 2.074500 0.1487878 ---- The +*[red]#fit.plot()#*+ function from the +*NCStats*+ package can be used to visually observe the mean (with default 95% CIs) catch rates of the different bag limit groups in the different regions. You can use the +*[red]#change.order=#*+ argument to change which variable is plotted on the x-axis (you may have to make this plot once before deciding which way you prefer it). ---- > fit.plot(lm1, change.order = TRUE, xlab = "Walleye Bag Limit", ylab = "Walleye Catch Rate", main = "") ---- image::Box3_10a-009.png[] As the interaction term is insignificant, means (with 95% CI) plots for the main effects can also be obtained with +*[red]#fit.plot()#*+ using the +*[red]#which=#*+ argument as shown below, ---- > par(mfrow=c(1,2)) # make a one row, two column "grid" for the plots > fit.plot(lm1,which="fbag_limit",xlab="Walleye Bag Limit",ylab="Walleye Catch Rate",main="") > fit.plot(lm1,which="region",xlab="Region",ylab="Walleye Catch Rate",main="") ---- image::Box3_10a-010.png[] WARNING: Many authors warn against interpretations between the levels of the blocking variable because the blocks were included in the analysis because there was an _a priori_ idea of a difference between the levels and because the levels were generally not randomized. I have presented the block-levels comparisons here because they were presented in Chapter 3. == ANOVA Results II The model fit without interaction terms followed by the type-III SS ANOVA table and least-squares means is obtained with, ---- > lm2 <- lm(catch ~ region + fbag_limit, data = d) > Anova(lm2, type = "III") Anova Table (Type III tests) Response: catch Sum Sq Df F value Pr(>F) (Intercept) 129.748 1 651.0897 1.386e-15 region 2.769 1 13.8958 0.001541 fbag_limit 1.957 2 4.9096 0.019877 Residuals 3.587 18 > lsmean(lm2, factors = c("fbag_limit")) fbag_limit pred se 1 1 2.736667 0.1822443 2 2 2.590978 0.1692787 5 5 2.060350 0.1491815 > lsmean(lm2, factors = c("region")) region pred se North North 2.105818 0.1352208 South South 2.819512 0.1366475 ---- These results are nearly identical to those shown in Box 3.10. == ANOVA Results III The same model fits summarized with type-II SS (see discussion about SS in the link:../preliminaries/preliminaries.html[preliminaries vignette])) are obtained with, ---- > Anova(lm1, type = "II") Anova Table (Type II tests) Response: catch Sum Sq Df F value Pr(>F) region 2.76913 1 14.0722 0.001743 fbag_limit 1.95674 2 4.9719 0.020926 region:fbag_limit 0.43852 2 1.1142 0.352339 Residuals 3.14849 16 > Anova(lm2, type = "II") Anova Table (Type II tests) Response: catch Sum Sq Df F value Pr(>F) region 2.7691 1 13.8958 0.001541 fbag_limit 1.9567 2 4.9096 0.019877 Residuals 3.5870 18 ---- The conclusions from these summaries are identical to the conclusions from above.