// library(ascii) // setwd("c://aaaWork//web//fishR//bookex//AIFFD//Box3_10") // Asciidoc("Box3_10a.Rnw") AIFFD Box 3-10 Vignette ======================== :Author: Derek H. Ogle :Email: dogle@northland.edu :Date: 16-June-2010 :Revision: 3 == Required Packages and Setting Options ---- > library(car) # Anova > library(pda) # lsmean > library(NCStats) # fit.plot > options(contrasts=c("contr.sum","contr.poly")) ---- == Preparing Data The working directory is set, the link:box3_8.txt[] data file is read (note that this is the same date used in the link:../Box3_8/Box3_8a.html[Box3.8 vignette]), and the structure of the data frame is observed with, ---- > setwd("c://aaaWork//web//fishR//bookex//AIFFD//Box3_10") > d <- read.table("box3_8.txt", header = TRUE) > str(d) 'data.frame': 22 obs. of 4 variables: $ lake : Factor w/ 22 levels "Bass","Black",..: 21 13 16 1 15 20 14 12 19 10 ... $ region : Factor w/ 2 levels "North","South": 1 1 1 1 1 1 1 1 1 1 ... $ bag_limit: int 1 1 1 2 2 2 2 5 5 5 ... $ catch : num 2.21 2.32 2.74 2.23 2.25 1.4 2.36 1.78 1.64 1.97 ... ---- The structure of the data frame shows that +*bag_limit*+ is an "integer" type variable. To properly perform the ANOVA this variable must be converted to a factor variable with, ---- > d$fbag_limit <- factor(d$bag_limit) ---- == ANOVA Results I The blocked ANOVA can then be fit using +*[red]#lm()#*+ (see link:../preliminaries/LinearModels.html[linear models vignette] for an introduction) and assigned to an object with, ---- > lm1 <- lm(catch ~ region * fbag_limit, data = d) ---- IMPORTANT: In R, the apparent multiplication of two factor variables in a linear model formula is a short-hand notation to tell R to include each factor as main effects AND the interaction between the two factors. R denotes the interaction term by separating the main factors with a colon. NOTE: I prefer to include the blocking variable first in the analysis as the general idea is to remove the variability associated with this variable. It does not, however, make a difference to the results in a complete block design using Type-III SS. The type-III SS (see discussion about SS in the link:../preliminaries/preliminaries.html[preliminaries vignette]) are obtained by submitting the +*lm1*+ object to +*[red]#Anova()#*+ with the +*[red]#type="III"#*+ argument as follows, ---- > Anova(lm1, type = "III") Anova Table (Type III tests) Response: catch Sum Sq Df F value Pr(>F) (Intercept) 129.935 1 660.3029 1.948e-14 region 2.862 1 14.5428 0.001529 fbag_limit 1.935 2 4.9171 0.021647 region:fbag_limit 0.439 2 1.1142 0.352339 Residuals 3.148 16 ---- The least-squares means (see discussion about least-squares means in the link:../preliminaries/preliminaries.html[preliminaries vignette]) are computed by submitting the +*[red]#lm#*+ object to +*[red]#lsmean()#*+ from the +*pda*+ package. As there are two factors in this model, R must be told to compute the least-squares means separately by including the factor variable names separately in the +*[red]#factors=#*+ argument as such, ---- > lsmean(lm1, factors = "region") region pred se North North 2.109444 0.1349830 South South 2.844667 0.1376562 > lsmean(lm1, factors = "fbag_limit") fbag_limit pred se 1 1 2.736667 0.1810987 2 2 2.620000 0.1694023 5 5 2.074500 0.1487878 ---- The +*[red]#fit.plot()#*+ function from the +*NCStats*+ package can be used to visually observe the mean (with default 95% CIs) catch rates of the different bag limit groups in the different regions. You can use the +*[red]#change.order=#*+ argument to change which variable is plotted on the x-axis (you may have to make this plot once before deciding which way you prefer it). ---- > fit.plot(lm1, change.order = TRUE, xlab = "Walleye Bag Limit", ylab = "Walleye Catch Rate", main = "") ---- image::Box3_10a-009.png[] As the interaction term is insignificant, means (with 95% CI) plots for the main effects can also be obtained with +*[red]#fit.plot()#*+ using the +*[red]#which=#*+ argument as shown below, ---- > par(mfrow=c(1,2)) # make a one row, two column "grid" for the plots > fit.plot(lm1,which="fbag_limit",xlab="Walleye Bag Limit",ylab="Walleye Catch Rate",main="") > fit.plot(lm1,which="region",xlab="Region",ylab="Walleye Catch Rate",main="") ---- image::Box3_10a-010.png[] WARNING: Many authors warn against interpretations between the levels of the blocking variable because the blocks were included in the analysis because there was an _a priori_ idea of a difference between the levels and because the levels were generally not randomized. I have presented the block-levels comparisons here because they were presented in Chapter 3. == ANOVA Results II The model fit without the interaction term, followed by the type-III SS ANOVA table, and least-squares means is obtained with, ---- > lm2 <- lm(catch ~ region + fbag_limit, data = d) > Anova(lm2, type = "III") Anova Table (Type III tests) Response: catch Sum Sq Df F value Pr(>F) (Intercept) 129.748 1 651.0897 1.386e-15 region 2.769 1 13.8958 0.001541 fbag_limit 1.957 2 4.9096 0.019877 Residuals 3.587 18 > lsmean(lm2, factors = c("fbag_limit")) fbag_limit pred se 1 1 2.736667 0.1822443 2 2 2.590978 0.1692787 5 5 2.060350 0.1491815 > lsmean(lm2, factors = c("region")) region pred se North North 2.105818 0.1352208 South South 2.819512 0.1366475 ---- These results are nearly identical to those shown in Box 3.10 of the AIFFD book. == ANOVA Results III The same model fits summarized with type-II SS (see discussion about SS in the link:../preliminaries/preliminaries.html[preliminaries vignette]) are obtained with, ---- > Anova(lm1, type = "II") Anova Table (Type II tests) Response: catch Sum Sq Df F value Pr(>F) region 2.76913 1 14.0722 0.001743 fbag_limit 1.95674 2 4.9719 0.020926 region:fbag_limit 0.43852 2 1.1142 0.352339 Residuals 3.14849 16 > Anova(lm2, type = "II") Anova Table (Type II tests) Response: catch Sum Sq Df F value Pr(>F) region 2.7691 1 13.8958 0.001541 fbag_limit 1.9567 2 4.9096 0.019877 Residuals 3.5870 18 ---- The conclusions from these summaries are identical to the conclusions from above.