Required Packages and Setting Options

> library(car)      # Anova
> library(pda)      # lsmean
> library(NCStats)  # fit.plot
> options(contrasts=c("contr.sum","contr.poly"))

Preparing Data

The working directory is set, the box3_8.txt data file is read (note that this is the same date used in the Box3.8 vignette), and the structure of the data frame is observed with,

> setwd("c://aaaWork//web//fishR//bookex//AIFFD//Box3_10")
> d <- read.table("box3_8.txt", header = TRUE)
> str(d)
'data.frame':   22 obs. of  4 variables:
 $ lake     : Factor w/ 22 levels "Bass","Black",..: 21 13 16 1 15 20 14 12 19 10 ...
 $ region   : Factor w/ 2 levels "North","South": 1 1 1 1 1 1 1 1 1 1 ...
 $ bag_limit: int  1 1 1 2 2 2 2 5 5 5 ...
 $ catch    : num  2.21 2.32 2.74 2.23 2.25 1.4 2.36 1.78 1.64 1.97 ...

The structure of the data frame shows that bag_limit is an "integer" type variable. To properly perform the ANOVA this variable must be converted to a factor variable with,

> d$fbag_limit <- factor(d$bag_limit)

ANOVA Results I

The blocked ANOVA can then be fit using lm() (see linear models vignette for an introduction) and assigned to an object with,

> lm1 <- lm(catch ~ region * fbag_limit, data = d)
Important In R, the apparent multiplication of two factor variables in a linear model formula is a short-hand notation to tell R to include each factor as main effects AND the interaction between the two factors. R denotes the interaction term by separating the main factors with a colon.
Note I prefer to include the blocking variable first in the analysis as the general idea is to remove the variability associated with this variable. It does not, however, make a difference to the results in a complete block design using Type-III SS.

The type-III SS (see discussion about SS in the preliminaries vignette) are obtained by submitting the lm1 object to Anova() with the type="III" argument as follows,

> Anova(lm1, type = "III")
Anova Table (Type III tests)

Response: catch
                   Sum Sq Df  F value    Pr(>F)
(Intercept)       129.935  1 660.3029 1.948e-14
region              2.862  1  14.5428  0.001529
fbag_limit          1.935  2   4.9171  0.021647
region:fbag_limit   0.439  2   1.1142  0.352339
Residuals           3.148 16

The least-squares means (see discussion about least-squares means in the preliminaries vignette) are computed by submitting the lm object to lsmean() from the pda package. As there are two factors in this model, R must be told to compute the least-squares means separately by including the factor variable names separately in the factors= argument as such,

> lsmean(lm1, factors = "region")
      region     pred        se
North  North 2.109444 0.1349830
South  South 2.844667 0.1376562
> lsmean(lm1, factors = "fbag_limit")
  fbag_limit     pred        se
1          1 2.736667 0.1810987
2          2 2.620000 0.1694023
5          5 2.074500 0.1487878

The fit.plot() function from the NCStats package can be used to visually observe the mean (with default 95% CIs) catch rates of the different bag limit groups in the different regions. You can use the change.order= argument to change which variable is plotted on the x-axis (you may have to make this plot once before deciding which way you prefer it).

> fit.plot(lm1, change.order = TRUE, xlab = "Walleye Bag Limit", ylab = "Walleye Catch Rate", main = "")
Box3_10a-009.png

As the interaction term is insignificant, means (with 95% CI) plots for the main effects can also be obtained with fit.plot() using the which= argument as shown below,

> par(mfrow=c(1,2))      # make a one row, two column "grid" for the plots
> fit.plot(lm1,which="fbag_limit",xlab="Walleye Bag Limit",ylab="Walleye Catch Rate",main="")
> fit.plot(lm1,which="region",xlab="Region",ylab="Walleye Catch Rate",main="")
Box3_10a-010.png
Warning Many authors warn against interpretations between the levels of the blocking variable because the blocks were included in the analysis because there was an a priori idea of a difference between the levels and because the levels were generally not randomized. I have presented the block-levels comparisons here because they were presented in Chapter 3.

ANOVA Results II

The model fit without the interaction term, followed by the type-III SS ANOVA table, and least-squares means is obtained with,

> lm2 <- lm(catch ~ region + fbag_limit, data = d)
> Anova(lm2, type = "III")
Anova Table (Type III tests)

Response: catch
             Sum Sq Df  F value    Pr(>F)
(Intercept) 129.748  1 651.0897 1.386e-15
region        2.769  1  13.8958  0.001541
fbag_limit    1.957  2   4.9096  0.019877
Residuals     3.587 18
> lsmean(lm2, factors = c("fbag_limit"))
  fbag_limit     pred        se
1          1 2.736667 0.1822443
2          2 2.590978 0.1692787
5          5 2.060350 0.1491815
> lsmean(lm2, factors = c("region"))
      region     pred        se
North  North 2.105818 0.1352208
South  South 2.819512 0.1366475

These results are nearly identical to those shown in Box 3.10 of the AIFFD book.

ANOVA Results III

The same model fits summarized with type-II SS (see discussion about SS in the preliminaries vignette) are obtained with,

> Anova(lm1, type = "II")
Anova Table (Type II tests)

Response: catch
                   Sum Sq Df F value   Pr(>F)
region            2.76913  1 14.0722 0.001743
fbag_limit        1.95674  2  4.9719 0.020926
region:fbag_limit 0.43852  2  1.1142 0.352339
Residuals         3.14849 16
> Anova(lm2, type = "II")
Anova Table (Type II tests)

Response: catch
           Sum Sq Df F value   Pr(>F)
region     2.7691  1 13.8958 0.001541
fbag_limit 1.9567  2  4.9096 0.019877
Residuals  3.5870 18

The conclusions from these summaries are identical to the conclusions from above.