Required Packages

> library(NCStats)  # Summarize, sd.calc
> library(sciplot)  # se

Preparing Data

The working directory is set, the box3_1.txt data file is read, and the structure of the data frame is observed with,

> setwd("c://aaaWork//web//fishR//bookex//AIFFD//Box3_1")
> d <- read.table("box3_1.txt", header = TRUE)
> str(d)
'data.frame':   15 obs. of  1 variable:
 $ catch: int  3 0 4 1 4 1 1 0 1 0 ...

Compute Summary Statistics

The mean and standard deviation for the quantitative variable can be computed with Summarize() from the NCStats package. The single required argument is the vector containing the quantitative data. For example,

> Summarize(d$catch)
        n      Mean  St. Dev.      Min.   1st Qu.    Median   3rd Qu.      Max.
15.000000  1.600000  1.404076  0.000000  0.500000  1.000000  2.500000  4.000000

If you do not want to load the NCStats package then the same calculations can be made with summary() and sd() of the base packages contained in R,

> summary(d$catch)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    0.0     0.5     1.0     1.6     2.5     4.0
> sd(d$catch)
[1] 1.404076

Regardless of which way you summarize the results, you can compute the standard error with se() from the sciplot package,

> se(d$catch)
[1] 0.3625308

Constructing Confidence Intervals for the Population Mean

The critical t value for computing a confidence interval can be found with qt(). This function requires an upper-tail probability (i.e., (1-C)/2 where C is the level of confidence), the degrees-of-freedom, and the lower.tail=FALSE argument. For example, the t critical value in this situation is,

> df <- length(d$catch)-1          # compute df as n-1
> C <- 0.95                        # set confidence level
> a2 <- (1-C)/2                    # find upper tail probability
> ct <- qt(a2,df,lower.tail=FALSE) # find critical t-value
> ct
[1] 2.144787

The two endpoints of the confidence interval can then be computed with,

> mn <- mean(d$catch)              # find (and save) the mean
> semn <- se(d$catch)              # find (and save) the SE of the mean
> LCI <- mn-ct*semn
> UCI <- mn+ct*semn
> c(LCI,UCI)                       # combine together for showing
[1] 0.8224488 2.3775512

The confidence interval is computed from the raw data, along with a great deal of other information, with t.test(). In this simplest form this function only requires the vector of quantitative data as the first argument and an optional level of confidence in the conf.level= argument (it defaults to 95%). For example,

> t.test(d$catch)
        One Sample t-test

data:  d$catch
t = 4.4134, df = 14, p-value = 0.0005894
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 0.8224488 2.3775512
sample estimates:
mean of x
      1.6

Demonstration of Standard Deviation Calculation

The intermediate steps in computing a standard deviation, as shown at the beginning of Box 3.1, can be demonstrated with sd.calc()

> sd.calc(d$catch)
Demonstration of parts of a std. dev. calculation.

     x diffs diffs.sq
1    3   1.4     1.96
2    0  -1.6     2.56
3    4   2.4     5.76
4    1  -0.6     0.36
5    4   2.4     5.76
6    1  -0.6     0.36
7    1  -0.6     0.36
8    0  -1.6     2.56
9    1  -0.6     0.36
10   0  -1.6     2.56
11   2   0.4     0.16
12   2   0.4     0.16
13   3   1.4     1.96
14   2   0.4     0.16
15   0  -1.6     2.56
sum 24   0.0    27.60

Mean = x-bar = 24 / 15 = 1.6

Variance = s^2 = 27.6 / 14 = 1.9714286

Std. Dev = s = sqrt(1.9714286) = 1.4040757