> library(NCStats) # Summarize, sd.calc > library(sciplot) # se
> library(NCStats) # Summarize, sd.calc > library(sciplot) # se
The working directory is set, the box3_1.txt data file is read, and the structure of the data frame is observed with,
> setwd("c://aaaWork//web//fishR//bookex//AIFFD//Box3_1")
> d <- read.table("box3_1.txt", header = TRUE)
> str(d)
'data.frame': 15 obs. of 1 variable:
$ catch: int 3 0 4 1 4 1 1 0 1 0 ...
The mean and standard deviation for the quantitative variable can be computed with Summarize() from the NCStats package. The single required argument is the vector containing the quantitative data. For example,
> Summarize(d$catch)
n Mean St. Dev. Min. 1st Qu. Median 3rd Qu. Max.
15.000000 1.600000 1.404076 0.000000 0.500000 1.000000 2.500000 4.000000
If you do not want to load the NCStats package then the same calculations can be made with summary() and sd() of the base packages contained in R,
> summary(d$catch)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0 0.5 1.0 1.6 2.5 4.0
> sd(d$catch)
[1] 1.404076
Regardless of which way you summarize the results, you can compute the standard error with se() from the sciplot package,
> se(d$catch) [1] 0.3625308
The critical t value for computing a confidence interval can be found with qt(). This function requires an upper-tail probability (i.e., (1-C)/2 where C is the level of confidence), the degrees-of-freedom, and the lower.tail=FALSE argument. For example, the t critical value in this situation is,
> df <- length(d$catch)-1 # compute df as n-1 > C <- 0.95 # set confidence level > a2 <- (1-C)/2 # find upper tail probability > ct <- qt(a2,df,lower.tail=FALSE) # find critical t-value > ct [1] 2.144787
The two endpoints of the confidence interval can then be computed with,
> mn <- mean(d$catch) # find (and save) the mean > semn <- se(d$catch) # find (and save) the SE of the mean > LCI <- mn-ct*semn > UCI <- mn+ct*semn > c(LCI,UCI) # combine together for showing [1] 0.8224488 2.3775512
The confidence interval is computed from the raw data, along with a great deal of other information, with t.test(). In this simplest form this function only requires the vector of quantitative data as the first argument and an optional level of confidence in the conf.level= argument (it defaults to 95%). For example,
> t.test(d$catch)
One Sample t-test
data: d$catch
t = 4.4134, df = 14, p-value = 0.0005894
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
0.8224488 2.3775512
sample estimates:
mean of x
1.6
The intermediate steps in computing a standard deviation, as shown at the beginning of Box 3.1, can be demonstrated with sd.calc()
> sd.calc(d$catch)
Demonstration of parts of a std. dev. calculation.
x diffs diffs.sq
1 3 1.4 1.96
2 0 -1.6 2.56
3 4 2.4 5.76
4 1 -0.6 0.36
5 4 2.4 5.76
6 1 -0.6 0.36
7 1 -0.6 0.36
8 0 -1.6 2.56
9 1 -0.6 0.36
10 0 -1.6 2.56
11 2 0.4 0.16
12 2 0.4 0.16
13 3 1.4 1.96
14 2 0.4 0.16
15 0 -1.6 2.56
sum 24 0.0 27.60
Mean = x-bar = 24 / 15 = 1.6
Variance = s^2 = 27.6 / 14 = 1.9714286
Std. Dev = s = sqrt(1.9714286) = 1.4040757