Chapter 4 Permutation Tests
4.1 Hypothesis Testing
- A test statistic is a numerical function of the data whose value determines the result of the test. It is denoted by \(T(X)\). The value computed from the realized sample is denoted by \(t\)
- p-value: The probability that chance alone would produce a test statistic at least as extreme as the observed test statistic, if the null hypothesis were true
- Statistically significant: A result is statistically significant if it would rarely occur by chance
- Keep in mind that the p-value by itself does not tell you whether to reject the null hypothesis. You must also consider the costs of accepting versus not accepting a false positive
- The null distribution is the distribution of the test statistic if the null hypothesis is true
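As a compact restatement of the p-value definition above, assuming a one-sided test in which large values of the statistic count as extreme (the direction is an assumption, not something fixed by the definitions), one can write:

\[
p = P\left(T(X) \ge t \mid H_0 \text{ is true}\right)
\]

Here the probability is taken over the null distribution of \(T(X)\). For a test in which small values are extreme, the inequality is reversed.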
4.2 Permutation Tests
4.2.1 Test on Beerwings dataset
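library(dplyr)         # assumed to be loaded earlier in the notes
library(resampledata)  # assumed source of the Beerwings data
set.seed(1234)
# Mean number of hot wings consumed, by gender
mus <- as_tibble(Beerwings)%>%group_by(Gender)%>%summarise(mu=mean(Hotwings))%>%dplyr::select(mu)%>%unlist()
# Observed test statistic: difference between the two group means
observed_stat <- diff(mus)
hwings <- Beerwings$Hotwings
n <- 10^3 - 1   # number of resamples
m <- 30         # total number of observations
perms <- replicate(n, {
  x <- sample(hwings, 30)   # random permutation of the pooled values
  mean(x[1:(m/2)])-mean(x[(m/2+1):m])
})
results <- perms > observed_stat
results <- c(TRUE, results)   # count the observed data as one extra resample
pval <- mean(results)
print(pval)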
## [1] 0.003
With a p-value of 0.003, there is a reasonable case for rejecting the null hypothesis.
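The `c(TRUE, results)` step followed by `mean()` in the code above is equivalent to adding one to both the numerator and the denominator of the p-value estimate. A minimal check of that equivalence, using a made-up logical vector in place of the real comparison results (the vector and the names `p_prepend` and `p_formula` are illustrative assumptions):

```r
set.seed(1)
n <- 999
# Hypothetical stand-in for `perms > observed_stat`
results <- runif(n) < 0.01

p_prepend <- mean(c(TRUE, results))         # prepend TRUE, then average
p_formula <- (sum(results) + 1) / (n + 1)   # explicit add-one formula

print(all.equal(p_prepend, p_formula))
```

Either form guarantees that the estimated p-value is never smaller than 1 / (n + 1).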
4.2.2 Test on Verizon dataset
set.seed(1234)
# Mean repair time per group
mus <- as_tibble(Verizon)%>%group_by(Group)%>%summarise(mu=mean(Time))%>%dplyr::select(mu)%>%unlist()
# Observed test statistic: difference between the two group means
observed_stat <- mus[1]-mus[2]
realized_data <- Verizon$Time
n <- 10^3 - 1   # number of resamples
# Group sizes (the two groups are unbalanced)
classes <- as_tibble(Verizon)%>%group_by(Group)%>%summarise(n=n())%>%dplyr::select(n)%>%unlist()
perms <- replicate(n, {
  x <- sample(realized_data, sum(classes))   # random permutation of the pooled values
  mean(x[1:classes[1]])-mean(x[(classes[1]+1):sum(classes)])
})
results <- (perms > observed_stat)
results <- c(TRUE, results)   # count the observed data as one extra resample
pval <- mean(results)
print(pval)
## [1] 0.018
With a p-value of 0.018, there is a reasonable case for rejecting the null hypothesis.
4.2.3 Test on Beerwings dataset - Two-sided
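set.seed(1234)
# Mean number of hot wings consumed, by gender
mus <- as_tibble(Beerwings)%>%group_by(Gender)%>%summarise(mu=mean(Hotwings))%>%dplyr::select(mu)%>%unlist()
# Observed test statistic: difference between the two group means
observed_stat <- diff(mus)
hwings <- Beerwings$Hotwings
n <- 10^3 - 1   # number of resamples
m <- 30         # total number of observations
perms <- replicate(n, {
  x <- sample(hwings, 30)   # random permutation of the pooled values
  mean(x[1:(m/2)])-mean(x[(m/2+1):m])
})
# Two-sided: count resamples that are extreme in either direction
results <- (perms > observed_stat | perms < -observed_stat)
results <- c(TRUE, results)   # count the observed data as one extra resample
pval <- mean(results)
print(pval)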
## [1] 0.003
With a two-sided p-value of 0.003, there is a reasonable case for rejecting the null hypothesis.
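The two-sided comparison `perms > observed_stat | perms < -observed_stat` only behaves as intended when `observed_stat` is positive. A minimal sketch of a sign-agnostic alternative that compares magnitudes instead, using made-up data in which the observed difference is negative (the data and the names `pval_signed` and `pval_abs` are illustrative assumptions, and `>=` is used here rather than `>`):

```r
set.seed(1234)
# Made-up data, purely illustrative: group 1 has the smaller mean
group1 <- c(7, 9, 10, 8, 11, 6, 9, 10)
group2 <- c(12, 14, 11, 15, 13, 12, 16, 14)
pooled <- c(group1, group2)
m <- length(pooled)
observed_stat <- mean(group1) - mean(group2)   # negative here

n <- 10^3 - 1
perms <- replicate(n, {
  x <- sample(pooled, m)   # random permutation of the pooled values
  mean(x[1:(m/2)]) - mean(x[(m/2 + 1):m])
})

# Signed comparison: with a negative observed_stat it is TRUE for every resample
pval_signed <- mean(c(TRUE, perms > observed_stat | perms < -observed_stat))
# Comparing magnitudes works for either sign of observed_stat
pval_abs <- mean(c(TRUE, abs(perms) >= abs(observed_stat)))

print(c(signed = pval_signed, abs = pval_abs))
```

With a negative observed statistic the signed version returns a p-value of 1 regardless of the data, while the magnitude-based version still measures how unusual the observed difference is.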
4.2.4 Test on Recidivism dataset
set.seed(1234)
Recidivism%>%group_by(Age25)%>%summarise(n=n())
## # A tibble: 3 x 2
##   Age25        n
## * <fct>    <int>
## 1 Under 25  3077
## 2 Over 25  13942
## 3 <NA>         3
# Drop the rows with missing age information
reci2 <- Recidivism[complete.cases(Recidivism$Age),]
# Proportion of re-offenders per age group; after unlist(), elements 3 and 4 hold the two proportions
observed_stats <- as_tibble(reci2)%>%group_by(Age25)%>%mutate(rec_status = Recid=="Yes")%>%summarise(mu=mean(rec_status))%>%unlist()
observed_stat <- observed_stats[3]-observed_stats[4]
realized_data <- reci2$Recid=="Yes"
# Group sizes; after unlist(), elements 3 and 4 hold the two counts
counts <- as_tibble(reci2)%>%group_by(Age25)%>%summarise(n=n())%>%unlist()
classes <- counts[3:4]
n <- 10^3 - 1   # number of resamples
perms <- replicate(n, {
  x <- sample(realized_data, sum(classes))   # random permutation of the pooled values
  mean(x[1:classes[1]])-mean(x[(classes[1]+1):sum(classes)])
})
# Two-sided: count resamples that are extreme in either direction
results <- (perms>observed_stat | perms < -observed_stat)
results <- c(TRUE, results)   # count the observed data as one extra resample
pval <- mean(results)
print(pval)
## [1] 0.001
With a two-sided p-value of 0.001, there is a reasonable case for rejecting the null hypothesis.
4.2.5 Things to keep in mind
- Resamples are not unique: The random permutations are drawn independently, so the same permutation can occur more than once. Checking for duplicates, or generating all the unique permutations, is too expensive computationally
- Add one to the numerator and denominator: Adding 1 to both the numerator and the denominator of the p-value estimate counts the original data as one extra resample
- You should perform a two-sided hypothesis test by default, unless there is a particular reason to perform a one-sided test
- Permutation procedures give a lot of flexibility. The basic procedure works for any test statistic. This means you can take a robust statistic of the sample and work with it
- Assumptions
  - The permutation test makes no distributional assumptions about the two populations
  - It is fairly robust even when the two populations under consideration have different distributions
- Why is the method called permutation? It is often the case that the class variable column is kept constant and the value column is permuted across rows.
- Decide whether the data call for a matched-pairs test or an independent two-sample test, since the two are permuted differently
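Two of the points above, the freedom to choose the test statistic and the choice between a matched-pairs and an independent test, can be sketched as follows. This is a minimal illustration under stated assumptions, not the chapter's own code: the helper names `perm_test_two_sample` and `perm_test_paired` are hypothetical, the data are made up, and the paired version uses random sign flips of the within-pair differences, which is one common way to permute matched pairs.

```r
# Hypothetical helpers, not from the chapter.

# Independent two-sample test: permute the pooled values, then apply any
# statistic `stat` (mean, median, ...) to each group.
perm_test_two_sample <- function(x, y, stat = mean, n_perm = 999) {
  pooled <- c(x, y)
  n_x <- length(x)
  observed <- stat(x) - stat(y)
  perms <- replicate(n_perm, {
    shuffled <- sample(pooled)
    stat(shuffled[seq_len(n_x)]) - stat(shuffled[-seq_len(n_x)])
  })
  # Two-sided p-value with the add-one adjustment
  (sum(abs(perms) >= abs(observed)) + 1) / (n_perm + 1)
}

# Matched pairs: under the null hypothesis the sign of each within-pair
# difference is arbitrary, so flip signs at random instead of shuffling.
perm_test_paired <- function(x, y, n_perm = 999) {
  d <- x - y
  observed <- mean(d)
  perms <- replicate(n_perm, {
    signs <- sample(c(-1, 1), length(d), replace = TRUE)
    mean(signs * d)
  })
  (sum(abs(perms) >= abs(observed)) + 1) / (n_perm + 1)
}

set.seed(1)
# Made-up measurements on the same 8 subjects before and after a treatment
before <- c(12.1, 15.3, 9.8, 20.4, 17.6, 11.2, 14.9, 18.3)
after  <- c(11.4, 14.1, 9.1, 19.0, 16.9, 10.3, 14.0, 17.2)

p_median <- perm_test_two_sample(before, after, stat = median)  # ignores pairing
p_paired <- perm_test_paired(before, after)                     # respects pairing
print(c(independent_median = p_median, paired_mean = p_paired))
```

In this made-up example every subject's value drops slightly, but the drop is small relative to the spread between subjects. The paired test picks up the consistent within-subject change, while the independent test, which ignores the pairing, does not.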