32 McNemar’s test
McNemar’s test (also known as the paired or matched chi-square test) is used to determine whether the proportions of a dichotomous dependent variable differ between two related groups. It is similar to the paired-samples t-test, but for a dichotomous rather than a continuous dependent variable. McNemar’s test is used to analyze pretest-posttest study designs (observing a categorical outcome more than once in the same patient), and it is also commonly employed in matched-pairs and case-control studies.
32.1 Research question and Hypothesis Testing
We consider the data in the asthma dataset. The dataset contains data from a survey of 86 children with asthma who attended a camp to learn how to self-manage their asthmatic episodes. The children were asked whether they knew (yes or no) how to manage their asthmatic episodes appropriately at both the start and the completion of the camp.
In other words, was there a significant change in children’s knowledge of asthma management between the beginning and the completion of the health camp?
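The research question can be written as a pair of hypotheses. The symbols below are introduced here for illustration and are not part of the original text: p_begin and p_end denote the population proportions of children who know how to manage their asthma at the start and at the end of the camp.

$$
H_0:\; p_{\text{begin}} = p_{\text{end}}
\qquad \text{versus} \qquad
H_1:\; p_{\text{begin}} \neq p_{\text{end}}
$$

Because the two proportions share the children who answered "yes" both times, the null hypothesis is equivalent to saying that the two kinds of change ("no" to "yes" and "yes" to "no") are equally likely.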
32.2 Packages we need
We need to load the following packages:
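The package list itself is not shown here. As a hedged sketch, the set below is inferred from the functions called later in this chapter (here(), read_excel(), glimpse(), convert_as_factor(), tabyl(), datasummary_crosstab(), mcnemar_test(), mcnemar.exact()); it is an assumption, not necessarily the author's original list.

```r
# Packages inferred from the functions used later in the chapter (an assumption).
pkgs <- c("here", "readxl", "dplyr", "rstatix",
          "janitor", "modelsummary", "exact2x2")

# Check which of these packages are not installed on this machine.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
not_installed <- pkgs[!installed]

if (length(not_installed) > 0) {
  message("Install first: ", paste(not_installed, collapse = ", "))
}

# Attach the packages that are available.
invisible(lapply(pkgs[installed], library, character.only = TRUE))
```

Any package reported as missing can be installed with install.packages() before continuing.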
32.3 Preparing the data
We import the data asthma in R:
library(readxl)
asthma <- read_excel(here("data", "asthma.xlsx"))
We inspect the data and the type of variables:
glimpse(asthma)
Rows: 86
Columns: 2
$ know_begin <chr> "yes", "no", "yes", "no", "no", "no", "yes", "no", "no", "y…
$ know_end <chr> "yes", "no", "no", "no", "no", "no", "yes", "yes", "yes", "…
The dataset asthma includes 86 children with asthma (rows) and 2 columns, both of character type (<chr>): know_begin and know_end. Therefore, we consider the dichotomous dependent variable asthma knowledge (yes/no) measured at two time points, know_begin and know_end.
Both measurements, know_begin and know_end, should be converted to factors (<fct>) using the convert_as_factor() function as follows:
asthma <- asthma %>%
  convert_as_factor(know_begin, know_end)

glimpse(asthma)
Rows: 86
Columns: 2
$ know_begin <fct> yes, no, yes, no, no, no, yes, no, no, yes, no, no, yes, ye…
$ know_end <fct> yes, no, no, no, no, no, yes, yes, yes, yes, yes, no, yes, …
32.4 Contingency table
We can obtain the cross-tabulation table of the two measurements for the children’s knowledge of asthma:
tb3 <- table(know_begin = asthma$know_begin, know_end = asthma$know_end)
tb3
          know_end
know_begin no yes
       no  27  29
       yes  6  24
There is a basic difference between this table and the more common two-way table. In this case, each count represents a number of pairs of observations (here, the two answers given by the same child), not a number of independent individuals.
We want to compare the proportion of children who knew how to manage asthma at the beginning with the proportion who knew how to manage asthma at the end. We can create a more informative table using functions from the janitor package to obtain total percentages and marginal totals:
total_tb2 <- asthma %>%
  tabyl(know_begin, know_end) %>%
  adorn_totals(c("row", "col")) %>%
  adorn_percentages("all") %>%
  adorn_pct_formatting(digits = 1) %>%
  adorn_ns() %>%
  adorn_title()

knitr::kable(total_tb2)
know_begin / know_end | no | yes | Total |
---|---|---|---|
no | 31.4% (27) | 33.7% (29) | 65.1% (56) |
yes | 7.0% (6) | 27.9% (24) | 34.9% (30) |
Total | 38.4% (33) | 61.6% (53) | 100.0% (86) |
We can also build the contingency table using the datasummary_crosstab() function from the modelsummary package:
modelsummary::datasummary_crosstab(know_begin ~ know_end,
                                   statistic = 1 ~ 1 + N + Percent(),
                                   data = asthma)
know_begin | | no | yes | All |
---|---|---|---|---|
no | N | 27 | 29 | 56 |
 | % | 31.4 | 33.7 | 65.1 |
yes | N | 6 | 24 | 30 |
 | % | 7.0 | 27.9 | 34.9 |
All | N | 33 | 53 | 86 |
 | % | 38.4 | 61.6 | 100.0 |
The proportion of children who knew how to manage asthma at the beginning is (6+24)/86 = 30/86 = 0.349 or 34.9%. The proportion of children who knew how to manage asthma at the end is (29+24)/86 = 53/86 = 0.616 or 61.6%.
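The two proportions can also be checked directly from the cell counts. The sketch below rebuilds the 2x2 table by hand from the counts reported above, so it runs without the data file; the object names (tb, p_begin, p_end) are illustrative only.

```r
# Rebuild the paired 2x2 table from the reported counts (filled column-wise).
tb <- matrix(c(27, 6, 29, 24), nrow = 2,
             dimnames = list(know_begin = c("no", "yes"),
                             know_end   = c("no", "yes")))

n <- sum(tb)                        # number of pairs (children)

# Marginal proportions of "yes" at each time point
p_begin <- sum(tb["yes", ]) / n     # row margin: (6 + 24) / 86
p_end   <- sum(tb[, "yes"]) / n     # column margin: (29 + 24) / 86

# The change in proportions depends only on the discordant cells
diff_margins    <- p_end - p_begin
diff_discordant <- (tb["no", "yes"] - tb["yes", "no"]) / n

round(c(p_begin = p_begin, p_end = p_end, difference = diff_margins), 3)
```

The concordant cell (24 children answering "yes" both times) appears in both margins and cancels in the difference, which is why the comparison rests on the 29 and 6 discordant pairs.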
32.5 Run McNemar’s test
Finally, we run the McNemar’s test:
mcnemar.test(tb3)
McNemar's Chi-squared test with continuity correction
data: tb3
McNemar's chi-squared = 13.829, df = 1, p-value = 0.0002003
The same test is available as mcnemar_test() in the rstatix package:
mcnemar_test(tb3)
# A tibble: 1 × 6
      n statistic    df      p p.signif method
* <int>     <dbl> <dbl>  <dbl> <chr>    <chr>
1    86      13.8     1 0.0002 ***      McNemar test
The proportion of children who knew how to manage asthma at the end (61.6%) is significantly larger than the proportion who knew how to manage asthma at the beginning (34.9%) (p-value < 0.001).
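The chi-squared statistic reported above can be reproduced by hand from the two discordant cells. This is a minimal sketch of the continuity-corrected formula; the variable names are illustrative and not part of the chapter.

```r
# Discordant cell counts taken from the cross-tabulation above.
n_no_yes <- 29  # "no" at the beginning, "yes" at the end
n_yes_no <- 6   # "yes" at the beginning, "no" at the end

# McNemar statistic with continuity correction: (|b - c| - 1)^2 / (b + c)
stat <- (abs(n_no_yes - n_yes_no) - 1)^2 / (n_no_yes + n_yes_no)

# Upper-tail p-value from the chi-squared distribution with 1 degree of freedom
p_value <- pchisq(stat, df = 1, lower.tail = FALSE)

round(stat, 3)      # 13.829, matching the mcnemar.test() output
signif(p_value, 4)
```

The concordant cells (27 and 24) do not enter the calculation at all.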
32.6 Exact binomial test
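When the sum of the discordant cells in the 2x2 table is less than 25, the exact binomial version of the test is preferred. We can run it with the mcnemar.exact() function from the exact2x2 package: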
mcnemar.exact(tb3)
Exact McNemar test (with central confidence intervals)
data: tb3
b = 29, c = 6, p-value = 0.0001168
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
1.971783 14.238838
sample estimates:
odds ratio
4.833333
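The exact test amounts to a binomial test on the discordant pairs: under the null hypothesis, each discordant pair is equally likely to be a "no" to "yes" change or a "yes" to "no" change. The sketch below uses base R's binom.test() rather than exact2x2, and the variable names are illustrative. The step that converts the confidence interval to the odds scale is an assumption about how the central interval is obtained, not something stated in the chapter.

```r
# Discordant cell counts from the table above.
n_no_yes <- 29  # "no" at the beginning, "yes" at the end
n_yes_no <- 6   # "yes" at the beginning, "no" at the end

# Two-sided exact binomial test of 29 "successes" in 35 discordant pairs
bt <- binom.test(n_no_yes, n_no_yes + n_yes_no, p = 0.5)
p_exact <- bt$p.value

# Conditional odds ratio estimate: ratio of the discordant counts
or_hat <- n_no_yes / n_yes_no

# Assumption: map the Clopper-Pearson interval for the proportion
# to the odds scale via p / (1 - p).
ci_prop <- as.numeric(bt$conf.int)
or_ci <- ci_prop / (1 - ci_prop)

signif(p_exact, 4)
round(or_hat, 6)
or_ci
```

The p-value and the odds ratio can be compared with the mcnemar.exact() output above.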