36 Measures of diagnostic test accuracy
When we have finished this Chapter, we should be able to:
36.1 Research questions
To estimate the diagnostic accuracy of digital mammography (index test) in the detection of breast cancer, using histopathology as a “gold standard” in women aged over 40 years, who are undergoing mammography for the evaluation of different symptoms related to breast diseases.
To estimate the post-test probability of breast cancer when the digital mammography is positive or negative given a pre-test probability.
36.2 Packages we need
We need to load the following packages:
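The package list itself is not shown here. Judging from the functions called later in the chapter, a plausible minimal set is sketched below; this is an assumption inferred from where each function lives, not necessarily the original list.

```r
# Assumed package set, inferred from the functions used later in this chapter
pkgs <- c(
  "epiR",          # epi.tests(), epi.nomogram()
  "pubh",          # diag_test2()
  "TeachingDemos"  # fagan.plot()
)

# Report any package that is not installed instead of failing outright
missing_pkgs <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
}

# Attach the packages that are available
for (p in setdiff(pkgs, missing_pkgs)) {
  library(p, character.only = TRUE)
}
```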
36.3 Contingency 2x2 table
Generally, an individual’s disease status is a dichotomous variable; the individual either has the disease (\(Outcome+\)) or does not have the disease (\(Outcome-\)), as defined by the reference standard (or “gold” standard). The diagnostic test under evaluation (also called the index test) can be measured as a dichotomous variable (e.g. presence or absence of breast abnormalities on an X-ray) or as a continuous variable (e.g. fasting glucose level for diabetes diagnosis) that is transformed into a dichotomous variable by choosing a cut-off value (threshold) that distinguishes positive (\(Test+\)) from negative (\(Test-\)) test results.
If the index test gives a dichotomous result for each participant in a study, the data can be tabulated in a 2 x 2 table of test result (\(Test+\), \(Test-\)) versus “true” disease status (\(Outcome+\), \(Outcome-\)). For example, the results of digital mammography for diagnosing breast cancer, compared with the “gold” standard of biopsy/surgery and histopathology in 1220 women with suspected breast cancer, are as follows:
| | | Outcome according to the reference standard | | |
---|---|---|---|---|
| | | \(Outcome+\) (Disease present) | \(Outcome-\) (Disease absent) | Totals |
| Index test result | \(Test+\) | TP=890 | FP=110 | TP+FP=1000 |
| | \(Test-\) | FN=20 | TN=200 | TN+FN=220 |
| Totals | | TP+FN=910 | TN+FP=310 | N=1220 (TP+TN+FP+FN) |
where
TP = true positive; test positive and disease present (Test+ ∩ Outcome+)
FP = false positive; test positive and disease absent (Test+ ∩ Outcome-)
FN = false negative; test negative and disease present (Test- ∩ Outcome+)
TN = true negative; test negative and disease absent (Test- ∩ Outcome-)
36.4 Diagnostic Accuracy Measures
Basic Diagnostic Accuracy Measures
The Sensitivity (Se) of a diagnostic test refers to the ability of the test to correctly identify those individuals with the disease. It is defined as the proportion of true positive test results among individuals who have the disease.
Se = \(\frac{TP}{TP+FN}=\frac{890}{910}=0.978\) or \(97.8\%\)
The Specificity (Sp) of a diagnostic test refers to the ability of the test to correctly identify those patients without the disease. It is defined as the proportion of true negative test results among individuals who do not have the disease.
Sp = \(\frac{TN}{TN+FP}=\frac{200}{310}=0.645\) or \(64.5\%\)
Positive Predictive Value (PPV) is the probability that individuals with a positive diagnostic test result actually have the disease. It is defined as the proportion of true positive test results among individuals who have a positive test.
PPV = \(\frac{TP}{TP+FP}=\frac{890}{1000}=0.890\) or \(89.0\%\)
Negative Predictive Value (NPV) is the probability that individuals with a negative diagnostic test result are truly free from the disease. It is defined as the proportion of true negative test results among individuals who have a negative test.
NPV = \(\frac{TN}{TN+FN}=\frac{200}{220}=0.909\) or \(90.9\%\)
Positive and negative predictive values are influenced by the prevalence of disease in the population that is being tested. Using the same test in a population with a higher prevalence (e.g. women over the age of 55) increases positive predictive value. Conversely, increased prevalence results in decreased negative predictive value. Therefore, when considering predictive values of diagnostic or screening tests, we should take into account the influence of the prevalence of the disease.
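To make the dependence on prevalence concrete, the sketch below recomputes PPV and NPV from sensitivity and specificity via Bayes' theorem at a few prevalence values. The function name `predictive_values` and the prevalences 0.10 and 0.50 are illustrative choices, not part of the original example; the third value is the prevalence observed in the study table.

```r
# PPV and NPV as functions of sensitivity, specificity and prevalence
# (hypothetical helper, written for illustration only)
predictive_values <- function(se, sp, prev) {
  ppv <- (se * prev) / (se * prev + (1 - sp) * (1 - prev))
  npv <- (sp * (1 - prev)) / (sp * (1 - prev) + (1 - se) * prev)
  data.frame(prevalence = round(prev, 3),
             PPV = round(ppv, 3),
             NPV = round(npv, 3))
}

se <- 890 / 910  # sensitivity from the 2 x 2 table
sp <- 200 / 310  # specificity from the 2 x 2 table

# Two illustrative prevalences plus the one observed in the study (910/1220)
res <- predictive_values(se, sp, prev = c(0.10, 0.50, 910 / 1220))
res
```

At the study prevalence the function returns the PPV and NPV computed above (0.890 and 0.909); at lower prevalence the PPV falls and the NPV rises.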
Other useful diagnostic measures are as follows:
More Diagnostic Accuracy Measures
Apparent prevalence is the proportion of individuals with a positive test result.
Apparent prevalence = \(\frac{TP + FP}{N}=\frac{1000}{1220}=0.820\) or \(82.0\%\)
True prevalence is the proportion of individuals that are truly diseased.
True prevalence = \(\frac{TP + FN}{N}=\frac{910}{1220}=0.746\) or \(74.6\%\)
The Likelihood ratio for a positive test result (LR+) is the likelihood (probability) of an individual who has the disease testing positive divided by the likelihood (probability) of an individual who does not have the disease testing positive. It is calculated as sensitivity divided by 1 minus the specificity value.
LR+ = \(\frac{Se}{1-Sp}=\frac{0.978}{1-0.645}=\frac{0.978}{0.355}= 2.755\)
The Likelihood ratio for a negative test result (LR-) is the probability of an individual who has the disease testing negative divided by the probability of an individual who does not have the disease testing negative. It is calculated as 1 minus the sensitivity divided by specificity value.
LR- = \(\frac{1-Se}{Sp}=\frac{1-0.978}{0.645}=\frac{0.022}{0.645}= 0.034\)
Diagnostic accuracy (effectiveness) is expressed as the proportion of correctly classified subjects (TP+TN) among all subjects (N). Diagnostic accuracy is affected by the disease prevalence.
Accuracy = \(\frac{TP + TN}{N}=\frac{890 + 200}{1220}=\frac{1090}{1220}= 0.893\) or \(89.3\%\)
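The hand calculations above can be checked with plain arithmetic before turning to dedicated functions. The following is a minimal base R sketch; the variable names are illustrative.

```r
# Cell counts from the 2 x 2 table
TP <- 890; FP <- 110; FN <- 20; TN <- 200
N  <- TP + FP + FN + TN

se <- TP / (TP + FN)  # sensitivity
sp <- TN / (TN + FP)  # specificity

measures <- c(
  sensitivity         = se,
  specificity         = sp,
  PPV                 = TP / (TP + FP),
  NPV                 = TN / (TN + FN),
  apparent_prevalence = (TP + FP) / N,
  true_prevalence     = (TP + FN) / N,
  LR_pos              = se / (1 - sp),
  LR_neg              = (1 - se) / sp,
  accuracy            = (TP + TN) / N
)
round(measures, 3)
```

Because this sketch uses unrounded sensitivity and specificity, LR+ comes out as 2.756 rather than the 2.755 obtained from the rounded values 0.978 and 0.355.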
In R:
tb1 <- as.table(
  rbind(c(890, 110), c(20, 200))
)
dimnames(tb1) <- list(
  Test = c("Test +", "Test -"),
  Outcome = c("Outcome +", "Outcome -")
)
tb1
Outcome
Test Outcome + Outcome -
Test + 890 110
Test - 20 200
From this contingency table, we can create a basic mosaic plot.
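The plotting code is not shown here. One way to draw such a plot is base R's `mosaicplot()`; this is an assumption about the approach, and the title and colours are illustrative.

```r
# Rebuild the contingency table so the block is self-contained
tb1 <- as.table(rbind(c(890, 110), c(20, 200)))
dimnames(tb1) <- list(
  Test = c("Test +", "Test -"),
  Outcome = c("Outcome +", "Outcome -")
)

# Tile areas are proportional to the cell counts
mosaicplot(tb1,
           main = "Digital mammography vs. reference standard",
           color = c("steelblue", "grey80"))
```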
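The epi.tests() function from the {epiR} package computes the measures above together with their 95% confidence intervals:
epi.tests(tb1, digits = 3)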
Outcome + Outcome - Total
Test + 890 110 1000
Test - 20 200 220
Total 910 310 1220
Point estimates and 95% CIs:
--------------------------------------------------------------
Apparent prevalence * 0.820 (0.797, 0.841)
True prevalence * 0.746 (0.720, 0.770)
Sensitivity * 0.978 (0.966, 0.987)
Specificity * 0.645 (0.589, 0.698)
Positive predictive value * 0.890 (0.869, 0.909)
Negative predictive value * 0.909 (0.863, 0.944)
Positive likelihood ratio 2.756 (2.371, 3.204)
Negative likelihood ratio 0.034 (0.022, 0.053)
False T+ proportion for true D- * 0.355 (0.302, 0.411)
False T- proportion for true D+ * 0.022 (0.013, 0.034)
False T+ proportion for T+ * 0.110 (0.091, 0.131)
False T- proportion for T- * 0.091 (0.056, 0.137)
Correctly classified proportion * 0.893 (0.875, 0.910)
--------------------------------------------------------------
* Exact CIs
Alternatively, we can obtain the same results in R using the diag_test2() function from the {pubh} package:
diag_test2(890, 110, 20, 200)
36.5 Likelihood ratios in practice
36.5.1 Interpretation of LRs
Likelihood ratios (LRs), along with sensitivity and specificity, can be considered properties of the test itself that do not change with the prevalence of disease.
In our example, LR+ = 2.756, meaning that a positive result on digital mammography is approximately 2.8 times as likely in a woman with breast cancer as in a woman without breast cancer.
Similarly, LR- = 0.034, meaning that a negative result on digital mammography is approximately 0.034 times as likely in a woman with breast cancer as in a woman without breast cancer. This can also be interpreted as: a woman without breast cancer is about 29.4 (= 1/0.034) times more likely to have a negative digital mammography test than a woman with breast cancer.
In clinical practice, a higher LR+ is desirable for tests used to “rule in” a disease, while a lower LR- is preferred for tests used to “rule out” the chance that the individual has the disease.
36.5.2 Application of LRs (revising the probability of disease)
The LR is commonly used in decision-making based on Bayes’ Theorem (Chapter 13). The pre-test odds of a particular diagnosis, multiplied by the likelihood ratio of the diagnostic test, give the post-test odds.
\(post \text{-} test \ odds = likelihood \ ratio \times pre \text{-} test \ odds\)
These post-test odds provide an updated estimate of the odds that the patient has the condition or disease after taking into account the diagnostic test result. If the test result is positive, we use the LR+ for this calculation. If the test result is negative, we use the LR- instead. In both scenarios, the odds refer to the odds in favor of the disease being present.
Example: LR+
LR+ tells us how much the odds of the condition or disease being present increase given a positive test result.
LR+ greater than 1: Increases the post-test odds.
LR+ of 1: No change in the post-test odds (post-test odds equals pre-test odds).
Now, let’s suppose that, based on family history of breast cancer and clinical symptoms, a woman has a 0.78 probability of breast cancer (pre-test probability).
We are interested in the post-test probability of breast cancer when the digital mammography is positive. It is important to note that “odds” and “probability” are not the same; however, they can be derived from each other as follows:
\(pre \text{-} test \ odds = \frac{pre \text{-} test \ probability}{1 - pre \text{-} test \ probability} = \frac{0.78}{1-0.78} = \frac{0.78}{0.22} = 3.55\)
\(post \text{-} test \ odds = LR\text{+} \times pre \text{-} test \ odds = 2.756 \times 3.55 = 9.78\)
\(post \text{-} test \ probability = \frac{post \text{-} test \ odds}{post \text{-} test \ odds + 1} = \frac{9.78}{9.78 + 1} = \frac{9.78}{10.78} = 0.9072 \ or \ 90.72\%\)
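The three steps above (probability to odds, multiply by the LR, odds back to probability) can be wrapped in a small helper. This is a minimal sketch; `post_test_prob` is a hypothetical name introduced for illustration, not a function from any package.

```r
# Convert a pre-test probability into a post-test probability
# given a likelihood ratio (hypothetical helper)
post_test_prob <- function(pre_prob, lr) {
  pre_odds  <- pre_prob / (1 - pre_prob)  # probability -> odds
  post_odds <- lr * pre_odds              # update the odds with the LR
  post_odds / (1 + post_odds)             # odds -> probability
}

# Positive mammography, pre-test probability 0.78, LR+ = 2.756
p_pos <- post_test_prob(pre_prob = 0.78, lr = 2.756)
round(p_pos, 3)
```

The result is about 0.907, in line with the hand calculation; small differences in later decimals arise because the hand calculation rounds the pre-test odds to 3.55. The same function applies to a negative result by passing LR- instead of LR+.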
The Fagan nomogram allows us to turn pre-test probabilities into post-test probabilities without converting to odds. The nomogram typically consists of three parallel scales representing the pre-test probability, the likelihood ratio, and the post-test probability. We can visually estimate the post-test probability after a positive test result by drawing a straight line from the known pre-test probability through the LR+ and reading off the post-test probability.
In R:
fagan.plot(probs.pre.test = 0.78, LR = 2.756)
Example: LR-
LR- tells us how much the odds of the condition or disease being present decrease given a negative test result.
LR- less than 1: Decreases the post-test odds.
LR- of 1: No change in the post-test odds (post-test odds equals pre-test odds).
What is the post-test probability of breast cancer for the same woman when the digital mammography is negative?
\(post \text{-} test \ odds = LR\text{-} \times pre \text{-} test \ odds = 0.034 \times 3.55 = 0.1207\)
\(post \text{-} test \ probability = \frac{post \text{-} test \ odds}{post \text{-} test \ odds + 1} = \frac{0.1207}{0.1207 + 1} = \frac{0.1207}{1.1207} = 0.1077 \ or \ 10.77\%\)
In R:
fagan.plot(probs.pre.test = 0.78, LR = 0.034)
We can also use the epi.nomogram() function from {epiR} and the standalone nomogrammer() function, which calculate the post-test probabilities and plot Fagan’s nomograms with ggplot2, respectively:
epi.nomogram(se = NA, sp = NA, lr = c(2.755, 0.034), pre.pos = 0.78,
             verbose = FALSE)
Given a positive test result, the post-test probability of being outcome positive is 0.91
Given a negative test result, the post-test probability of being outcome positive is 0.11
# nomogrammer is a standalone function from the below repository
source("https://raw.githubusercontent.com/achekroud/nomogrammer/master/nomogrammer.r")
nomogrammer(Prevalence = 0.78,
            Plr = 2.755,
            Nlr = 0.034,
            Detail = TRUE,
            NullLine = TRUE)
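What the nomogram does graphically can also be tabulated. The sketch below applies the odds update to a range of pre-test probabilities for both likelihood ratios. The helper `to_post` and every pre-test value other than 0.78 are illustrative assumptions, not taken from the example.

```r
lr_pos <- 2.756  # LR+ from the mammography example
lr_neg <- 0.034  # LR- from the mammography example

# Odds-based update from pre-test to post-test probability (hypothetical helper)
to_post <- function(pre_prob, lr) {
  post_odds <- lr * pre_prob / (1 - pre_prob)
  post_odds / (1 + post_odds)
}

pre <- c(0.10, 0.25, 0.50, 0.78, 0.90)  # illustrative pre-test probabilities
nomogram_tab <- data.frame(
  pre_test         = pre,
  post_if_positive = round(to_post(pre, lr_pos), 3),
  post_if_negative = round(to_post(pre, lr_neg), 3)
)
nomogram_tab
```

The row for a pre-test probability of 0.78 gives roughly 0.91 after a positive test and 0.11 after a negative test, matching the epi.nomogram() output above.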