Null hypothesis testing (often described simply as hypothesis testing) is very commonly used in epidemiological investigations, and may be used both in analytic studies (for example, to assess whether disease experience differs between different exposure groups) and in descriptive studies (for example, to assess whether disease experience differs from some suspected value). However, despite its widespread use, the results of hypothesis tests are often misinterpreted.
Hypothesis testing and study power
Approach to hypothesis testing
Hypothesis testing is commonly used in analytic epidemiological studies, and provides a systematic method of analysing data in order to draw conclusions about the population(s) from which these data were drawn. A common example is when samples are taken from two different populations and a test is performed in order to assess whether some outcome of interest (such as prevalence of infection) differs between the two groups. As it is not possible to definitively state whether or not there is a difference between these two groups (since not all members of the groups are sampled), statistical methods are used to assess the strength of evidence in favour of or against a difference.
The approach to hypothesis testing first requires making the assumption that there is no difference between the two groups (which is known as the null hypothesis). Statistical methods are then employed in order to evaluate the probability that a difference at least as large as that observed would be seen if the null hypothesis were correct (known as the p-value). Based on the resultant p-value, a decision can be made as to whether the support for the null hypothesis is sufficiently low so as to give evidence against it being correct. It is important to note, however, that the null hypothesis can never be completely disproved based on a sample - only evidence can be gained in support of or against it. However, based on this evidence, investigators will often come to a conclusion that the null hypothesis is either 'accepted' or 'rejected'.
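The approach described above can be sketched in code. The example below uses a two-sample z-test for a difference in proportions (one common choice for comparing prevalence between two groups; other tests, such as the chi-squared test, are also widely used). The survey figures (30/100 infected in one group, 45/100 in the other) are hypothetical, chosen purely for illustration.

```python
import math

def two_proportion_test(x1, n1, x2, n2):
    """Two-sided z-test for a difference between two proportions,
    using the normal approximation with a pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)            # prevalence assuming no difference
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value: probability of a z-statistic at least this
    # extreme under the null hypothesis (standard normal distribution)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical survey: 30/100 infected in group A, 45/100 in group B
z, p = two_proportion_test(30, 100, 45, 100)
print(f"z = {z:.2f}, p = {p:.3f}")
```

Here the p-value falls below the conventional 0.05 threshold, which would typically be reported as evidence against the null hypothesis of equal prevalence in the two groups.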
In any hypothesis test, there is a risk that an incorrect conclusion is made - which will take the form of either a type I or a type II error, as described below. Note that no single hypothesis test can be affected by both type I and type II errors, as they are each based on different assumptions regarding the source population. However, as the true state of the source population will not be known, both types of error should be considered when interpreting a hypothesis test (and when calculating the required sample size).
Type I error
This type of error refers to the situation where it is concluded that a difference between the two groups exists, when in fact it does not. The probability of a type I error is often denoted by the symbol α. As this type of error relates to the situation in which the null hypothesis is correct, it is associated with the p-value given in a hypothesis test, and the threshold for 'significance' is often set at 0.05. This means that there is a 5% chance of a type I error, which in the case of hypothesis testing is interpreted as: 'if the null hypothesis were correct, we would expect to see a difference this large or larger only 5% of the time - meaning that there is evidence against the null hypothesis being correct'.
Type II error
This type of error refers to the situation where it is concluded that no difference between the two groups exists, when in fact it does. The probability of a type II error is often denoted by the symbol β. As the 'power' of a study is defined as the probability of detecting a difference when it truly exists, it can be calculated as (1 - β).
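Power, and hence β, can also be estimated by simulation: draw repeated samples from two populations that genuinely differ, and count how often the test detects the difference. The figures below (true prevalences of 30% and 45%, 100 subjects per group) are hypothetical assumptions for illustration; a two-proportion z-test is used as the example test.

```python
import math
import random

def two_proportion_p(x1, n1, x2, n2):
    """Two-sided p-value for a difference between two proportions."""
    pooled = (x1 + x2) / (n1 + n2)
    if pooled in (0, 1):
        return 1.0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    return math.erfc(abs(z) / math.sqrt(2))

random.seed(2)
trials, alpha, n = 2000, 0.05, 100
prev_a, prev_b = 0.30, 0.45           # assumed true prevalences: a real difference exists
rejections = 0
for _ in range(trials):
    x1 = sum(random.random() < prev_a for _ in range(n))
    x2 = sum(random.random() < prev_b for _ in range(n))
    if two_proportion_p(x1, n, x2, n) < alpha:
        rejections += 1               # the difference was correctly detected

power = rejections / trials           # estimate of (1 - beta)
print(f"Estimated power: {power:.2f}  (so beta is roughly {1 - power:.2f})")
```

With these assumed figures the estimated power comes out well below the 80-90% conventionally sought in study design, illustrating why sample size calculations must consider β as well as α.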