Hypothesis testing
Null hypothesis testing (often described just as 'hypothesis testing') is very commonly used in epidemiological investigations, and may be used both in analytic studies (for example, assessing whether disease experience differs between different exposure groups) and in descriptive studies (for example, assessing whether disease experience differs from some suspected value). As, in most studies, only a sample of individuals is taken, it is not possible to state definitively whether or not there is a difference between the two exposure groups. Hypothesis tests provide a method of assessing the strength of evidence for or against a true difference in the underlying population. However, despite their widespread use, the results of hypothesis tests are often misinterpreted.
Concept behind null hypothesis testing
Hypothesis tests provide a systematic, objective method of data analysis, but do not actually answer the main question of interest (which is commonly along the lines of 'is there a difference in disease experience between individuals with exposure x and individuals without exposure x?'). Rather, hypothesis tests answer the question 'if there is no difference in disease experience between individuals with or without exposure x, what is the probability of obtaining the current data (or data more 'extreme' than this)?' As such, hypothesis tests do not inform us whether or not there is a difference; instead, they offer varying degrees of evidence for or against a situation where there is no difference in the population under investigation. This situation of 'no difference' is known as the 'null hypothesis', and should be stated whenever a null hypothesis test is performed. If there is little evidence against the null hypothesis, it is not rejected, although this does not prove it to be true.
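As an illustration of this idea, the sketch below estimates a p-value by permutation: it asks how often, if exposure made no difference, a risk difference at least as extreme as the one observed would arise by chance alone. The disease counts and group sizes are invented for demonstration only.

```python
# A minimal sketch (with made-up data) of what a p-value measures: the
# probability, under the null hypothesis of no difference, of obtaining a
# difference at least as extreme as the one seen in the sample.
import random

random.seed(1)

# Hypothetical data: disease status (1 = diseased) in two exposure groups.
exposed = [1] * 30 + [0] * 70      # 30/100 diseased
unexposed = [1] * 18 + [0] * 82    # 18/100 diseased

def risk_difference(a, b):
    return sum(a) / len(a) - sum(b) / len(b)

observed = risk_difference(exposed, unexposed)

# Under the null hypothesis, exposure labels are exchangeable, so we
# repeatedly shuffle them and record how often the shuffled difference
# is at least as extreme as the observed one.
pooled = exposed + unexposed
n_extreme = 0
n_perm = 10_000
for _ in range(n_perm):
    random.shuffle(pooled)
    diff = risk_difference(pooled[:100], pooled[100:])
    if abs(diff) >= abs(observed):
        n_extreme += 1

p_value = n_extreme / n_perm
print(f"observed risk difference = {observed:.2f}, p ~= {p_value:.3f}")
```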
Hypothesis testing and study power
Approach to hypothesis testing
The approach to hypothesis testing first requires making the assumption that there is no difference between the two groups (this assumption is the null hypothesis). Statistical methods are then employed to evaluate the probability of observing data as extreme as (or more extreme than) the current data if the null hypothesis were correct (this probability is the p-value). Based on the resultant p-value, a decision can be made as to whether the support for the null hypothesis is sufficiently low to give evidence against it being correct, as sketched below. It is important to note, however, that the null hypothesis can never be completely disproved based on a sample - only evidence for or against it can be gained. Nonetheless, based on this evidence, investigators will often come to the conclusion that the null hypothesis is either 'accepted' or 'rejected'.
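The following sketch shows this workflow on a hypothetical 2x2 table of exposure and disease counts, using a chi-squared test from scipy; the counts and the choice of test are illustrative assumptions, not part of the text above.

```python
# State the null hypothesis (no association between exposure and disease),
# compute a p-value, and compare it with a pre-chosen significance level.
from scipy.stats import chi2_contingency

# Rows: exposed / unexposed; columns: diseased / not diseased (made-up counts).
table = [[30, 70],
         [18, 82]]

chi2, p, dof, expected = chi2_contingency(table)

alpha = 0.05  # significance level chosen before examining the data
print(f"chi-squared = {chi2:.2f}, p = {p:.3f}")
if p < alpha:
    print("Evidence against the null hypothesis: reject it at the 5% level.")
else:
    print("Insufficient evidence against the null hypothesis: do not reject.")
```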
In any hypothesis test, there is a risk of reaching an incorrect conclusion, which will take the form of either a type I or a type II error, as described below. Note that no single hypothesis test can be affected by both types of error, since each is defined under a different assumption about the source population: a type I error can only occur if the null hypothesis is in fact true, and a type II error only if it is in fact false. However, as the true state of the source population will not be known, both types of error should be considered when interpreting a hypothesis test (and when calculating the required sample size).
Type I error
This type of error refers to the situation where it is concluded that a difference between the two groups exists when in fact it does not. The probability of a type I error is often denoted by the symbol α. As this type of error can only occur in a situation in which the null hypothesis is correct, it is associated with the p-value given in a hypothesis test, and α is commonly set at 0.05 as the threshold for 'significance'. This means that there is a 5% chance of a type I error - that is, 'if the null hypothesis were correct, we would expect to see this difference or greater only 5% of the time, meaning that there is some evidence against the null hypothesis being correct'.
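A brief simulation can make this concrete: if the null hypothesis is made true by construction (both groups drawn from the same distribution), a test at α = 0.05 should declare 'significance' in roughly 5% of repeated studies. The normally distributed data and t-test below are assumptions chosen for simplicity.

```python
# Simulation of the type I error rate: when the null hypothesis really is
# true, a test at alpha = 0.05 should (wrongly) reject in about 5% of studies.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
alpha = 0.05
n_studies = 5_000

false_positives = 0
for _ in range(n_studies):
    # Both groups drawn from the same distribution: the null is true.
    a = rng.normal(0, 1, size=50)
    b = rng.normal(0, 1, size=50)
    if ttest_ind(a, b).pvalue < alpha:
        false_positives += 1

print(f"type I error rate ~= {false_positives / n_studies:.3f}")  # near 0.05
```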
Type II error
This type of error refers to the situation where it is concluded that no difference between the two groups exists when in fact it does. The probability of a type II error is often denoted by the symbol β. As the 'power' of a study is defined as the probability of detecting a difference when one truly exists, it can be calculated as (1 - β).
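The companion simulation below estimates power directly: a true difference is built into the data, and the proportion of studies in which the test detects it approximates 1 - β. The effect size, sample size, and choice of t-test are illustrative assumptions.

```python
# Simulate studies in which a real difference exists, and count how often
# the test detects it; the detection rate estimates power = 1 - beta.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
alpha = 0.05
n_studies = 5_000

detections = 0
for _ in range(n_studies):
    # True difference of 0.5 standard deviations between the group means.
    a = rng.normal(0.0, 1, size=50)
    b = rng.normal(0.5, 1, size=50)
    if ttest_ind(a, b).pvalue < alpha:
        detections += 1

power = detections / n_studies
print(f"estimated power ~= {power:.2f}, so beta ~= {1 - power:.2f}")
```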