===Limitations of null hypothesis tests===
One considerable limitation of hypothesis testing is described above: hypothesis tests do not directly answer the main question of interest (whether or not there is a true difference in the population), but only provide degrees of evidence for or against there being no true difference. Another limitation is that there will almost always be a difference of some magnitude between two groups, even if it is of no practical relevance. Consider a cohort study in which 1 million nondiseased animals are followed up to see whether exposure to substance x is associated with disease. It may be that, in this whole population of 1 million animals, 10.0% of exposed individuals develop the disease and 9.9% of unexposed individuals develop the disease. This difference is of no '''biological relevance''', and yet a difference does exist (as this is a whole population rather than a sample, we would not conduct a hypothesis test).

As the size of a sample increases, so does the ability to detect a true difference. Since there will be a 'true difference' (however small) in most populations, hypothesis tests on large samples will tend to give low p-values (indeed, some statisticians view hypothesis testing as a method of determining whether or not the sample size is sufficient to detect a difference). This problem can be reduced by ensuring that an appropriate measure of effect is always presented alongside the hypothesis test p-value. In the example above, the incidence risk of disease amongst exposed individuals was 0.100 and that amongst unexposed individuals was 0.099, giving a risk ratio of 0.100/0.099 = 1.01. Therefore, regardless of the result of any hypothesis test, there is very little association between exposure and disease in this case.
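As a rough illustration of this point, here is a minimal Python sketch (not part of the original article; the equal split between exposed and unexposed animals and the use of a chi-squared test are assumptions made purely for illustration). It shows the risk ratio staying fixed at about 1.01 while the p-value shrinks as the groups grow:

<syntaxhighlight lang="python">
# Minimal sketch (illustrative assumptions: equal-sized exposed and
# unexposed groups; chi-squared test of a 2x2 table).
from scipy.stats import chi2_contingency

risk_exposed, risk_unexposed = 0.100, 0.099   # incidence risks from the text
risk_ratio = risk_exposed / risk_unexposed    # ~1.01, whatever the sample size

for n_per_group in (10_000, 100_000, 1_000_000, 10_000_000):
    diseased_exp = round(risk_exposed * n_per_group)
    diseased_unexp = round(risk_unexposed * n_per_group)
    # 2x2 table: rows = exposed/unexposed, columns = diseased/not diseased
    table = [[diseased_exp, n_per_group - diseased_exp],
             [diseased_unexp, n_per_group - diseased_unexp]]
    chi2, p, dof, expected = chi2_contingency(table)
    print(f"n per group = {n_per_group:>10,}:  risk ratio = {risk_ratio:.2f},  p = {p:.3g}")
</syntaxhighlight>

With these assumed numbers, the p-value falls below 0.05 by the time each group reaches a million animals, even though the risk ratio never moves from roughly 1.01 - exactly the situation in which reporting the measure of effect guards against over-interpreting a 'significant' result.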
===Errors in hypothesis testing===
In any hypothesis test, there is a risk that an incorrect conclusion will be made, taking the form of either a type I or a type II error, as described below. Note that no single hypothesis test can be affected by both types of error, as each is based on a different assumption regarding the source population. However, as the true state of the source population will not be known, both types of error should be considered when interpreting a hypothesis test (and when calculating the required [[Sampling strategies#Sample size calculation|sample size]]).
====Type I error====
This type of error refers to the situation where it is concluded that a difference between the two groups exists when in fact it does not. The probability of a type I error is often denoted by the symbol α. As this type of error arises when the null hypothesis is actually correct, it is directly associated with the p-value given by a hypothesis test, for which a cut-off of 0.05 is often used to indicate 'significance'. This cut-off means that there is a 5% chance of a type I error: if the null hypothesis were correct, we would expect to see a difference of this magnitude or greater only 5% of the time, meaning that there is only weak evidence against the null hypothesis being correct.
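The link between α and the significance cut-off can be checked by simulation. The sketch below is not from the original article; the two-sample t-test and the normally distributed data are assumptions chosen purely for illustration:

<syntaxhighlight lang="python">
# Minimal sketch: when the null hypothesis is TRUE (both groups drawn from
# the same distribution), p < 0.05 should occur in about 5% of tests.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)
n_tests, alpha = 10_000, 0.05

false_positives = 0
for _ in range(n_tests):
    a = rng.normal(loc=0.0, scale=1.0, size=30)   # no true difference:
    b = rng.normal(loc=0.0, scale=1.0, size=30)   # identical distributions
    if ttest_ind(a, b).pvalue < alpha:
        false_positives += 1                      # a type I error

print(f"Observed type I error rate: {false_positives / n_tests:.3f} (expected ~{alpha})")
</syntaxhighlight>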
====Type II error====
This type of error refers to the situation where it is concluded that no difference exists between the two groups when in fact it does. The probability of a type II error is often denoted by the symbol β. The 'power' of a study is defined as the probability of detecting a difference when one truly exists, and so can be calculated as (1 - β).
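Power can be estimated by simulation in the same spirit as the type I error sketch above (again not from the original article; the effect size of 0.5 and the group size of 30 are hypothetical choices):

<syntaxhighlight lang="python">
# Minimal sketch: when a true difference DOES exist, the proportion of
# tests that detect it estimates the power (1 - beta).
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(2)
n_tests, alpha = 10_000, 0.05
true_difference = 0.5                             # hypothetical effect size

detected = 0
for _ in range(n_tests):
    a = rng.normal(loc=0.0, scale=1.0, size=30)
    b = rng.normal(loc=true_difference, scale=1.0, size=30)
    if ttest_ind(a, b).pvalue < alpha:
        detected += 1                             # difference correctly detected

power = detected / n_tests
print(f"Estimated power (1 - beta): {power:.2f}; estimated beta: {1 - power:.2f}")
</syntaxhighlight>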
===Approach to hypothesis testing===
The approach to hypothesis testing first requires making the assumption that there is ''no difference'' between the two groups (which is known as the '''null hypothesis'''). Statistical methods are then employed in order to evaluate the probability of seeing data at least as extreme as those observed if the null hypothesis were correct (known as the '''p-value'''). Based on the resultant p-value, a decision can be made as to whether the support for the null hypothesis is sufficiently low to give evidence against it being correct. It is important to note, however, that the null hypothesis can never be completely disproved based on a sample - only evidence can be gained for or against it. Based on this evidence, investigators will often come to a conclusion that the null hypothesis is either 'accepted' or 'rejected'.<br>
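As a concrete sketch of this approach (the counts below are hypothetical and not from the original article, and Fisher's exact test is just one of many tests that could be used):

<syntaxhighlight lang="python">
# Minimal sketch: assume no difference between the groups (the null
# hypothesis), then compute the probability of observing data at least
# this extreme under that assumption (the p-value).
from scipy.stats import fisher_exact

# Hypothetical 2x2 table: rows = exposed/unexposed, columns = diseased/healthy
table = [[30, 70],
         [18, 82]]
odds_ratio, p_value = fisher_exact(table)
print(f"p = {p_value:.3f}")
# A small p-value gives evidence AGAINST the null hypothesis, but - as
# noted above - it can never completely disprove it.
</syntaxhighlight>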
 
[[Category:Veterinary Epidemiology - Statistical Methods|E]]
 