Changes

3,180 bytes added ,  07:37, 5 May 2011
no edit summary
Line 13: Line 13:     
===Study sample===
 
===Study sample===
The sample population includes those animals which are included in the final study. It is important to remember that in most epidemiological studies, we are not interested in this population ''per se'' - rather, we are interested in using this sample in order to make statements regarding the source population (and possibly the target population). Because not all members of the source population have been sampled, statistical techniques need to be applied to the results from the study group in order to estimate what the characteristics of the source population are expected to be. Due to this extrapolation, there is always a possibility that any estimates from a sample are incorrect due to [[Random variation|'''random variation''']] in the sample. Although this random variation cannot be controlled without increasing the sample size (or redefining the source population), the accuracy of the estimate can be maximised by ensuring that sources of [[Bias|'''bias''']] are minimised.  
+
The sample population includes those animals which are included in the final study. It is important to remember that in most epidemiological studies, we are not interested in this population ''per se'' - rather, we are interested in using this sample in order to make statements regarding the source population (and possibly the target population). Because not all members of the source population have been sampled, statistical techniques need to be applied to the results from the study group in order to estimate what the characteristics of the source population are expected to be. Due to this extrapolation, there is always a possibility that any estimates from a sample are incorrect due to [[Random error|'''random variation''']] in the sample. Although this random variation cannot be controlled without increasing the sample size (or redefining the source population), the accuracy of the estimate can be maximised by ensuring that sources of [[Bias|'''bias''']] are minimised.  
    
==Approaches to sampling==
 
==Approaches to sampling==
Line 48: Line 48:     
==Sample size calculation==
 
==Sample size calculation==
As mentioned earlier, it is important in any study not only that bias is minimised, but that the sample has sufficient [[Random variation#Precision|precision]] and [[Random variation#Hypothesis testing and study power|power]] (in the case of analytic studies) to answer the question(s) for which the study is intended. Both of these are closely related to the [[Random variation|random variability]] in any sample taken from a population. Although this can be reduced by increasing the sample size, a number of other considerations (usually logistical and economic considerations) will also be acting in order to reduce the number of samples which can realistically be taken. Statistical techniques are therefore available in order to calculate the required sample size. Counterintuitively, these require assumptions to be made regarding the final results of the study, as well as information regarding the required [[Random variation#Confidence intervals|level of confidence]], precision or power of the study.
+
As mentioned earlier, it is important in any study not only that bias is minimised, but that the sample has sufficient [[Random error#Precision|precision]] and [[Random error#Hypothesis testing and study power|power]] (in the case of analytic studies) to answer the question(s) for which the study is intended. Both of these are closely related to the random variability in any sample taken from a population. Although this can be reduced by increasing the sample size, a number of other considerations (usually logistical and economic considerations) will also be acting in order to reduce the number of samples which can realistically be taken. Statistical techniques are therefore available in order to calculate the required sample size. Counterintuitively, these require assumptions to be made regarding the final results of the study, as well as information regarding the required [[Random error#Confidence intervals|level of confidence]], precision or power of the study. Sample size formulae are not given here, but can be found in most statistical textbooks.
    
===Expected variation in the data===
 
===Expected variation in the data===
Line 54: Line 54:     
===Required precision===
 
===Required precision===
 +
In the case of descriptive studies, this relates to the width of the 95% confidence interval. For example, you may want to estimate the seroprevalence to Bluetongue virus to within ±10% of the true population seroprevalence, or you may want to estimate the mean skin thickness of a group of cattle following tuberculin testing to within ±1 mm of the true population mean. The concept of precision is also used in analytic studies, in the form of the difference between groups which you wish to detect. As this is closely associated with power calculations, it is mentioned in the 'power' section below.
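As an illustration of how precision, confidence and expected variation combine, the sketch below uses the normal-approximation formulae found in most statistical textbooks. It is not taken from this page: the function names are hypothetical, "±10%" is assumed to mean 10 percentage points in absolute terms, and the expected prevalence (50%) and standard deviation (3 mm) are placeholder guesses of the kind that must be supplied before the study is run.

```python
import math
from statistics import NormalDist


def z_for_confidence(confidence=0.95):
    """Two-sided standard normal multiplier (about 1.96 for 95% confidence)."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_proportion(expected_prevalence, precision, confidence=0.95):
    """Animals needed to estimate a prevalence to within +/- precision.

    Normal approximation, simple random sampling, large source population.
    expected_prevalence and precision are proportions (0.10 = 10 points).
    """
    z = z_for_confidence(confidence)
    n = z ** 2 * expected_prevalence * (1 - expected_prevalence) / precision ** 2
    return math.ceil(n)


def sample_size_mean(expected_sd, precision, confidence=0.95):
    """Animals needed to estimate a mean to within +/- precision.

    expected_sd is an assumed standard deviation in the same units as precision.
    """
    z = z_for_confidence(confidence)
    return math.ceil((z * expected_sd / precision) ** 2)


# Assumed inputs: prevalence of 50% (the most conservative guess), +/- 10 points.
print(sample_size_proportion(0.5, 0.10))
# Assumed inputs: standard deviation of 3 mm, +/- 1 mm.
print(sample_size_mean(3.0, 1.0))
```

With these assumed inputs the functions return 97 and 35 animals respectively. Halving the required precision roughly quadruples the sample size, because precision enters the formula squared.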
    +
===Level of confidence===
 +
This is used in descriptive studies in order to indicate the level of confidence that the confidence interval of the estimate produced will contain the true population value. Usually, a confidence level of 95% is used. The concept of confidence intervals is explained further in the section on [[Random error|random variation]].
   −
===Level of confidence===
+
===Power===
 +
This relates to the ability to detect a difference in a parameter of interest between two groups, and so relates to analytic studies. The power indicates the probability that a study will detect a 'significant' difference between groups (using a specified p-value [usually 0.05] to indicate significance), assuming that a difference of a specified size does exist. For example, if there is a true difference in mean annual milk yield of 500 litres between two groups of cows, a study with a power of 80% will detect a statistically significant difference 80% of the time. That is, if the same study were repeated again and again, selecting the calculated required number of cows from each group, 80% of these studies would detect a difference between groups and 20% would not.
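The milk yield example can be sketched numerically using the usual normal-approximation formula for comparing two means. This is an illustration under stated assumptions rather than a method given on this page: the function names are hypothetical, the two groups are assumed to be of equal size with a common standard deviation, and the value of 1000 litres for that standard deviation is an invented placeholder.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def sample_size_two_means(difference, expected_sd, alpha=0.05, power=0.80):
    """Animals needed per group to detect a difference between two means.

    Normal approximation, two-sided test, equal group sizes, common
    standard deviation (expected_sd is an assumption made in advance).
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    n = 2 * (z_alpha + z_power) ** 2 * expected_sd ** 2 / difference ** 2
    return math.ceil(n)


def approximate_power(n_per_group, difference, expected_sd, alpha=0.05):
    """Approximate power achieved with n_per_group animals in each group."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    standard_error = expected_sd * math.sqrt(2 / n_per_group)
    return STD_NORMAL.cdf(difference / standard_error - z_alpha)


# Assumed inputs: true difference of 500 litres, standard deviation of 1000 litres.
n = sample_size_two_means(difference=500, expected_sd=1000)
print(n, round(approximate_power(n, 500, 1000), 3))
```

Under these assumptions about 63 cows are needed in each group, and the achieved power is just over 80%. A smaller difference to detect, or a larger standard deviation, increases the required number.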
 +
 
 +
===Clustering===
 +
When cluster or multistage sampling techniques are used, clustering of the outcome of interest will affect the required sample size, since animals within the same cluster would be expected to be more similar to each other than to those from other clusters. Formulae are therefore available to calculate the 'design effect' (or DEFF), which indicates the factor by which the calculated sample size needs to be increased in order to account for this.
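One commonly used approximation for the design effect is DEFF = 1 + (m - 1) × ICC, where m is the average number of animals sampled per cluster and ICC is the intracluster correlation coefficient. The sketch below applies it; the function names and the input values (20 animals per herd, ICC of 0.1, an unadjusted sample size of 97) are illustrative assumptions, not figures from this page.

```python
import math


def design_effect(mean_cluster_size, icc):
    """Approximate design effect: DEFF = 1 + (m - 1) * ICC.

    mean_cluster_size: average number of animals sampled per cluster.
    icc: assumed intracluster correlation coefficient (0 = no clustering).
    """
    return 1 + (mean_cluster_size - 1) * icc


def inflate_for_clustering(n_simple_random, mean_cluster_size, icc):
    """Scale a simple-random-sampling sample size by the design effect."""
    return math.ceil(n_simple_random * design_effect(mean_cluster_size, icc))


# Assumed inputs: 97 animals under simple random sampling, 20 per herd, ICC 0.1.
print(design_effect(20, 0.1))
print(inflate_for_clustering(97, 20, 0.1))
```

With these assumed values the design effect is about 2.9, so the 97 animals become 282. When the ICC is zero, the design effect is 1 and no increase is needed.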
 +
 
 +
===Sampling fraction===
 +
This relates to the proportion of the total target population which is sampled. In most epidemiological studies, samples are collected '''without replacement''' (i.e. an individual animal cannot be selected twice), although many of the calculations used are based on the concept of sampling with replacement. This does not cause a problem when, as in most cases, the sampling fraction is low (less than about 5%). However, if the sampling fraction is high, a correction known as the '''finite population correction''' should be made to account for this in the calculation of the required sample size (and in the final estimates).
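A minimal sketch of the correction, using one commonly quoted form, n_adjusted = n / (1 + (n - 1) / N), where n is the sample size calculated for an effectively infinite population and N is the population size. The function names and the population sizes are hypothetical, and other textbooks give slightly different forms of the same adjustment.

```python
import math


def sampling_fraction(sample_size, population_size):
    """Proportion of the population that would be sampled."""
    return sample_size / population_size


def finite_population_correction(n_infinite, population_size):
    """Adjust a sample size calculated for a very large population.

    Uses n / (1 + (n - 1) / N); the result never exceeds n_infinite.
    """
    return math.ceil(n_infinite / (1 + (n_infinite - 1) / population_size))


# Assumed inputs: 97 animals required, drawn from populations of two sizes.
for population in (500, 1_000_000):
    print(population,
          round(sampling_fraction(97, population), 4),
          finite_population_correction(97, population))
```

For a population of 500 the sampling fraction is about 19% and the requirement falls from 97 to 82 animals. For a population of one million the fraction is negligible and the corrected figure is unchanged at 97.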
 +
 
 +
===Multivariable studies===
 +
When the effect of confounding or interaction is to be accounted for in the study, the sample size needs to be increased accordingly.
      −
[[Category:Veterinary Epidemiology - Introduction|F]]
+
[[Category:Veterinary Epidemiology - General Concepts|G]]