Difference between revisions of "Measures of strength of association"

From WikiVet English
(5 intermediate revisions by one other user not shown)
Line 2: Line 2:
  
 
==Correlation coefficients==
 
==Correlation coefficients==
Correlation coefficients are used when comparing two [[Data types#Quantitative data|quantitative variables]], and are based upon the '''covariance''' between these variables amongst the individuals in the study population. Strictly speaking, the covariance is a measure of how two variables differ in individuals in relation to their mean values in the whole population - or put more simply, it is a measure of how the variables change in relation ''to each other''. As the magnitude of the covariance will depend upon the magnitudes of the variables in question, this value is 'standardised' in order to give a correlation coefficient, which lies between -1 (indicating a perfect negative correlation) and +1 (indicating a perfect positive correlation). A coefficient of 0 indicates no correlation, and therefore correlation coefficients are a useful measure of the strength of association between quantitative variables.
+
Correlation coefficients are used when comparing two [[Data types#Quantitative data|quantitative variables]], and are based upon the '''covariance''' between these variables amongst the individuals in the study population. The covariance can be viewed as how the two variables of interest differ in individuals in relation to their mean values in the whole population, but put more simply, is a measure of how two different variables change in relation ''to each other''. As the magnitude of this variable will depend upon the magnitudes of the variables in question, this value is 'standardised' in order to give a correlation coefficient, which lies between -1 (indicating a perfect negative correlation) and +1 (indicating a perfect positive correlation), with a coefficient of 0 indicating no correlation. Therefore, correlation coefficients measure how closely associated the two variables of interest are to each other.
  
 
==Ratio measures==
 
==Ratio measures==
 
Although correlation coefficients are commonly used in statistical studies, epidemiological investigations often deal with binary exposures and outcomes (such as presence or absence of a proposed risk factor for disease, and presence or absence of disease itself). Therefore, '''ratio measures''' such as the '''prevalence ratio''', the '''risk ratio''', the '''rate ratio''' and the '''odds ratio''' are commonly used as measures of strength of association in epidemiological studies.<br>
 
Although correlation coefficients are commonly used in statistical studies, epidemiological investigations often deal with binary exposures and outcomes (such as presence or absence of a proposed risk factor for disease, and presence or absence of disease itself). Therefore, '''ratio measures''' such as the '''prevalence ratio''', the '''risk ratio''', the '''rate ratio''' and the '''odds ratio''' are commonly used as measures of strength of association in epidemiological studies.<br>
  
Understanding how these measures are calculated is best approached using a '''contingency table''' (also known as a '''cross tabulation'''), as shown below. In this table, the columns divide all individuals into ''exposed'' and ''unexposed'', whilst the rows divide individuals into those who are ''diseased'' and those who are ''not diseased''. Therefore, cell 'm<sub>1</sub>' represents all diseased individuals, cell 'n<sub>1</sub>' represents all exposed individuals, and cell 'a<sub>1</sub>' represents exposed individuals who are also diseased.
+
Understanding how these measures are calculated is best approached using a contingency table (also known as a cross tabulation), as shown below. In the columns, the individuals are divided into exposed and unexposed, whilst in the rows, individuals are divided into those who are diseased and those who are not diseased. Therefore, cell 'm<sub>1</sub>' represents all diseased individuals, cell 'n<sub>1</sub>' represents all exposed individuals, and cell 'a<sub>1</sub>' represents exposed individuals who are also diseased.
  
 
{| class="wikitable"
 
{| class="wikitable"
Line 20: Line 20:
 
|}
 
|}
  
The measures of disease frequency which can be extracted from this table will depend on the [[Study design|study design]] used (which will be [[Study design#Analytic studies|analytic]] in nature, as data regarding exposure have been collected).<br>
+
The measures of disease frequency which could be extracted from this table will depend on the [[Study design|study design]] used, which will be [[Study design#Analytic studies|analytic]] in nature, as data regarding exposure has been collected.<br>
  
 
In the case of a [[Study design#Cross sectional studies|cross sectional study]], the '''[[Measures of disease frequency#Prevalence|prevalence]]''' can be estimated amongst exposed individuals as (a<sub>1</sub>/n<sub>1</sub>), and amongst unexposed individuals as (a<sub>0</sub>/n<sub>0</sub>).<br>
 
In the case of a [[Study design#Cross sectional studies|cross sectional study]], the '''[[Measures of disease frequency#Prevalence|prevalence]]''' can be estimated amongst exposed individuals as (a<sub>1</sub>/n<sub>1</sub>), and amongst unexposed individuals as (a<sub>0</sub>/n<sub>0</sub>).<br>
  
In the case of a [[Study design#Cohort studies|cohort study]] or an [[Study design#Experimental studies|experimental study]], the disease status of individuals will relate only to ''new'' cases of disease (i.e. cases arising in individuals which were not diseased at the start of the study). In these cases, the '''[[Measures of disease frequency#Incidence risk|incidence risk]]''' can be estimated amongst exposed individuals as (a<sub>1</sub>/n<sub>1</sub>), and amongst unexposed individuals as (a<sub>0</sub>/n<sub>0</sub>). Alternatively, the [[Measures of disease frequency#Incidence rate|incidence rate]] can be estimated, if the total animal-time for each exposure group is known, as (a<sub>1</sub>/[total number of animal-time units in exposed group]) amongst exposed animals and (a<sub>0</sub>/[total number of animal-time units in unexposed group]) amongst unexposed animals.<br>
+
In the case of a [[Study design#Cohort studies|cohort study]], the '''[[Measures of disease frequency#Incidence risk|incidence risk]]''' can be estimated amongst exposed individuals as (a<sub>1</sub>/n<sub>1</sub>), and amongst unexposed individuals as (a<sub>0</sub>/n<sub>0</sub>). Alternatively, the [[Measures of disease frequency#Incidence rate|incidence rate]] can be estimated as (a<sub>1</sub>/[total number of animal-time units in exposed group]) amongst exposed animals and (a<sub>0</sub>/[total number of animal-time units in unexposed group]) amongst unexposed animals.<br>
  
In the case of a [[Study design#Case control studies|case control study]], no [[Measures of disease frequency|measures of disease frequency]] can be calculated, as selection of individuals was based upon their disease status. However, an analytic study can still be conducted by looking at the '''odds of exposure''' in the different disease groups. This may seem incorrect (as we are more interested in the relative probabilities of disease amongst exposure groups than the odds of exposure amongst disease groups), but will be explained further below. The odds of exposure amongst diseased individuals is calculated as (a<sub>1</sub>/a<sub>0</sub>), and amongst nondiseased individuals as (b<sub>1</sub>/b<sub>0</sub>).<br>
+
In the case of a [[Study design#Case control studies|case control study]], no [[Measures of disease frequency|measures of disease frequency]] can be calculated. However, the '''odds of exposure''' can be estimated amongst diseased individuals as (a<sub>1</sub>/a<sub>0</sub>), and amongst nondiseased individuals as (b<sub>1</sub>/b<sub>0</sub>).<br>
  
For study designs apart from case-control studies, once estimates of the prevalences, risks or rates of disease amongst the different exposure groups have been calculated, their ratio is obtained by dividing the estimate for one group by that for the other. In most cases, the frequency of disease amongst exposed animals is divided by the frequency of disease amongst unexposed animals (although the opposite approach can be taken if desired). Therefore, the prevalence or risk ratio can be calculated using the following equation:<br>
 
(a<sub>1</sub>/n<sub>1</sub>) / (a<sub>0</sub>/n<sub>0</sub>)<br>
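As a minimal sketch of this calculation in Python (the function name and the counts are hypothetical, chosen only for illustration), the ratio can be computed directly from the four table entries:

```python
def frequency_ratio(a1, n1, a0, n0):
    """Prevalence or risk ratio: frequency in exposed divided by frequency in unexposed."""
    return (a1 / n1) / (a0 / n0)

# Hypothetical counts: 30 of 100 exposed and 10 of 200 unexposed animals are diseased
ratio = frequency_ratio(a1=30, n1=100, a0=10, n0=200)

# Taking the opposite approach (unexposed relative to exposed) by swapping the
# groups gives the reciprocal of the first ratio
inverse_ratio = frequency_ratio(a1=10, n1=200, a0=30, n0=100)
```

With these illustrative counts the ratio is approximately 6, meaning that disease is about six times as frequent amongst exposed animals as amongst unexposed animals.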
 
  
As mentioned above, the output from a case control study will be the ''odds of exposure'' amongst diseased and nondiseased animals. It can be shown that (as long as the ''[[Sampling strategies#Sampling fraction|sampling fraction]]'' for cases and for controls does not depend on exposure status) the ''exposure odds ratio'', comparing diseased to nondiseased animals, is identical to the ''odds ratio for disease'', comparing exposed to nonexposed animals. This is why the odds, rather than any other measure, is used in these types of studies. Although, strictly speaking, the exposure odds ratio is calculated as (a<sub>1</sub>/a<sub>0</sub>) / (b<sub>1</sub>/b<sub>0</sub>), it is often reformulated, for ease of calculation, into the following equation (known as the '''cross product ratio'''):<br>
 
(a<sub>1</sub>×b<sub>0</sub>) / (b<sub>1</sub>×a<sub>0</sub>)<br>
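The equivalence of the three formulations can be checked numerically. The following is a sketch only, using hypothetical counts and illustrative function names rather than anything defined in the source:

```python
import math

def exposure_odds_ratio(a1, a0, b1, b0):
    """Odds of exposure in the diseased divided by odds of exposure in the nondiseased."""
    return (a1 / a0) / (b1 / b0)

def disease_odds_ratio(a1, a0, b1, b0):
    """Odds of disease in the exposed divided by odds of disease in the unexposed."""
    return (a1 / b1) / (a0 / b0)

def cross_product_ratio(a1, a0, b1, b0):
    """Cross product form: (a1 x b0) / (b1 x a0)."""
    return (a1 * b0) / (b1 * a0)

# Hypothetical table: 30 exposed cases, 10 unexposed cases,
# 70 exposed controls, 190 unexposed controls
counts = dict(a1=30, a0=10, b1=70, b0=190)

exposure_or = exposure_odds_ratio(**counts)
disease_or = disease_odds_ratio(**counts)
cross_or = cross_product_ratio(**counts)

# All three agree up to floating point rounding
agree = math.isclose(exposure_or, disease_or) and math.isclose(exposure_or, cross_or)
```

The cross product form needs only one division, which is why it is convenient for hand calculation.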
 
  
These ratio measures of strength of association vary from 0 to +∞, with an estimate of 1 indicating no association. It should be noted that although the odds ratio for disease is a useful measure of strength of association, its value will differ from the equivalent prevalence or risk ratio, tending towards more extreme values (larger in the case of prevalence/risk ratios greater than 1, or smaller in the case of prevalence/risk ratios less than 1) when the disease under investigation is common in the population. This may not be a problem when using case control studies, as these are often used when the disease in question is rare. However, odds ratios are also commonly used in more advanced statistical methods (particularly [[Logistic regression|logistic regression]]), in which case care must be taken when interpreting them.
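The way the odds ratio departs from the risk ratio as disease becomes more common can be illustrated with a short sketch. The risks below are assumed values chosen for illustration, and the function names are hypothetical:

```python
def odds(p):
    """Convert a risk (probability) into odds."""
    return p / (1 - p)

def odds_ratio_from_risks(risk_exposed, risk_unexposed):
    """Odds ratio for disease implied by the risks in the two exposure groups."""
    return odds(risk_exposed) / odds(risk_unexposed)

# Hold the risk ratio fixed at 2 and let the baseline (unexposed) risk increase
RISK_RATIO = 2.0
baseline_risks = [0.01, 0.10, 0.40]
odds_ratios = [odds_ratio_from_risks(RISK_RATIO * p0, p0) for p0 in baseline_risks]

# A protective exposure (risk ratio of 0.5) with a common disease:
# the odds ratio falls below the risk ratio
protective_or = odds_ratio_from_risks(0.5 * 0.40, 0.40)
```

Under these assumed risks the odds ratio is close to 2 when the baseline risk is 1%, about 2.25 at 10%, and about 6 at 40%, whilst the risk ratio remains 2 throughout.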
 
  
[[Category:Veterinary Epidemiology - Statistical Methods|C]]
 
  
==Webinars==
+
 
<rss max="10" highlight="none">https://www.thewebinarvet.com/welfare-and-ethics/webinars/feed</rss>
+
 
 +
[[Category:Veterinary Epidemiology - Statistical Methods|B]]

Revision as of 16:46, 4 May 2011

Analytic studies are conducted in an attempt to identify whether the disease experience in a population differs between groups of animals within this population (defined by exposure to 'risk factors' of interest), in the hope that some indication of a causal association can be achieved. Therefore, methods are required in order to quantify any 'evidence' in support of a possible association. Epidemiologists commonly quantify this evidence using measures of strength of association and null hypothesis tests. It is important to consider both whenever interpreting the results of an analytic study, as they measure different things. Measures of strength of association indicate the magnitude of the association, whereas hypothesis test results indicate the probability of seeing the data obtained if there were no association between the exposure and outcome in the source population.

Correlation coefficients

Correlation coefficients are used when comparing two quantitative variables, and are based upon the covariance between these variables amongst the individuals in the study population. The covariance describes how the two variables of interest differ in individuals in relation to their mean values in the whole population or, put more simply, how the two variables change in relation to each other. As the magnitude of the covariance depends upon the magnitudes of the variables in question, it is 'standardised' in order to give a correlation coefficient, which lies between -1 (indicating a perfect negative correlation) and +1 (indicating a perfect positive correlation), with a coefficient of 0 indicating no correlation. Correlation coefficients therefore measure how closely the two variables of interest are associated with each other.
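The standardisation step can be sketched in Python as follows. This assumes the Pearson correlation coefficient, in which the covariance is divided by the product of the two standard deviations; the function name and the data are hypothetical:

```python
import math

def pearson_r(x, y):
    """Correlation coefficient: covariance standardised by both standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance: how the two variables deviate from their means together
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    # Sample standard deviations of each variable
    sd_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
    sd_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))
    return cov / (sd_x * sd_y)

# Hypothetical data: body weight (kg) and heart girth (cm) for five animals
weight = [410.0, 455.0, 500.0, 540.0, 610.0]
girth = [168.0, 175.0, 181.0, 186.0, 196.0]
r = pearson_r(weight, girth)
```

Because the covariance is divided by both standard deviations, the result does not depend on the units in which either variable was measured.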

Ratio measures

Although correlation coefficients are commonly used in statistical studies, epidemiological investigations often deal with binary exposures and outcomes (such as presence or absence of a proposed risk factor for disease, and presence or absence of disease itself). Therefore, ratio measures such as the prevalence ratio, the risk ratio, the rate ratio and the odds ratio are commonly used as measures of strength of association in epidemiological studies.

Understanding how these measures are calculated is best approached using a contingency table (also known as a cross tabulation), as shown below. In the columns, the individuals are divided into exposed and unexposed, whilst in the rows, individuals are divided into those who are diseased and those who are not diseased. Therefore, cell 'm1' represents all diseased individuals, cell 'n1' represents all exposed individuals, and cell 'a1' represents exposed individuals who are also diseased.

Disease status   Exposed   Unexposed   Total
Diseased         a1        a0          m1
Non-diseased     b1        b0          m0
Total            n1        n0          n

The measures of disease frequency which could be extracted from this table will depend on the study design used, which will be analytic in nature, as data regarding exposure has been collected.

In the case of a cross sectional study, the prevalence can be estimated amongst exposed individuals as (a1/n1), and amongst unexposed individuals as (a0/n0).

In the case of a cohort study, the incidence risk can be estimated amongst exposed individuals as (a1/n1), and amongst unexposed individuals as (a0/n0). Alternatively, the incidence rate can be estimated as (a1/[total number of animal-time units in exposed group]) amongst exposed animals and (a0/[total number of animal-time units in unexposed group]) amongst unexposed animals.
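A minimal sketch of these frequency calculations is given below. The counts, the animal-time totals and the function names are all hypothetical; note that the rate in the unexposed group uses the a0 cases arising in that group:

```python
def proportion_diseased(cases, group_total):
    """Prevalence (cross sectional study) or incidence risk (cohort study) in one group."""
    return cases / group_total

def incidence_rate(cases, animal_time):
    """New cases per unit of animal-time at risk in one exposure group."""
    return cases / animal_time

# Hypothetical cohort: 100 exposed animals (30 new cases), 200 unexposed (10 new cases)
a1, n1 = 30, 100
a0, n0 = 10, 200

risk_exposed = proportion_diseased(a1, n1)      # a1/n1
risk_unexposed = proportion_diseased(a0, n0)    # a0/n0

# Assumed totals of animal-years at risk in each exposure group
time_exposed = 85.0
time_unexposed = 195.0

rate_exposed = incidence_rate(a1, time_exposed)        # cases per animal-year
rate_unexposed = incidence_rate(a0, time_unexposed)
```

The same proportion is read as a prevalence or as an incidence risk depending on the study design; only the interpretation of the case counts changes.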

In the case of a case control study, no measures of disease frequency can be calculated. However, the odds of exposure can be estimated amongst diseased individuals as (a1/a0), and amongst nondiseased individuals as (b1/b0).
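The odds of exposure can be sketched as follows, again with hypothetical counts and an illustrative function name. The example also shows that odds are not the same as proportions:

```python
def odds_of_exposure(exposed, unexposed):
    """Odds of exposure within one disease group: exposed count over unexposed count."""
    return exposed / unexposed

# Hypothetical case control data: 40 cases (30 exposed) and 260 controls (70 exposed)
a1, a0 = 30, 10
b1, b0 = 70, 190

odds_cases = odds_of_exposure(a1, a0)       # a1/a0
odds_controls = odds_of_exposure(b1, b0)    # b1/b0

# The proportion of cases that are exposed is a1/(a1 + a0);
# odds and proportion are linked by odds = proportion / (1 - proportion)
prop_cases_exposed = a1 / (a1 + a0)
```

Here three quarters of the cases are exposed, which corresponds to odds of exposure of 3 amongst cases.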