Difference between revisions of "Data types"

From WikiVet English

Revision as of 14:00, 13 December 2010

Epidemiological investigation requires a good understanding of different data types, as the type of data strongly influences how they are analysed and interpreted. Data can broadly be classified as qualitative or quantitative, as described below, although data can be converted from one type to another through manipulation. Within each of these groups, data types can be classified further.

Qualitative data

Qualitative data are 'categorical' (or binary) data, and as such are often not expressed numerically. These types of data can be classified as nominal or ordinal:

Nominal

Nominal data differ from all other data types described here by lacking any order between the different categories, and can be described further as either binary ('yes/no') or categorical in nature. Examples of binary data are disease status (positive/negative), sex (male/female) and presence/absence of a factor of interest; whereas examples of categorical data are breed, coat colour, location and feed type. As there is no numerical meaning to the categories themselves, nominal data are best summarised using percentages or proportions.
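As a minimal sketch of summarising nominal data by proportions, the following Python example counts how often each category occurs and divides by the total number of observations. The function name `proportions` and the coat colour records are hypothetical and used for illustration only.

```python
from collections import Counter


def proportions(values):
    """Return the proportion of observations falling in each category."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


# Hypothetical coat colours recorded for eight animals (illustrative data only).
coat_colours = ["black", "brown", "black", "white",
                "black", "brown", "black", "white"]

coat_summary = proportions(coat_colours)
for colour, proportion in coat_summary.items():
    print(f"{colour}: {proportion:.0%}")
```

Note that no mean or median is calculated here: as the categories have no order, only the share of observations in each category is meaningful.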

Ordinal

Ordinal data are categorical in nature, but have an intrinsic order to them. Examples of ordinal data are lameness score, level of agreement with a statement (Likert items), categorised weight and categorised lactation number. As can be seen in the last two examples here, ordinal data can be created through manipulation of quantitative data. It should be noted that even if numbers are used to describe these categories, the intervals between these numbers are not necessarily equal (for example, the difference between lameness scores of 5 and 3 is not necessarily the same as the difference between scores of 4 and 2). As for nominal data, ordinal data are commonly described in terms of percentages or proportions, although the median may also be used as a measure of central tendency.
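The two ideas above (creating ordinal data from quantitative data, and using the median as a summary) can be sketched in Python as follows. The cut-points, category labels, weights and lameness scores are all hypothetical. `statistics.median_low` is used so that the result is always one of the observed scores, which suits ordinal data where averaging two scores may not be meaningful.

```python
from statistics import median_low

# Ordered weight bands as (upper limit in kg, label); the cut-points are hypothetical.
WEIGHT_BANDS = [(400, "light"), (600, "medium")]
HEAVIEST_BAND = "heavy"


def weight_category(weight_kg):
    """Convert a continuous weight into an ordered (ordinal) category."""
    for upper_limit, label in WEIGHT_BANDS:
        if weight_kg < upper_limit:
            return label
    return HEAVIEST_BAND


# Hypothetical weights in kg (illustrative data only).
weights = [350.5, 420.0, 610.2, 580.0, 399.9]
categories = [weight_category(w) for w in weights]
print(categories)  # ['light', 'medium', 'heavy', 'medium', 'light']

# Hypothetical lameness scores on a 1 to 5 scale.
lameness_scores = [1, 2, 2, 3, 5]
median_score = median_low(lameness_scores)
print(median_score)  # 2
```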

Quantitative data

Quantitative data are numerical in nature, with a set, meaningful interval between different measurements. Quantitative data can be further classified as discrete or continuous:

Discrete

Discrete data include only integer values, as fractional values have little or no meaning. 'Count' data, derived by counting the number of events or animals of interest, are a type of discrete data. Examples of discrete data are the number of infected animals within a group, the number of episodes of pathogen shedding following initial infection, the number of piglets born per year, and the number of lactations which the animal has been through.

Continuous

Continuous data can take any value within a range, and can only be measured to some degree of precision (if the precision of measurement is increased, the value obtained will change). As such, the number of different values which the data can take is infinite. Examples of continuous data are weight, height, volume of milk produced during a lactation, and the infectious period of a pathogen. Age may be classified as either discrete (as it is commonly measured in whole years) or continuous (as the concept of a fraction of a year is plausible); of these, the latter is probably more appropriate. Of course, age could alternatively be categorised and treated as ordinal data.
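To illustrate how the same variable can be handled as continuous, discrete or ordinal data, the following Python sketch expresses one animal's age in three ways. The dates, the 365.25-day year and the age bands are assumptions made for illustration, and the function names are hypothetical.

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # assumed average year length, including leap years


def age_in_years(birth_date, reference_date):
    """Age as a continuous quantity, in fractional years."""
    return (reference_date - birth_date).days / DAYS_PER_YEAR


def age_band(age_years):
    """Age as an ordinal category; the band limits are hypothetical."""
    if age_years < 2:
        return "under 2 years"
    if age_years < 5:
        return "2 to under 5 years"
    return "5 years or over"


# Hypothetical animal born on 13 June 2008, assessed on 13 December 2010.
continuous_age = age_in_years(date(2008, 6, 13), date(2010, 12, 13))
whole_years = int(continuous_age)   # discrete: completed years only
band = age_band(continuous_age)     # ordinal: ordered category

print(round(continuous_age, 1), round(continuous_age, 3))
print(whole_years)  # 2
print(band)         # 2 to under 5 years
```

Printing the continuous age to one and to three decimal places shows the point made above: the value reported depends on the precision chosen, whereas the whole-year and banded versions discard that detail.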

Interval

Ratio