Statistical analysis is one of the principal tools employed in epidemiology, which is primarily concerned with the study of health and disease in populations. Statistics is the science of collecting, analyzing, and interpreting data, and a good epidemiological study depends on statistical methods being employed correctly. At the same time, flaws in study design can affect statistics and lead to incorrect conclusions. Descriptive statistics measure, describe, and summarize features of a data set or sample without making inferences that go beyond the scope of that data set or sample. Common measures of descriptive statistics are those of central tendency and dispersion. Measures of central tendency describe the center of a data distribution and include the mode, median, and mean. Measures of dispersion describe how widely the data are spread and include the range, quartiles, variance, and standard deviation. The counterpart of descriptive statistics, inferential statistics, relies on data to make inferences that do go beyond the scope of the data collected and the sample from which it was obtained. Inferential statistics involves parameters such as sensitivity, specificity, positive/negative predictive values, confidence intervals, and hypothesis testing.
The values used to describe features of a sample or data set are called variables. Variables can be independent, in the sense that they are not dependent on other variables and can thus be manipulated by the researcher for the purpose of a study (e.g., administration of a certain drug), or dependent, in the sense that their value depends on another variable and, thus, cannot be manipulated by the researcher (e.g., a condition caused by a certain drug). Variables can furthermore be categorized qualitatively in categorical terms (e.g., eye color, sex, race) and quantitatively in numerical terms (e.g., age, weight, temperature).
The evaluation of diagnostic tests before approval for clinical practice is another important area of epidemiological study. It relies on inferential statistics to draw conclusions from sample groups that can be applied to the general population.
- Descriptive statistics: analysis of a sample group conducted in order to measure, describe, and summarize the data collected, but not to make inferences that go beyond the scope of that sample group; employs measures of central tendency (mode, median, and mean) and measures of dispersion (range, quartiles, variance, and standard deviation)
- Inferential statistics: analysis of a sample group conducted in order to make inferences that go beyond the sample group.
- Definition: measures to describe a common, typical value of a data set (e.g., clustering of data at a specific value)
- The type of measure used depends on the sample size.
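|Measures of central tendency||Definition||Example|
|Mean (statistics)||The arithmetic average: the sum of all values divided by the number of values||For the data set 1, 2, 2, 3, 7: mean = 15/5 = 3|
|Median (statistics)||The middle value of a data set arranged in ascending order (with an even number of values, the average of the two middle values)||For the data set 1, 2, 2, 3, 7: median = 2|
|Mode (statistics)||The value that occurs most frequently in a data set||For the data set 1, 2, 2, 3, 7: mode = 2|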
- Definition: a data point/observation that is distant from other data points/observations in a data set
- Using a trimmed mean: calculate the mean by discarding extreme values in a data set and using the remaining values
- Use the median or mode: useful for asymmetrical data; these measures are not affected by extreme values because they are based on ranks of data (median) or the most commonly occurring value (mode) rather than the average score of all values
- Removing outliers can also distort the interpretation of data. It should be done with caution and with a view to reflecting the respective data set.
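The effect of an outlier on the measures of central tendency, and the trimmed mean described above, can be illustrated with a minimal sketch. The data set is hypothetical, and `trimmed_mean` is a helper written for this example only; it simply discards a fixed number of values from each end.

```python
from statistics import mean, median, mode

def trimmed_mean(values, n_trim):
    """Mean after discarding the n_trim smallest and n_trim largest values."""
    if n_trim < 0 or 2 * n_trim >= len(values):
        raise ValueError("n_trim must leave at least one value")
    ordered = sorted(values)
    return mean(ordered[n_trim:len(ordered) - n_trim])

# Hypothetical data: seven measurements, one of which (40) is an outlier.
data = [2, 3, 3, 4, 5, 6, 40]

print(mean(data))             # 9, pulled upward by the outlier
print(median(data))           # 4, based on rank and unaffected by the outlier
print(mode(data))             # 3, the most frequent value
print(trimmed_mean(data, 1))  # 4.2, after dropping the values 2 and 40
```

The median and mode stay close to the bulk of the data, whereas the untrimmed mean does not.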
- Definition: measures the extent to which the distribution is stretched out
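|Range (statistics)||The difference between the largest and the smallest value in a data set|
|Interquartile range||The difference between the third quartile (75th percentile) and the first quartile (25th percentile); contains the middle 50% of values|
|Variance (statistics)||The average of the squared deviations of each value from the mean|
|Standard deviation (SD)||The square root of the variance; expresses the spread of values around the mean in the original units|
|Percentiles||The value below which a given percentage of observations falls (e.g., the 50th percentile is the median)|
|Standard error of the mean||The standard deviation divided by the square root of the sample size; estimates how precisely the sample mean reflects the population mean|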
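As a rough illustration of the measures of dispersion listed above, the following sketch computes them for a small hypothetical data set. It assumes the sample (n - 1) versions of variance and standard deviation, and the default quartile method of Python's `statistics.quantiles`; other quartile conventions can give slightly different values.

```python
from math import sqrt
from statistics import quantiles, stdev, variance

# Hypothetical sample of eight observations.
data = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(data) - min(data)      # largest minus smallest value
q1, q2, q3 = quantiles(data, n=4)        # quartiles (q2 is the median)
iqr = q3 - q1                            # interquartile range
sample_variance = variance(data)         # squared deviations / (n - 1)
sample_sd = stdev(data)                  # square root of the variance
sem = sample_sd / sqrt(len(data))        # standard error of the mean

print(value_range, iqr, sample_variance, sample_sd, sem)
```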
- Definition: measured values of population attributes or a value subject to change
- Types of quantitative variables
- Categorical variable (nominal variable): variables that have a finite number of categories that may not have an intrinsic logical order
- Variable scales
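|Nominal scale||Categories with no intrinsic order||Eye color, sex|
|Ordinal scale||Ranked categories; the distances between ranks are not necessarily equal||Pain scale, tumor stage|
|Interval scale||Ordered values with equal intervals but no true zero point||Temperature in degrees Celsius|
|Ratio scale||Ordered values with equal intervals and a true zero point, so that ratios are meaningful||Weight, age|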
- Normal distributions differ according to their mean and variance, but share the following characteristics:
- The same basic shape; the following assumptions about the data distribution can be made:
- Symmetry (i.e., a symmetrical bell curve)
- Total area under the curve = 1
- All measures of central tendency are equal (mean = median = mode)
- Standard normal distribution (Z distribution): A normal distribution with a mean of 0 and standard deviation of 1
|Bimodal distribution|| || |
Standard normal value (Z-score; Z-value; standard normalized score)
- Enables the comparison of populations with different means and standard deviations
- Standard normal value = (value - population mean) divided by standard deviation
- A means of expressing data scores (e.g., height in centimeters or meters) in the same metric (specifically, in terms of units of standard deviation for the population)
- Determines how many standard deviations an observation is above or below the mean
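The Z-score formula above can be sketched in a few lines. The function name and the height values are hypothetical; the example shows that the same observation expressed in centimeters or in meters yields the same Z-score.

```python
def z_score(value, population_mean, population_sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    if population_sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - population_mean) / population_sd

# Hypothetical population: mean height 175 cm, standard deviation 10 cm.
z_cm = z_score(185.0, 175.0, 10.0)   # height expressed in centimeters
z_m = z_score(1.85, 1.75, 0.10)      # same height expressed in meters

# Both are about 1, i.e., roughly one standard deviation above the mean.
print(z_cm, z_m)
```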
Recommended measures according to distribution
|Distribution||Measures of central tendency||Measure of spread|
|Normal (symmetrical)||Mean||Standard deviation|
|Skewed (asymmetrical)||Median||Range, interquartile range|
Frequency table
- Presents data values for each category in a table
- Illustrates which values in a data set appear frequently
Pie chart
- Describes the frequency of categories in a circular graph divided into slices, with each slice representing a categorical proportion
- Useful for depicting a small number of categories and large differences between them
Bar graph
- Describes the frequency of categories in bars separated from each other; the height/length of each bar represents a categorical proportion
- Useful for depicting many categories of information (compared to a pie chart)
- Frequency can be expressed in absolute or relative terms.
Histogram
- A histogram is similar to a bar graph but displays data on a metric scale.
- The data is grouped into intervals that are plotted on the x-axis.
- Useful for depicting continuous data
- Similar to a bar chart, but differs in the following ways:
- Used for continuous data
- The bars can be shown touching each other to illustrate continuous data.
- Bars cannot be reordered.
Box plot
- Quartiles and median are used to display numerical data in the form of a box.
- Useful for depicting continuous data
- Shows the following important characteristics of data:
- Easily shows measures of central tendency, range, symmetry, and outliers at a glance
Scatter plot
- A graph used to display values for (typically) two variables of data, plotted on the horizontal (x-axis) and vertical (y-axis) axes using Cartesian coordinates, which represent individual data values
- Helps to establish correlations between dependent and independent variables
- Helps to determine whether a relationship between data sets is linear or nonlinear
- Two mutually exclusive hypotheses (referred to as the null hypothesis and the alternative hypothesis) are formulated.
- Null hypothesis (H0): the assumption that there is no relationship between two measured variables (e.g., the exposure and the outcome) or no significant difference between two studied populations; statistical tests are used to either reject or fail to reject this hypothesis
- Alternative hypothesis (H1): the assumption that there is a relationship between two measured variables (e.g., the exposure and the outcome) or a significant difference between two studied populations. This hypothesis is formulated as a counterpart to the null hypothesis and is supported when the null hypothesis is rejected.
- Type 1 error: The null hypothesis is rejected when it is actually true, and, consequently, the alternative hypothesis is accepted, although the observed effect is actually due to chance.
- Type 2 error: The null hypothesis is not rejected when it is actually false, and, consequently, the alternative hypothesis is rejected even though the observed effect did not occur by chance.
Statistical power (1-β)
- The probability of correctly rejecting the null hypothesis, i.e., the ability to detect a difference between two groups when there truly is a difference
- Complementary to the type 2 error rate (power = 1 - β)
- Positively correlates with the sample size and the magnitude of the association of interest (e.g., increasing the sample size of a study would increase its statistical power)
- By convention, most studies aim to achieve 80% statistical power.
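- P-value: the probability of obtaining a result at least as extreme as the one observed, assuming that the null hypothesis is true (i.e., that there is no relationship between two measured variables, such as the exposure and the outcome, and no difference between two studied populations); a small p-value indicates that the observed result would be unlikely if chance alone were responsible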
| ||Null hypothesis (H0) is true||Null hypothesis (H0) is false|
|Statistical test does not reject H0||Correct decision (1-α)||Type 2 error (β)|
|Statistical test rejects H0||Type 1 error (α)||Correct decision: power (1-β)|
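The relationship between sample size, effect size, and statistical power can be illustrated with a simplified sketch. It assumes a two-sided, two-sample z-test with equal group sizes and a known common standard deviation; `approximate_power` and all of the numbers are hypothetical, and dedicated software should be used for real study planning.

```python
from math import sqrt
from statistics import NormalDist

def approximate_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided, two-sample z-test."""
    z = NormalDist()                             # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)            # critical value (about 1.96 for alpha = 0.05)
    se = sd * sqrt(2 / n_per_group)              # standard error of the difference in means
    shift = abs(effect) / se                     # true difference in standard-error units
    # Probability that the test statistic falls beyond either critical value.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

# Hypothetical study: true difference of 5 units, standard deviation of 10 units.
for n in (20, 64, 100):
    print(n, round(approximate_power(5, 10, n), 2))
```

Under these assumptions, power rises with the number of participants per group, and about 64 participants per group are needed to reach roughly 80% power for a difference of half a standard deviation. With no true difference, the function returns α, the type 1 error rate.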
Probability of an occurring event (P)
- Describes the degree of certainty that a particular event will take place
- P = number of favorable outcomes/total number of possible outcomes
Probability of an event not occurring (Q)
- The degree of certainty that a particular event will not take place
- Q = number of unfavorable outcomes/total number of possible outcomes OR 1 - P
- The probabilities of independent events (i.e., events that do not influence one another) can be combined by multiplication to give the probability that all of them occur: P(A and B) = P(A) × P(B)
- The probabilities of mutually exclusive events (i.e., events that cannot occur together) can be combined by addition to give the probability that any one of them occurs: P(A or B) = P(A) + P(B)
- The probabilities of events that are NOT mutually exclusive can be combined by adding the probability of each event and then subtracting the probability of both occurring together: P(A or B) = P(A) + P(B) - P(A and B)
The actual probability of an event is not the same as the observed frequency of an event!
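The rules for combining probabilities can be checked with a short sketch that uses a fair six-sided die as a hypothetical example. The function names are chosen for illustration only; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def p_and_independent(p_a, p_b):
    """Probability that two independent events both occur."""
    return p_a * p_b

def p_or_exclusive(p_a, p_b):
    """Probability that either of two mutually exclusive events occurs."""
    return p_a + p_b

def p_or_general(p_a, p_b, p_a_and_b):
    """Probability that at least one of two events occurs (events may overlap)."""
    return p_a + p_b - p_a_and_b

one_face = Fraction(1, 6)                     # P of any single face of a fair die
q_not_six = 1 - one_face                      # Q = 1 - P

two_sixes = p_and_independent(one_face, one_face)      # six on two separate rolls
one_or_two = p_or_exclusive(one_face, one_face)        # rolling a 1 or a 2
# Even number {2, 4, 6} or number above 3 {4, 5, 6}; the overlap is {4, 6}.
even_or_high = p_or_general(Fraction(1, 2), Fraction(1, 2), Fraction(1, 3))

print(q_not_six, two_sixes, one_or_two, even_or_high)  # 5/6 1/36 1/3 2/3
```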
- Overview: Confidence intervals provide a way to estimate a population parameter (e.g., the population mean) from a sample measurement.
- Definition: the range of values that is highly likely to contain the true population value
Formula: sample measurement (e.g., mean) ± (Z-score × standard error); requires the following values:
- Confidence level (usually fixed at 95%), which determines the Z-score (approx. 1.96 for 95%)
- Sample measurement
- Standard error, which requires the:
- Sample size
- Standard deviation
- Overlapping confidence intervals between two groups usually signify that there is no statistically significant difference, although a small overlap alone does not rule one out.
- Non-overlapping confidence intervals between two groups signify that there is a statistically significant difference.
- If the confidence interval includes the null value (e.g., 0 for a difference between means, 1 for a relative risk or odds ratio), the result is not significant and the null hypothesis cannot be rejected.
- A 95% confidence interval that does not include the null value corresponds to a p-value of < 0.05.
- A 99% confidence interval that does not include the null value corresponds to a p-value of < 0.01.
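A confidence interval for a sample mean can be sketched as follows. This is a minimal example that assumes the normal approximation (mean ± Z-score × standard deviation/√n); for small samples, a t-distribution would usually be preferred. The function name and the blood pressure readings are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the mean of a sample."""
    n = len(sample)
    if n < 2:
        raise ValueError("at least two observations are required")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    standard_error = stdev(sample) / sqrt(n)            # SD / square root of n
    center = mean(sample)
    return center - z * standard_error, center + z * standard_error

# Hypothetical systolic blood pressure readings (mmHg).
readings = [118, 122, 125, 130, 135]

low_95, high_95 = mean_confidence_interval(readings, 0.95)
low_99, high_99 = mean_confidence_interval(readings, 0.99)
print(low_95, high_95)
print(low_99, high_99)
```

A higher confidence level yields a wider interval around the same sample mean.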
Statistical significance vs. clinical significance
Significance (epidemiology): the statistical probability that a result did not occur by chance alone
- Statistical significance: describes a true statistical outcome (i.e., that is determined by statistical tests) that has not occurred by chance
- Clinical significance (epidemiology): describes an important change in a patient's clinical condition, which may or may not be due to an intervention introduced during a clinical study
- Statistical and clinical significance do not necessarily correlate.
Correlation and regression
- Definition: a measure of the linear statistical correlation between continuous variables
- Interpretation: A correlation coefficient measures the strength (i.e., the degree) and direction (i.e., a positive or negative relationship) of a linear relationship (does not require causality!)
- Definition: the process of developing a mathematical relationship between the dependent variable (the outcome; “y”) and one or more independent variables (the exposure; “x”)
Linear regression: a type of regression in which the dependent variable is continuous
Simple linear regression
- 1 independent variable is analyzed
- If “y” has a linear relationship with an independent variable “x”, a graph plotting this relationship takes the form of a straight line (called regression line).
- In the case of simple linear regression, the equation of the regression line is: y = mx + b, with “m” representing the slope of the regression line, “y” the dependent variable, “x” the independent variable, and “b” the y-intercept (the value of y where the line crosses the y-axis)
- Multiple linear regression: >1 independent variable is analyzed
- Logistic regression: a type of regression in which the dependent variable is categorical
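For simple linear regression, the slope "m" and y-intercept "b" of the regression line y = mx + b can be estimated with ordinary least squares. The following is a minimal sketch; `fit_line` and the dose-response values are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope m and intercept b for y = m*x + b."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-products of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx              # slope of the regression line
    b = mean_y - m * mean_x    # value of y where the line crosses the y-axis
    return m, b

# Hypothetical data: drug dose (x) and measured response (y).
dose = [1, 2, 3, 4]
response = [3, 5, 7, 9]

m, b = fit_line(dose, response)
print(m, b)  # 2.0 1.0, i.e., y = 2x + 1
```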
- Definition: tests used to evaluate statistically significant differences between groups when the study sample has a normal distribution and the sample size is large
Pearson correlation coefficient (r)
- Compares interval level variables
- Calculates the estimated strength and direction of a relationship between two variables
- r is always a value between -1 and 1.
- A positive r-value = a positive correlation
- A negative r-value = negative correlation
- The closer the r-value is to 1 or -1 (i.e., the larger its absolute value), the stronger the correlation between the compared variables; an r-value of 0 indicates no linear correlation.
- The coefficient of determination = r² (the coefficient may be affected by extreme values and indicates the proportion of a variable's variance that can be predicted by the variance of another variable)
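The Pearson correlation coefficient and the coefficient of determination can be computed directly from their definitions. The following is a minimal sketch with a hypothetical helper `pearson_r` and made-up paired observations.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
r_squared = r ** 2   # coefficient of determination
print(r, r_squared)  # about 0.77 and 0.6
```

In this example, about 60% of the variance in one variable can be predicted from the other.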
- Calculates the difference between the means of two samples or between a sample and the population; used especially when samples are small and/or the population distribution is not known
- Used to determine the confidence intervals of a t-distribution
Two sample t-test
- Calculates whether the means of two groups differ from one another
- Unpaired t-test (independent samples t-test)
- Paired t-test (dependent samples t-test)
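The test statistics of the unpaired and paired t-tests can be sketched as follows. The unpaired version assumes equal variances in both groups (Student's t-test). Both functions are hypothetical helpers that return the t statistic and degrees of freedom only; obtaining a p-value additionally requires the t-distribution, which is not covered here.

```python
from math import sqrt
from statistics import mean, stdev, variance

def unpaired_t(group_a, group_b):
    """t statistic and degrees of freedom for two independent samples (pooled variance)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df

def paired_t(before, after):
    """t statistic and degrees of freedom for paired (dependent) samples."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [y - x for x, y in zip(before, after)]  # within-pair differences
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1

# Hypothetical measurements.
t_unpaired, df_unpaired = unpaired_t([1, 2, 3], [4, 5, 6])
t_paired, df_paired = paired_t([10, 12, 14, 16], [11, 14, 15, 18])
print(t_unpaired, df_unpaired)
print(t_paired, df_paired)
```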
Analysis of variance (ANOVA)
Calculates the statistically significant difference between ≥ 3 independent groups by comparing their means (an extension of the t-test)
- One-way analysis of variance
- Two-way analysis of variance: assesses the effect of 2 independent variables on an outcome (e.g., the mean height of women and the mean height of men in clinics A, B, and C at a point in time; the independent variables are gender and clinic, and the outcome is height)
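For one-way ANOVA, the F statistic compares the variability between group means with the variability within groups. The following is a minimal sketch with a hypothetical helper and made-up groups; it returns the F statistic and degrees of freedom but not a p-value, which would require the F-distribution.

```python
from statistics import mean

def one_way_anova_f(*groups):
    """F statistic with between-group and within-group degrees of freedom."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the observations around their own group mean.
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical measurements from three independent groups.
f_stat, df_between, df_within = one_way_anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7])
print(f_stat, df_between, df_within)  # F is about 13 with 2 and 6 degrees of freedom
```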
- Definition: tests used to evaluate the statistically significant difference between groups when the sample has a non-normal distribution and the sample size is small
Spearman correlation coefficient
- Calculates the relationship between two variables according to their rank
- Compares ordinal level variables
- Extreme values have a minimal effect on Spearman's coefficient.
- Less precise than the Pearson correlation coefficient because only the ranks of the values, not the values themselves, are used.
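Spearman's coefficient can be sketched as the Pearson correlation of the ranks of the values. The helpers below are hypothetical; tied values are assumed to receive the average of their ranks, which is a common convention.

```python
from math import sqrt

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data with one extreme value (1000).
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 1000]
print(pearson_r(xs, ys), spearman_rho(xs, ys))
```

Because only the ranks enter the calculation, the extreme value does not lower Spearman's coefficient, which is 1 for any strictly increasing relationship.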
Mann-Whitney U test
- Compares ordinal, interval, or ratio scales
- Calculates whether two independently chosen samples originate from the same population and have identical distributions and/or medians
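The U statistic of the Mann-Whitney U test can be computed by comparing every value in one sample with every value in the other. The following is a minimal sketch with a hypothetical helper; it returns the two U statistics but not a p-value, which requires the distribution of U or a normal approximation.

```python
def mann_whitney_u(sample_a, sample_b):
    """U statistics for both samples; ties between samples count as 0.5."""
    u_a = 0.0
    for x in sample_a:
        for y in sample_b:
            if x > y:
                u_a += 1.0     # value from sample A outranks value from sample B
            elif x == y:
                u_a += 0.5     # tie
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b

# Hypothetical ordinal scores from two independent groups of different sizes.
u_a, u_b = mann_whitney_u([1, 2, 3], [4, 5, 6, 7])
print(u_a, u_b)  # 0.0 12.0, the samples do not overlap at all
```

A U statistic close to 0 for one of the samples indicates that its values are consistently lower than those of the other sample.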
Wilcoxon test (rank sum and signed rank)
- Rank sum test: compares two independent groups, which may differ in size, on the basis of the ranks of their values rather than their means (equivalent to the Mann-Whitney U test)
- Signed rank test: compares pairs of scores that can be matched on the basis of the ranks of their differences; substitute for the paired (one-sample) t-test when a pre-intervention measure is compared with a post-treatment measure and the null hypothesis is that the treatment has no effect
- Definition: tests used to evaluate the statistically significant difference between groups with categorical variables (no mean values)
Chi-square test (χ² test)
- Calculates the difference between the frequencies in a sample
- Aims to determine how likely outcomes are to occur due to chance (used in cross-sectional studies)
Fisher's exact test
- Also calculates the difference between the frequencies in a sample but, unlike a Chi-square test, is used when the study sample is small
- Also aims to determine how likely it is that the outcomes occurred due to chance
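Both tests for categorical variables start from a contingency table of observed frequencies. The sketch below computes the Chi-square statistic from observed and expected frequencies, and the exact probability of a single 2 x 2 table, which is the building block of Fisher's exact test. The function names and counts are hypothetical; the full Fisher p-value, which sums this probability over all tables at least as extreme as the observed one, is not shown.

```python
from math import comb

def chi_square_statistic(table):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Frequency expected if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2

def fisher_table_probability(a, b, c, d):
    """Probability of the 2 x 2 table [[a, b], [c, d]] given fixed row and column totals."""
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)

# Hypothetical counts: rows = exposed/unexposed, columns = disease/no disease.
chi2 = chi_square_statistic([[10, 20], [20, 10]])
p_table = fisher_table_probability(1, 0, 0, 1)
print(chi2, p_table)
```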
To remember that the Chi-square test is used for categorical variables, think “Chi-tegorical”.