
ANOVA (Analysis of variance) – Formulas, Types, and Examples


Analysis of Variance (ANOVA)

Analysis of Variance (ANOVA) is a statistical method used to test differences between two or more means. It is similar to the t-test, but the t-test is generally used for comparing two means, while ANOVA is used when you have more than two means to compare.

ANOVA is based on comparing the variance (or variation) between the data samples to the variation within each particular sample. If the between-group variance is high and the within-group variance is low, this provides evidence that the means of the groups are significantly different.

ANOVA Terminology

When discussing ANOVA, there are several key terms to understand:

  • Factor : This is another term for the independent variable in your analysis. In a one-way ANOVA, there is one factor, while in a two-way ANOVA, there are two factors.
  • Levels : These are the different groups or categories within a factor. For example, if the factor is ‘diet’, the levels might be ‘low fat’, ‘medium fat’, and ‘high fat’.
  • Response Variable : This is the dependent variable or the outcome that you are measuring.
  • Within-group Variance : This is the variance or spread of scores within each level of your factor.
  • Between-group Variance : This is the variance or spread of scores between the different levels of your factor.
  • Grand Mean : This is the overall mean when you consider all the data together, regardless of the factor level.
  • Treatment Sums of Squares (SS) : This represents the between-group variability. It is the sum of the squared differences between the group means and the grand mean.
  • Error Sums of Squares (SS) : This represents the within-group variability. It’s the sum of the squared differences between each observation and its group mean.
  • Total Sums of Squares (SS) : This is the sum of the Treatment SS and the Error SS. It represents the total variability in the data.
  • Degrees of Freedom (df) : The degrees of freedom are the number of values that have the freedom to vary when computing a statistic. For example, if you have ‘n’ observations in one group, then the degrees of freedom for that group is ‘n-1’.
  • Mean Square (MS) : Mean Square is the average squared deviation and is calculated by dividing the sum of squares by the corresponding degrees of freedom.
  • F-Ratio : This is the test statistic for ANOVAs, and it’s the ratio of the between-group variance to the within-group variance. If the between-group variance is significantly larger than the within-group variance, the F-ratio will be large and likely significant.
  • Null Hypothesis (H0) : This is the hypothesis that there is no difference between the group means.
  • Alternative Hypothesis (H1) : This is the hypothesis that there is a difference between at least two of the group means.
  • p-value : This is the probability of obtaining a test statistic as extreme as the one that was actually observed, assuming that the null hypothesis is true. If the p-value is less than the significance level (usually 0.05), then the null hypothesis is rejected in favor of the alternative hypothesis.
  • Post-hoc tests : These are follow-up tests conducted after an ANOVA when the null hypothesis is rejected, to determine which specific groups’ means (levels) are different from each other. Examples include Tukey’s HSD, Scheffe, Bonferroni, among others.

Types of ANOVA

Types of ANOVA are as follows:

One-way (or one-factor) ANOVA

This is the simplest type of ANOVA, which involves one independent variable. For example, comparing the effect of different types of diet (vegetarian, pescatarian, omnivore) on cholesterol level.

Two-way (or two-factor) ANOVA

This involves two independent variables. It allows for testing the effect of each independent variable on the dependent variable, as well as testing whether there is an interaction effect between the independent variables on the dependent variable.

Repeated Measures ANOVA

This is used when the same subjects are measured multiple times under different conditions, or at different points in time. This type of ANOVA is often used in longitudinal studies.

Mixed Design ANOVA

This combines features of both between-subjects (independent groups) and within-subjects (repeated measures) designs. In this model, one factor is a between-subjects variable and the other is a within-subjects variable.

Multivariate Analysis of Variance (MANOVA)

This is used when there are two or more dependent variables. It tests whether changes in the independent variable(s) correspond to changes in the dependent variables.

Analysis of Covariance (ANCOVA)

This combines ANOVA and regression. ANCOVA tests whether certain factors have an effect on the outcome variable after removing the variance for which quantitative covariates (interval variables) account. This allows the comparison of an outcome variable between groups, while statistically controlling for the effect of other continuous variables that are not of primary interest.

Nested ANOVA

This model is used when the groups can be clustered into categories. For example, if you were comparing students’ performance from different classrooms and different schools, “classroom” could be nested within “school.”

ANOVA Formulas

ANOVA Formulas are as follows:

Sum of Squares Total (SST)

This represents the total variability in the data. It is the sum of the squared differences between each observation and the overall mean:

SST = Σ (yi − y_mean)²

  • yi represents each individual data point
  • y_mean represents the grand mean (mean of all observations)

Sum of Squares Within (SSW)

This represents the variability within each group or factor level. It is the sum of the squared differences between each observation and its group mean:

SSW = Σi Σj (yij − y_meani)²

  • yij represents each individual data point within a group
  • y_meani represents the mean of the ith group

Sum of Squares Between (SSB)

This represents the variability between the groups. It is the sum of the squared differences between each group mean and the grand mean, weighted by the number of observations in each group:

SSB = Σi ni (y_meani − y_mean)²

  • ni represents the number of observations in each group
  • y_mean represents the grand mean

Degrees of Freedom

The degrees of freedom are the number of values that have the freedom to vary when calculating a statistic.

For within groups: dfW = N − k

For between groups: dfB = k − 1

For total: dfT = N − 1

  • N represents the total number of observations
  • k represents the number of groups

Mean Squares

Mean squares are the sum of squares divided by the respective degrees of freedom.

Mean Squares Between: MSB = SSB / dfB

Mean Squares Within: MSW = SSW / dfW

F-Statistic

The F-statistic is used to test whether the variability between the groups is significantly greater than the variability within the groups. It is the ratio of the Mean Squares Between to the Mean Squares Within:

F = MSB / MSW

If the F-statistic is significantly higher than what would be expected by chance, we reject the null hypothesis that all group means are equal.
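To see how these pieces fit together numerically, here is a minimal sketch in Python (NumPy assumed installed; the three groups are hypothetical) that computes SST, SSB, SSW, the degrees of freedom, the mean squares, and the F-ratio.

```python
# Minimal sketch: computing SST, SSB, SSW, df, MS, and F for three hypothetical groups.
import numpy as np

groups = [np.array([3, 2, 2, 1, 2]),   # group 1 (hypothetical data)
          np.array([2, 3, 3, 2, 3]),   # group 2
          np.array([4, 4, 5, 5, 4])]   # group 3

all_values = np.concatenate(groups)
grand_mean = all_values.mean()

# Sums of squares
sst = ((all_values - grand_mean) ** 2).sum()                        # total
ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)    # between groups
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)              # within groups
assert np.isclose(sst, ssb + ssw)                                   # SST = SSB + SSW

# Degrees of freedom, mean squares, and the F-ratio
N, k = len(all_values), len(groups)
msb = ssb / (k - 1)   # MSB = SSB / dfB
msw = ssw / (N - k)   # MSW = SSW / dfW
print("F =", msb / msw)
```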

Examples of ANOVA

Example 1:

Suppose a psychologist wants to test the effect of three different types of exercise (yoga, aerobic exercise, and weight training) on stress reduction. The dependent variable is the stress level, which can be measured using a stress rating scale.

Here are hypothetical stress ratings for a group of participants after they followed each of the exercise regimes for a period:

  • Yoga: [3, 2, 2, 1, 2, 2, 3, 2, 1, 2]
  • Aerobic Exercise: [2, 3, 3, 2, 3, 2, 3, 3, 2, 2]
  • Weight Training: [4, 4, 5, 5, 4, 5, 4, 5, 4, 5]

The psychologist wants to determine if there is a statistically significant difference in stress levels between these different types of exercise.

To conduct the ANOVA:

1. State the hypotheses:

  • Null Hypothesis (H0): There is no difference in mean stress levels between the three types of exercise.
  • Alternative Hypothesis (H1): There is a difference in mean stress levels between at least two of the types of exercise.

2. Calculate the ANOVA statistics:

  • Compute the Sum of Squares Between (SSB), Sum of Squares Within (SSW), and Sum of Squares Total (SST).
  • Calculate the Degrees of Freedom (dfB, dfW, dfT).
  • Calculate the Mean Squares Between (MSB) and Mean Squares Within (MSW).
  • Compute the F-statistic (F = MSB / MSW).

3. Check the p-value associated with the calculated F-statistic.

  • If the p-value is less than the chosen significance level (often 0.05), then we reject the null hypothesis in favor of the alternative hypothesis. This suggests there is a statistically significant difference in mean stress levels between the three exercise types.

4. Post-hoc tests

  • If we reject the null hypothesis, we conduct a post-hoc test to determine which specific groups’ means (exercise types) are different from each other.
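The calculations in steps 2–4 can be reproduced in software. Below is a minimal sketch in Python using SciPy (assumed installed, version 1.8 or later for tukey_hsd) with the stress ratings listed above.

```python
# One-way ANOVA on the hypothetical stress ratings from Example 1 (SciPy assumed installed).
from scipy import stats

yoga    = [3, 2, 2, 1, 2, 2, 3, 2, 1, 2]
aerobic = [2, 3, 3, 2, 3, 2, 3, 3, 2, 2]
weights = [4, 4, 5, 5, 4, 5, 4, 5, 4, 5]

# Steps 2-3: omnibus F-test and its p-value
f_stat, p_value = stats.f_oneway(yoga, aerobic, weights)
print(f"F = {f_stat:.2f}, p = {p_value:.4g}")

# Step 4: post-hoc pairwise comparisons (Tukey's HSD), only if the omnibus test is significant
if p_value < 0.05:
    print(stats.tukey_hsd(yoga, aerobic, weights))
```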

Example 2:

Suppose an agricultural scientist wants to compare the yield of three varieties of wheat. The scientist randomly selects four fields for each variety and plants them. After harvest, the yield from each field is measured in bushels. Here are the hypothetical yields:

The scientist wants to know if the differences in yields are due to the different varieties or just random variation.

Here’s how to apply the one-way ANOVA to this situation:

  • Null Hypothesis (H0): The means of the three populations are equal.
  • Alternative Hypothesis (H1): At least one population mean is different.
  • Calculate the Degrees of Freedom (dfB for between groups, dfW for within groups, dfT for total).
  • If the p-value is less than the chosen significance level (often 0.05), then we reject the null hypothesis in favor of the alternative hypothesis. This would suggest there is a statistically significant difference in mean yields among the three varieties.
  • If we reject the null hypothesis, we conduct a post-hoc test to determine which specific groups’ means (wheat varieties) are different from each other.

How to Conduct ANOVA

Conducting an Analysis of Variance (ANOVA) involves several steps. Here’s a general guideline on how to perform it:

  • Null Hypothesis (H0): The means of all groups are equal.
  • Alternative Hypothesis (H1): At least one group mean is different from the others.
  • The significance level (often denoted as α) is usually set at 0.05. This implies that you are willing to accept a 5% chance that you are wrong in rejecting the null hypothesis.
  • Data should be collected for each group under study. Make sure that the data meet the assumptions of an ANOVA: normality, independence, and homogeneity of variances.
  • Calculate the Degrees of Freedom (df) for each sum of squares (dfB, dfW, dfT).
  • Compute the Mean Squares Between (MSB) and Mean Squares Within (MSW) by dividing the sum of squares by the corresponding degrees of freedom.
  • Compute the F-statistic as the ratio of MSB to MSW.
  • Determine the critical F-value from the F-distribution table using dfB and dfW.
  • If the calculated F-statistic is greater than the critical F-value, reject the null hypothesis.
  • If the p-value associated with the calculated F-statistic is smaller than the significance level (0.05 typically), you reject the null hypothesis.
  • If you rejected the null hypothesis, you can conduct post-hoc tests (like Tukey’s HSD) to determine which specific groups’ means (if you have more than two groups) are different from each other.
  • Regardless of the result, report your findings in a clear, understandable manner. This typically includes reporting the test statistic, p-value, and whether the null hypothesis was rejected.
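As an illustration of this workflow in software, here is a hedged sketch in Python using pandas and statsmodels (both assumed installed); the group labels and scores are hypothetical.

```python
# Hedged sketch of the ANOVA workflow with pandas and statsmodels; all data are hypothetical.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
from statsmodels.stats.multicomp import pairwise_tukeyhsd

df = pd.DataFrame({
    "group": ["A"] * 5 + ["B"] * 5 + ["C"] * 5,
    "score": [3, 2, 2, 1, 2,  2, 3, 3, 2, 3,  4, 4, 5, 5, 4],
})

# Fit the one-way model and print the ANOVA table (sum of squares, df, F, p-value)
model = ols("score ~ C(group)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))

# If the omnibus test is significant, follow up with a post-hoc test (Tukey's HSD)
print(pairwise_tukeyhsd(df["score"], df["group"], alpha=0.05))
```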

When to use ANOVA

ANOVA (Analysis of Variance) is used when you have three or more groups and you want to compare their means to see if they are significantly different from each other. It is a statistical method that is used in a variety of research scenarios. Here are some examples of when you might use ANOVA:

  • Comparing Groups : If you want to compare the performance of more than two groups, for example, testing the effectiveness of different teaching methods on student performance.
  • Evaluating Interactions : In a two-way or factorial ANOVA, you can test for an interaction effect. This means you are not only interested in the effect of each individual factor, but also whether the effect of one factor depends on the level of another factor.
  • Repeated Measures : If you have measured the same subjects under different conditions or at different time points, you can use repeated measures ANOVA to compare the means of these repeated measures while accounting for the correlation between measures from the same subject.
  • Experimental Designs : ANOVA is often used in experimental research designs when subjects are randomly assigned to different conditions and the goal is to compare the means of the conditions.

Here are the assumptions that must be met to use ANOVA:

  • Normality : The data should be approximately normally distributed.
  • Homogeneity of Variances : The variances of the groups you are comparing should be roughly equal. This assumption can be tested using Levene’s test or Bartlett’s test.
  • Independence : The observations should be independent of each other. This assumption is met if the data is collected appropriately with no related groups (e.g., twins, matched pairs, repeated measures).
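These checks can be run in most statistical packages. Here is a minimal sketch using SciPy (assumed installed) on three hypothetical groups: Shapiro–Wilk for normality, and Levene's and Bartlett's tests for homogeneity of variances.

```python
# Hedged sketch of assumption checks with SciPy; the three groups are hypothetical.
from scipy import stats

group1 = [3, 2, 2, 1, 2, 2, 3, 2, 1, 2]
group2 = [2, 3, 3, 2, 3, 2, 3, 3, 2, 2]
group3 = [4, 4, 5, 5, 4, 5, 4, 5, 4, 5]

# Normality within each group (Shapiro-Wilk)
for i, g in enumerate([group1, group2, group3], start=1):
    stat, p = stats.shapiro(g)
    print(f"Group {i}: Shapiro-Wilk W = {stat:.3f}, p = {p:.3f}")

# Homogeneity of variances: Levene's test (robust) and Bartlett's test (assumes normality)
print(stats.levene(group1, group2, group3))
print(stats.bartlett(group1, group2, group3))
```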

Applications of ANOVA

The Analysis of Variance (ANOVA) is a powerful statistical technique that is used widely across various fields and industries. Here are some of its key applications:

Agriculture

ANOVA is commonly used in agricultural research to compare the effectiveness of different types of fertilizers, crop varieties, or farming methods. For example, an agricultural researcher could use ANOVA to determine if there are significant differences in the yields of several varieties of wheat under the same conditions.

Manufacturing and Quality Control

ANOVA is used to determine if different manufacturing processes or machines produce different levels of product quality. For instance, an engineer might use it to test whether there are differences in the strength of a product based on the machine that produced it.

Marketing Research

Marketers often use ANOVA to test the effectiveness of different advertising strategies. For example, a marketer could use ANOVA to determine whether different marketing messages have a significant impact on consumer purchase intentions.

Healthcare and Medicine

In medical research, ANOVA can be used to compare the effectiveness of different treatments or drugs. For example, a medical researcher could use ANOVA to test whether there are significant differences in recovery times for patients who receive different types of therapy.

Education

ANOVA is used in educational research to compare the effectiveness of different teaching methods or educational interventions. For example, an educator could use it to test whether students perform significantly differently when taught with different teaching methods.

Psychology and Social Sciences

Psychologists and social scientists use ANOVA to compare group means on various psychological and social variables. For example, a psychologist could use it to determine if there are significant differences in stress levels among individuals in different occupations.

Biology and Environmental Sciences

Biologists and environmental scientists use ANOVA to compare different biological and environmental conditions. For example, an environmental scientist could use it to determine if there are significant differences in the levels of a pollutant in different bodies of water.

Advantages of ANOVA

Here are some advantages of using ANOVA:

Comparing Multiple Groups: One of the key advantages of ANOVA is the ability to compare the means of three or more groups. This makes it more powerful and flexible than the t-test, which is limited to comparing only two groups.

Control of Type I Error: When comparing multiple groups, the chance of making a Type I error (false positive) increases. One of the strengths of ANOVA is that it controls the Type I error rate across all comparisons. This is in contrast to performing multiple pairwise t-tests, which can inflate the Type I error rate.

Testing Interactions: In factorial ANOVA, you can test not only the main effect of each factor, but also the interaction effect between factors. This can provide valuable insights into how different factors or variables interact with each other.

Handling Continuous and Categorical Variables: ANOVA can handle both continuous and categorical variables. The dependent variable is continuous and the independent variables are categorical.

Robustness: ANOVA is considered robust to violations of normality assumption when group sizes are equal. This means that even if your data do not perfectly meet the normality assumption, you might still get valid results.

Provides Detailed Analysis: ANOVA provides a detailed breakdown of variances and interactions between variables which can be useful in understanding the underlying factors affecting the outcome.

Capability to Handle Complex Experimental Designs: Advanced types of ANOVA (like repeated measures ANOVA, MANOVA, etc.) can handle more complex experimental designs, including those where measurements are taken on the same subjects over time, or when you want to analyze multiple dependent variables at once.

Disadvantages of ANOVA

ANOVA also has some limitations or disadvantages that are important to consider:

Assumptions: ANOVA relies on several assumptions including normality (the data follows a normal distribution), independence (the observations are independent of each other), and homogeneity of variances (the variances of the groups are roughly equal). If these assumptions are violated, the results of the ANOVA may not be valid.

Sensitivity to Outliers: ANOVA can be sensitive to outliers. A single extreme value in one group can affect the sum of squares and consequently influence the F-statistic and the overall result of the test.

Dichotomous Variables: ANOVA is not suitable for dichotomous variables (variables that can take only two values, like yes/no or male/female). It is used to compare the means of groups for a continuous dependent variable.

Lack of Specificity: Although ANOVA can tell you that there is a significant difference between groups, it doesn’t tell you which specific groups are significantly different from each other. You need to carry out further post-hoc tests (like Tukey’s HSD or Bonferroni) for these pairwise comparisons.

Complexity with Multiple Factors: When dealing with multiple factors and interactions in factorial ANOVA, interpretation can become complex. The presence of interaction effects can make main effects difficult to interpret.

Requires Larger Sample Sizes: To detect an effect of a certain size, ANOVA generally requires larger sample sizes than a t-test.

Equal Group Sizes: While not always a strict requirement, ANOVA is most powerful and its assumptions are most likely to be met when groups are of equal or similar sizes.



ANOVA Test: Definition, Types, Examples, SPSS


The ANOVA Test


An ANOVA test is a way to find out if survey or experiment results are significant. In other words, it helps you figure out whether you need to reject the null hypothesis or accept the alternate hypothesis.

Basically, you’re testing groups to see if there’s a difference between them. Examples of when you might want to test different groups:

  • A group of psychiatric patients are trying three different therapies: counseling, medication and biofeedback. You want to see if one therapy is better than the others.
  • A manufacturer has two different processes to make light bulbs. They want to know if one process is better than the other.
  • Students from different colleges take the same exam. You want to see if one college outperforms the others.

What Does “One-Way” or “Two-Way” Mean?

One-way or two-way refers to the number of independent variables (IVs) in your Analysis of Variance test.

  • One-way has one independent variable (with two or more levels). For example: brand of cereal.
  • Two-way has two independent variables (each can have multiple levels). For example: brand of cereal, calories.

What are “Groups” or “Levels”?

Groups or levels are different groups within the same independent variable. In the above example, your levels for “brand of cereal” might be Lucky Charms, Raisin Bran, Cornflakes — a total of three levels. Your levels for “Calories” might be: sweetened, unsweetened — a total of two levels.

Let’s say you are studying whether medication and individual counseling combined is the most effective treatment for lowering alcohol consumption. You might split the study participants into three groups or levels:

  • Medication only,
  • Medication and counseling,
  • Counseling only.

Your dependent variable would be the number of alcoholic beverages consumed per day.

If your groups or levels have a hierarchical structure (each level has unique subgroups), then use a nested ANOVA for the analysis.

What Does “Replication” Mean?

It’s whether you are replicating (i.e., duplicating) your test(s) with multiple groups. With a two way ANOVA with replication, you have two groups, and individuals within those groups are doing more than one thing (i.e., two groups of students from two colleges taking two tests). If you only have one group taking two tests, you would use without replication.

Types of Tests

There are two main types: one-way and two-way. Two-way tests can be with or without replication.

  • One-way ANOVA between groups: used when you want to test two or more groups to see if there’s a difference between them.
  • Two way ANOVA without replication: used when you have one group and you’re double-testing that same group. For example, you’re testing one set of individuals before and after they take a medication to see if it works or not.
  • Two way ANOVA with replication: Two groups, and the members of those groups are doing more than one thing. For example, two groups of patients from different hospitals trying two different therapies.


One Way ANOVA

A one way ANOVA is used to compare the means of two or more independent (unrelated) groups using the F-distribution. The null hypothesis for the test is that the group means are equal. Therefore, a significant result means that at least two of the means are unequal.

Examples of when to use a one way ANOVA

Situation 1: You have a group of individuals randomly split into smaller groups and completing different tasks. For example, you might be studying the effects of tea on weight loss and form three groups: green tea, black tea, and no tea.

Situation 2: Similar to situation 1, but in this case the individuals are split into groups based on an attribute they possess. For example, you might be studying leg strength of people according to weight. You could split participants into weight categories (obese, overweight and normal) and measure their leg strength on a weight machine.

Limitations of the One Way ANOVA

A one way ANOVA will tell you that at least two groups were different from each other. But it won’t tell you which groups were different. If your test returns a significant F-statistic, you may need to run a post hoc test (like the Least Significant Difference test) to tell you exactly which groups had a difference in means.

Two Way ANOVA

A Two Way ANOVA is an extension of the One Way ANOVA. With a One Way, you have one independent variable affecting a dependent variable. With a Two Way ANOVA, there are two independent variables. Use a two way ANOVA when you have one measurement variable (i.e. a quantitative variable) and two nominal variables. In other words, if your experiment has a quantitative outcome and you have two categorical explanatory variables, a two way ANOVA is appropriate.

For example, you might want to find out if there is an interaction between income and gender for anxiety level at job interviews. The anxiety level is the outcome, or the variable that can be measured. Gender and Income are the two categorical variables. These categorical variables are also the independent variables, which are called factors in a Two Way ANOVA.

The factors can be split into levels. In the above example, income level could be split into three levels: low, middle and high income. Gender could be split into three levels: male, female, and transgender. Treatment groups are all possible combinations of the factors. In this example there would be 3 x 3 = 9 treatment groups.

Main Effect and Interaction Effect

The results from a Two Way ANOVA will calculate a main effect and an interaction effect. The main effect is similar to a One Way ANOVA: each factor’s effect is considered separately. With the interaction effect, all factors are considered at the same time. Interaction effects between factors are easier to test if there is more than one observation in each cell. For the above example, multiple anxiety scores could be entered into cells. If you do enter multiple observations into cells, the number in each cell must be equal.

Two null hypotheses are tested if you are placing one observation in each cell. For this example, those hypotheses would be: H01: All the income groups have equal mean anxiety. H02: All the gender groups have equal mean anxiety.

For multiple observations in cells, you would also be testing a third hypothesis: H03: The factors are independent, or the interaction effect does not exist.

An F-statistic is computed for each hypothesis you are testing.
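As a rough sketch of how these hypotheses could be tested in software, the following Python example (statsmodels assumed installed; the data values and column names are hypothetical) fits a two-way model with an interaction term for the income-and-gender example, producing one F-statistic per hypothesis.

```python
# Hedged sketch of a two-way ANOVA with interaction; data and column names are hypothetical.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.DataFrame({
    "income":  ["low", "low", "middle", "middle", "high", "high"] * 2,
    "gender":  ["male"] * 6 + ["female"] * 6,
    "anxiety": [8, 7, 6, 6, 4, 5, 7, 8, 6, 5, 5, 4],
})

# One F-statistic per hypothesis: each main effect and the interaction
model = ols("anxiety ~ C(income) + C(gender) + C(income):C(gender)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```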

Assumptions for Two Way ANOVA

  • The population must be close to a normal distribution.
  • Samples must be independent.
  • Population variances must be equal (i.e. homoscedastic).
  • Groups must have equal sample sizes.

MANOVA

MANOVA is just an ANOVA with several dependent variables. It’s similar to many other tests and experiments in that its purpose is to find out if the response variable (i.e. your dependent variable) is changed by manipulating the independent variable. The test helps to answer many research questions, including:

  • Do changes to the independent variables have statistically significant effects on dependent variables?
  • What are the interactions among dependent variables?
  • What are the interactions among independent variables?

MANOVA Example

Suppose you wanted to find out if a difference in textbooks affected students’ scores in math and science. Improvements in math and science mean that there are two dependent variables, so a MANOVA is appropriate.
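As a hedged sketch, this kind of design could be analyzed in Python with statsmodels (assumed installed); the textbook labels and scores below are hypothetical.

```python
# Hedged sketch of the textbook example as a MANOVA; data values are hypothetical.
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

df = pd.DataFrame({
    "textbook": ["A"] * 4 + ["B"] * 4,
    "math":     [78, 82, 75, 80, 85, 88, 84, 90],
    "science":  [72, 75, 70, 74, 80, 83, 79, 85],
})

# Two dependent variables (math, science), one independent variable (textbook)
fit = MANOVA.from_formula("math + science ~ textbook", data=df)
print(fit.mv_test())
```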

An ANOVA will give you a single (univariate) F-value while a MANOVA will give you a multivariate F-value. MANOVA tests the multiple dependent variables by creating new, artificial dependent variables that maximize group differences. These new dependent variables are linear combinations of the measured dependent variables.

Interpreting the MANOVA results

If the multivariate F value indicates the test is statistically significant, this means that something is significant. In the above example, you would not know if math scores have improved, science scores have improved (or both). Once you have a significant result, you would then have to look at each individual component (the univariate F tests) to see which dependent variable(s) contributed to the statistically significant result.

Advantages and Disadvantages of MANOVA vs. ANOVA

Advantages

  • MANOVA enables you to test multiple dependent variables.
  • MANOVA can protect against Type I errors.

Disadvantages

  • MANOVA is many times more complicated than ANOVA, making it a challenge to see which independent variables are affecting dependent variables.
  • One degree of freedom is lost with the addition of each new variable.
  • The dependent variables should be uncorrelated as much as possible. If they are correlated, the loss in degrees of freedom means that there isn’t much advantage to including more than one dependent variable in the test.

Reference : SFSU. Retrieved April 18, 2022 from: http://online.sfsu.edu/efc/classes/biol710/manova/MANOVAnewest.pdf

Factorial ANOVA

A factorial ANOVA is an Analysis of Variance test with more than one independent variable, or “factor”. It can also refer to more than one level of an independent variable. For example, an experiment with a treatment group and a control group has one factor (the treatment) but two levels (the treatment and the control). The terms “two-way” and “three-way” refer to the number of factors in your test. Four-way ANOVA and above are rarely used because the results of the test are complex and difficult to interpret.

  • A two-way ANOVA has two factors (independent variables) and one dependent variable. For example, time spent studying and prior knowledge are factors that affect how well you do on a test.
  • A three-way ANOVA has three factors (independent variables) and one dependent variable. For example, time spent studying, prior knowledge, and hours of sleep are factors that affect how well you do on a test.

Factorial ANOVA is an efficient way of conducting a test. Instead of performing a series of experiments where you test one independent variable against one dependent variable, you can test all independent variables at the same time.

Variability

In a one-way ANOVA, variability is due to the differences between groups and the differences within groups. In factorial ANOVA, each level and factor are paired up with each other (“crossed”). This helps you to see what interactions are going on between the levels and factors. If there is an interaction then the differences in one factor depend on the differences in another.

Let’s say you were running a two-way ANOVA to test male/female performance on a final exam. The subjects had either 4, 6, or 8 hours of sleep.

  • IV1: SEX (Male/Female)
  • IV2: SLEEP (4/6/8)
  • DV: Final Exam Score

A two-way factorial ANOVA would help you answer the following questions:

  • Is sex a main effect? In other words, do men and women differ significantly on their exam performance?
  • Is sleep a main effect? In other words, do people who have had 4, 6, or 8 hours of sleep differ significantly in their performance?
  • Is there a significant interaction between factors? In other words, how do hours of sleep and sex interact with regards to exam performance?
  • Can any differences in sex and exam performance be found in the different levels of sleep?
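One practical way to explore the interaction question before (or alongside) fitting the two-way model is an interaction plot. Here is a hedged sketch using statsmodels and matplotlib (both assumed installed); the exam scores are hypothetical.

```python
# Hedged sketch: visualizing the SEX x SLEEP interaction; the exam-score data are hypothetical.
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.graphics.factorplots import interaction_plot

df = pd.DataFrame({
    "sex":   ["M", "M", "M", "F", "F", "F"] * 2,
    "sleep": [4, 6, 8, 4, 6, 8] * 2,
    "score": [62, 70, 75, 68, 74, 83, 60, 72, 77, 66, 76, 85],
})

# Roughly parallel lines suggest no interaction; crossing or diverging lines suggest one
fig = interaction_plot(x=df["sleep"], trace=df["sex"], response=df["score"])
plt.show()
```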

Assumptions of Factorial ANOVA

  • Normality: the dependent variable is normally distributed.
  • Independence: Observations and groups are independent from each other.
  • Equality of Variance: the population variances are equal across factors/levels.

How to Run an ANOVA

These tests are very time-consuming by hand. In nearly every case you’ll want to use software. For example, several options are available in Excel:

  • Two way ANOVA in Excel with replication and without replication.
  • One way ANOVA in Excel 2013.


ANOVA tests in statistics packages are run on parametric data. If you have rank or ordered data, you’ll want to run a non-parametric ANOVA (usually found under a different heading in the software, like “nonparametric tests”).

It is unlikely you’ll want to do this test by hand, but if you must, these are the steps you’ll want to take:

  • Find the mean for each of the groups.
  • Find the overall mean (the mean of the groups combined).
  • Find the Within Group Variation: the total deviation of each member’s score from the Group Mean.
  • Find the Between Group Variation: the deviation of each Group Mean from the Overall Mean.
  • Find the F statistic: the ratio of Between Group Variation to Within Group Variation.

ANOVA vs. T Test

A Student’s t-test will tell you if there is a significant difference between two groups. A t-test compares two means, while ANOVA compares three or more means by analyzing the ratio of between-group to within-group variance. You could technically perform a series of t-tests on your data. However, as the number of groups grows, you may end up with a lot of pairwise comparisons to run. ANOVA will give you a single number (the F-statistic) and one p-value to help you support or reject the null hypothesis.

Repeated Measures (Within Subjects) ANOVA

A repeated measures ANOVA is almost the same as one-way ANOVA, with one main difference: you test related groups, not independent ones.

It’s called Repeated Measures because the same group of participants is being measured over and over again. For example, you could be studying the cholesterol levels of the same group of patients at 1, 3, and 6 months after changing their diet. For this example, the independent variable is “time” and the dependent variable is “cholesterol.” The independent variable is usually called the within-subjects factor.

Repeated measures ANOVA is similar to a simple multivariate design. In both tests, the same participants are measured over and over. However, with repeated measures the same characteristic is measured under a different condition each time. For example, blood pressure is measured over the condition “time.” In a simple multivariate design, it is the characteristic that changes: for example, you could measure blood pressure, heart rate and respiration rate over time.

Reasons to use Repeated Measures ANOVA

  • When you collect data from the same participants over a period of time, individual differences (a source of between group differences) are reduced or eliminated.
  • Testing is more powerful because the sample size isn’t divided between groups.
  • The test can be economical, as you’re using the same participants.

Assumptions for Repeated Measures ANOVA

The results from your repeated measures ANOVA will be valid only if the following assumptions haven’t been violated:

  • There must be one independent variable and one dependent variable.
  • The dependent variable must be a continuous variable, on an interval scale or a ratio scale.
  • The independent variable must be categorical, either on the nominal scale or ordinal scale.
  • Ideally, the levels of dependence between pairs of groups are equal (“sphericity”). Corrections are possible if this assumption is violated.
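As a hedged sketch of how such a design could be analyzed in software, the following Python example uses statsmodels’ AnovaRM (statsmodels and pandas assumed installed); the cholesterol measurements are hypothetical, following the 1-, 3-, and 6-month example above.

```python
# Hedged sketch of a one-way repeated measures ANOVA; the data are hypothetical.
import pandas as pd
from statsmodels.stats.anova import AnovaRM

data = pd.DataFrame({
    "subject":     [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "month":       [1, 3, 6] * 4,
    "cholesterol": [210, 200, 195, 225, 215, 205, 198, 192, 190, 240, 230, 220],
})

# Within-subjects factor: month; dependent variable: cholesterol
result = AnovaRM(data, depvar="cholesterol", subject="subject", within=["month"]).fit()
print(result)
```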

One Way Repeated Measures ANOVA in SPSS: Steps


Step 2: Replace the “factor1” name with something that represents your independent variable. For example, you could put “age” or “time.”

Step 3: Enter the “Number of Levels.” This is how many times the dependent variable has been measured. For example, if you took measurements every week for a total of 4 weeks, this number would be 4.

Step 4: Click the “Add” button and then give your dependent variable a name.


Step 7: Click “Plots” and use the arrow keys to transfer the factor from the left box onto the Horizontal Axis box.


Step 9: Click “Options”, then transfer your factors from the left box to the Display Means for box on the right.

Step 10: Click the following check boxes:

  • Compare main effects.
  • Descriptive Statistics.
  • Estimates of Effect Size.

Step 11: Select “Bonferroni” from the drop down menu under Confidence Interval Adjustment.

Step 12: Click “Continue” and then click “OK” to run the test.

Sphericity

In statistics, sphericity (ε) refers to Mauchly’s sphericity test, which was developed in 1940 by John W. Mauchly, who co-developed the first general-purpose electronic computer.

Sphericity is used as an assumption in repeated measures ANOVA. The assumption states that the variances of the differences between all possible group pairs are equal. If your data violates this assumption, it can result in an increase in a Type I error (the incorrect rejection of the null hypothesis).

It’s very common for repeated measures ANOVA to result in a violation of the assumption. If the assumption has been violated, corrections have been developed that can avoid increases in the Type I error rate. The correction is applied to the degrees of freedom in the F-distribution.

Mauchly’s Sphericity Test

Mauchly’s test for sphericity can be run in the majority of statistical software, where it tends to be the default test for sphericity. Mauchly’s test is ideal for mid-size samples. It may fail to detect sphericity in small samples and it may over-detect in large samples. If the test returns a small p-value (p ≤ .05), this is an indication that your data has violated the assumption. For example, SPSS output for an ANOVA might show a significance (“Sig.”) value for Mauchly’s test of .274, which means that the assumption has not been violated for that set of data.

You would report the above result as “Mauchly’s Test indicated that the assumption of sphericity had not been violated, χ²(2) = 2.588, p = .274.”

If your test returned a small p-value, you should apply a correction, usually either the:

  • Greenhouse-Geisser correction.
  • Huynh-Feldt correction.

When ε ≤ 0.75 (or you don’t know what the value for the statistic is), use the Greenhouse-Geisser correction. When ε > .75, use the Huynh-Feldt correction.



The Ultimate Guide to ANOVA

Get all of your ANOVA questions answered here

ANOVA is the go-to analysis tool for classical experimental design, which forms the backbone of scientific research.

In this article, we’ll guide you through what ANOVA is, how to determine which version to use to evaluate your particular experiment, and provide detailed examples for the most common forms of ANOVA.

This includes a (brief) discussion of crossed, nested, fixed and random factors, and covers the majority of ANOVA models that a scientist would encounter before requiring the assistance of a statistician or modeling expert.

What is ANOVA used for?

ANOVA, or (Fisher’s) analysis of variance, is a critical analytical technique for evaluating differences between three or more sample means from an experiment. As the name implies, it partitions out the variance in the response variable based on one or more explanatory factors.

As you will see, there are many types of ANOVA such as one-, two-, and three-way ANOVA as well as nested and repeated measures ANOVA. The graphic below shows a simple example of an experiment that requires ANOVA, in which researchers measured the levels of neutrophil extracellular traps (NETs) in plasma across patients with different viral respiratory infections.

[Figure: NET levels in plasma for patients with different viral respiratory infections]

Many researchers may not realize that, for the majority of experiments, the characteristics of the experiment that you run dictate the ANOVA that you need to use to test the results. While it’s a massive topic (with professional training needed for some of the advanced techniques), this is a practical guide covering what most researchers need to know about ANOVA.

When should I use ANOVA?

If your response variable is numeric, and you’re looking for how that number differs across several categorical groups, then ANOVA is an ideal place to start. After running an experiment, ANOVA is used to analyze whether there are differences between the mean response of one or more of these grouping factors.

ANOVA can handle a large variety of experimental factors such as repeated measures on the same experimental unit (e.g., before/during/after).

If instead of evaluating treatment differences, you want to develop a model using a set of numeric variables to predict that numeric response variable, see linear regression and t tests.

What is the difference between one-way, two-way and three-way ANOVA?

The number of “ways” in ANOVA (e.g., one-way, two-way, …) is simply the number of factors in your experiment.

Although the difference in names sounds trivial, the complexity of ANOVA increases greatly with each added factor. To use an example from agriculture, let’s say we have designed an experiment to research how different factors influence the yield of a crop.

An experiment with a single factor

In the most basic version, we want to evaluate three different fertilizers. Because we have more than two groups, we have to use ANOVA. Since there is only one factor (fertilizer), this is a one-way ANOVA. One-way ANOVA is the easiest to analyze and understand, but probably not that useful in practice, because having only one factor is a pretty simplistic experiment.

What happens when you add a second factor? 

If we have two different fields, we might want to add a second factor to see if the field itself influences growth. Within each field, we apply all three fertilizers (which is still the main interest). This is called a crossed design. In this case we have two factors, field and fertilizer, and would need a two-way ANOVA.

As you might imagine, this makes interpretation more complicated (although still very manageable) simply because more factors are involved. There is now a fertilizer effect, as well as a field effect, and there could be an interaction effect, where the fertilizer behaves differently on each field.

How about adding a third factor?

Finally, it is possible to have more than two factors in an ANOVA. In our example, perhaps you also wanted to test out different irrigation systems. You could have a three-way ANOVA due to the presence of fertilizer, field, and irrigation factors. This greatly increases the complexity.

Now in addition to the three main effects (fertilizer, field and irrigation), there are three two-way interaction effects (fertilizer by field, fertilizer by irrigation, and field by irrigation), and one three-way interaction effect.

If any of the interaction effects are statistically significant, then presenting the results gets quite complicated. “Fertilizer A works better on Field B with Irrigation Method C ….”

In practice, two-way ANOVA is often as complex as many researchers want to get before consulting with a statistician. That being said, three-way ANOVAs are cumbersome, but manageable when each factor only has two levels.

What are crossed and nested factors?

In addition to increasing the difficulty with interpretation, experiments (or the resulting ANOVA) with more than one factor add another level of complexity, which is determining whether the factors are crossed or nested.

With crossed factors, every combination of levels among each factor is observed. For example, each fertilizer is applied to each field (so the fields are subdivided into three sections in this case).

With nested factors, different levels of a factor appear within another factor. An example is applying different fertilizers to each field, such as fertilizers A and B to field 1 and fertilizers C and D to field 2. See more about nested ANOVA here .

What are fixed and random factors?

Another challenging concept with two or more factors is determining whether to treat the factors as fixed or random. 

Fixed factors are used when all levels of a factor (e.g., Fertilizer A, Fertilizer B, Fertilizer C) are specified and you want to determine the effect that factor has on the mean response. 

Random factors are used when only some levels of a factor are observed (e.g., Field 1, Field 2, Field 3) out of a large or infinite possible number (e.g., all fields). Because you did not observe all possible levels, you cannot estimate the effect of each one; instead, you quantify the variability added within that factor (e.g., the variability added within each field).

Many introductory courses on ANOVA only discuss fixed factors, and we will largely follow suit other than with two specific scenarios (nested factors and repeated measures). 

What are the (practical) assumptions of ANOVA?

These are one-way ANOVA assumptions, but they also carry over to more complicated two-way or repeated measures ANOVA.

  • Categorical treatment or factor variables - ANOVA evaluates mean differences between one or more categorical variables (such as treatment groups), which are referred to as factors or “ways.”
  • Three or more groups - There must be at least three distinct groups (or levels of a categorical variable) across all factors in an ANOVA. The possibilities are endless: one factor of three different groups, two factors of two groups each (2x2), and so on. If you have fewer than three groups, you can probably get away with a simple t-test.
  • Numeric Response - While the groups are categorical, the data measured in each group (i.e., the response variable) still needs to be numeric. ANOVA is fundamentally a quantitative method for measuring the differences in a numeric response between groups. If your response variable isn’t continuous, then you need a more specialized modelling framework such as logistic regression or chi-square contingency table analysis to name a few.
  • Random assignment - The makeup of each experimental group should be determined by random selection.
  • Normality - The distribution within each factor combination should be approximately normal, although ANOVA is fairly robust to this assumption as the sample size increases due to the central limit theorem.

What is the formula for ANOVA?

The formula to calculate ANOVA varies depending on the number of factors, assumptions about how the factors influence the model (blocking variables, fixed or random effects, nested factors, etc.), and any potential overlap or correlation between observed values (e.g., subsampling, repeated measures). 

The good news about running ANOVA in the 21st century is that statistical software handles the majority of the tedious calculations. The main thing that a researcher needs to do is select the appropriate ANOVA.

An example formula for a two-factor crossed ANOVA is:

y_ijk = μ + α_i + β_j + (αβ)_ij + ε_ijk

Here y_ijk is the k-th observation at level i of the first factor and level j of the second, μ is the overall mean, α_i and β_j are the two main effects, (αβ)_ij is their interaction, and ε_ijk is the random error term.

How do I know which ANOVA to use?

As statisticians, we like to imagine that you’re reading this before you’ve run your experiment. You can save a lot of headache by simplifying an experiment into a standard format (when possible) to make the analysis straightforward.

Regardless, we’ll walk you through picking the right ANOVA for your experiment and provide examples for the most popular cases. The first question is:

Do you only have a single factor of interest?

If you have only measured a single factor (e.g., fertilizer A, fertilizer B, etc.), then use one-way ANOVA. If you have more than one, then you need to consider the following:

Are you measuring the same observational unit (e.g., subject) multiple times?

This is where repeated measures come into play and can be a really confusing question for researchers, but if this sounds like it might describe your experiment, see repeated measures ANOVA . Otherwise:

Are any of the factors nested, where the levels are different depending on the levels of another factor?

In this case, you have a nested ANOVA design. If you don’t have nested factors or repeated measures, then it becomes simple:

Do you have two categorical factors?

Then use two-way ANOVA.

Do you have three categorical factors?

Use three-way ANOVA.

Do you have variables that you recorded that aren’t categorical (such as age, weight, etc.)?

Although these are outside the scope of this guide, if you have a single continuous variable, you might be able to use ANCOVA, which allows for a continuous covariate. With multiple continuous covariates, you probably want to use a mixed model or possibly multiple linear regression.

Prism does offer multiple linear regression but assumes that all factors are fixed. A full “mixed model” analysis is not yet available in Prism, but is offered as options within the one- and two-way ANOVA parameters.

How do I perform ANOVA?

Once you’ve determined which ANOVA is appropriate for your experiment, use statistical software to run the calculations. Below, we provide detailed examples of one-, two- and three-way ANOVA models.

How do I read and interpret an ANOVA table?

Interpreting any kind of ANOVA should start with the ANOVA table in the output. These tables are what give ANOVA its name, since they partition out the variance in the response into the various factors and interaction terms. This is done by calculating the sum of squares (SS) and mean squares (MS), which can be used to determine the variance in the response that is explained by each factor.

If you have predetermined your level of significance, interpretation mostly comes down to the p-values that come from the F-tests. The null hypothesis for each factor is that there is no significant difference between groups of that factor. All of the following factors are statistically significant with a very small p-value.

[Figure: example ANOVA table with sum of squares, mean squares, F-values, and p-values for each factor]

One-way ANOVA Example

An example of one-way ANOVA is an experiment of cell growth in petri dishes. The response variable is a measure of their growth, and the variable of interest is treatment, which has three levels: formula A, formula B, and a control.

Classic one-way ANOVA assumes equal variances within each sample group. If that isn’t a valid assumption for your data, you have a number of alternatives.

Calculating a one-way ANOVA

Using Prism to do the analysis, we will run a one-way ANOVA and use a significance level of 0.05 (a 95% confidence threshold). Since we are interested in the differences between each of the three groups, we will evaluate each and correct for multiple comparisons (more on this later!).

For the following, we’ll assume equal variances within the treatment groups. Consider the ANOVA summary output below.

[Figure: one-way ANOVA summary table]

The first test to look at is the overall (or omnibus) F-test, with the null hypothesis that there is no significant difference between any of the treatment groups. In this case, there is a significant difference between the three groups (p<0.0001), which tells us that at least one of the groups has a statistically significant difference.

Now we can move to the heart of the issue, which is to determine which group means are statistically different. To learn more, we should graph the data and test the differences (using a multiple comparison correction).

Graphing one-way ANOVA

The easiest way to visualize the results from an ANOVA is to use a simple chart that shows all of the individual points. Rather than a bar chart, it’s best to use a plot that shows all of the data points (and means) for each group such as a scatter or violin plot.

As an example, below you can see a graph of the cell growth levels for each data point in each treatment group, along with a line to represent their mean. This can help give credence to any significant differences found, as well as show how closely groups overlap.

[Figure: scatter plot of cell growth for each treatment group, with a line marking each group mean]

Determining statistical significance between groups

In addition to the graphic, what we really want to know is which treatment means are statistically different from each other. Because we are performing multiple tests, we’ll use a multiple comparison correction. For our example, we’ll use Tukey’s correction (although if we were only interested in the difference between each formula and the control, we could use Dunnett’s correction instead).

In this case, the mean cell growth for Formula A is significantly higher than the control (p<0.0001) and Formula B (p=0.002), but there’s no significant difference between Formula B and the control.

[Figure: multiple comparisons results for the three treatment groups]
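As a hedged sketch of the two corrections mentioned above, here is how the comparisons could be run in Python with SciPy (tukey_hsd requires SciPy 1.8+, dunnett requires SciPy 1.11+); the growth values are hypothetical, not the data shown in the figures.

```python
# Hedged sketch of Tukey's HSD versus Dunnett's test in SciPy; data are hypothetical.
from scipy import stats

control   = [10.2, 11.1, 10.8, 10.5, 11.0]
formula_a = [12.9, 13.4, 13.1, 12.7, 13.6]
formula_b = [10.9, 11.5, 11.2, 11.0, 11.4]

# All pairwise comparisons: Tukey's HSD
print(stats.tukey_hsd(control, formula_a, formula_b))

# Each formula versus the control only: Dunnett's test
print(stats.dunnett(formula_a, formula_b, control=control))
```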

Two-way ANOVA example

For two-way ANOVA, there are two factors involved. Our example will focus on a case of cell lines. Suppose we have a 2x2 design (four total groupings). There are two different treatments (serum-starved and normal culture) and two different fields. There are 19 total cell line “experimental units” being evaluated, up to 5 in each group (note that with 4 groups and 19 observational units, this study isn’t balanced). Although there are multiple units in each group, they are all completely different replicates and therefore not repeated measures of the same unit.

As with one-way ANOVA, it’s a good idea to graph the data as well as look at the ANOVA table for results.

Graphing two-way ANOVA

There are many options here. Like our one-way example, we recommend a similar graphing approach that shows all the data points themselves along with the means.

Determining statistical significance between groups in two-way ANOVA

Let’s use a two-way ANOVA with a 95% significance threshold to evaluate both factors’ effects on the response, a measure of growth.

Feel free to use our two-way ANOVA checklist as often as you need for your own analysis.

First, notice there are three sources of variation included in the model, which are interaction, treatment, and field. 

The first effect to look at is the interaction term, because if it’s significant, it changes how you interpret the main effects (e.g., treatment and field). The interaction effect calculates if the effect of a factor depends on the other factor. In this case, the significant interaction term (p<.0001) indicates that the treatment effect depends on the field type.

7 two-way-results - Anova

A significant interaction term muddies the interpretation, so that you no longer have the simple conclusion that “Treatment A outperforms Treatment B.” In this case, the graphic is particularly useful. It suggests that while there may be some difference between three of the groups, the precise combination of serum starved in field 2 outperformed the rest.

To confirm whether there is a statistically significant result, we would run pairwise comparisons (comparing each factor level combination with every other one) and account for multiple comparisons.

Do I need to correct for multiple comparisons for two-way ANOVA?

If you’re comparing the means for more than one combination of treatment groups, then absolutely! Here’s more information about multiple comparisons for two-way ANOVA .

Repeated measures ANOVA

So far we have focused almost exclusively on “ordinary” ANOVA and its differences depending on how many factors are involved. In all of these cases, each observation is completely unrelated to the others. Other than the combination of factors that may be the same across replicates, each replicate on its own is independent.

There is a second common branch of ANOVA known as repeated measures. In these cases, the units are related in that they are matched up in some way. Repeated measures are used to model correlation between measurements within an individual or subject. Repeated measures ANOVA is useful (and increases statistical power) when the variability within individuals is large relative to the variability among individuals.

It’s important that all levels of your repeated measures factor (usually time) are consistent. If they aren’t, you’ll need to consider running a mixed model, which is a more advanced statistical technique.

There are two common forms of repeated measures:

  • You observe the same individual or subject at different time points. If you’re familiar with paired t-tests, this is an extension to that. (You can also have the same individual receive all of the treatments, which adds another level of repeated measures.)
  • You have a randomized block design, where matched elements receive each treatment. For example, you split a large sample of blood taken from one person into 3 (or more) smaller samples, and each of those smaller samples gets exactly one treatment.
Repeated measures ANOVA can have any number of factors. See analysis checklists for one-way repeated measures ANOVA and two-way repeated measures ANOVA .

What does it mean to assume sphericity with repeated measures ANOVA?

Repeated measures are almost always treated as random factors, which means that the correlation structure between levels of the repeated measures needs to be defined. The assumption of sphericity means that you assume that each level of the repeated measures has the same correlation with every other level.

This is almost never the case with repeated measures over time (e.g., baseline, at treatment, 1 hour after treatment), and in those cases, we recommend not assuming sphericity. However, if you used a randomized block design, then sphericity is usually appropriate .

Example two-way ANOVA with repeated measures

Say we have two treatments (control and treatment) to evaluate using test animals. We'll apply both treatments to each of two animals (replicates) with sufficient time in between the treatments so there isn't a crossover (or carry-over) effect. Also, we'll measure five different time points for each treatment (baseline, at time of injection, one hour after, …). This is repeated measures because we will need to measure matching samples from the same animal under each treatment as we track how its stimulation level changes over time.
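A minimal sketch of this model in base R, assuming a long-format data frame called stim with hypothetical columns animal, treatment, time, and response (and assuming sphericity, as discussed above):

    # One row per animal x treatment x time point
    fit <- aov(response ~ treatment * time + Error(animal/(treatment * time)),
               data = stim)
    summary(fit)   # tests the treatment and time main effects and their interaction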

[Figure: repeated measures ANOVA results table]

The output shows the test results from the main and interaction effects. Due to the interaction between time and treatment being significant (p<.0001), the fact that the treatment main effect isn’t significant (p=.154) isn’t noteworthy.

Graphing repeated measures ANOVA

As we've been saying, graphing the data is useful, and this is particularly true when the interaction term is significant. Here we get an explanation of why the interaction between treatment and time was significant, but treatment on its own was not. From one hour after injection onward, treated units show a higher response level than the control, even as the response decreases over those 12 hours. Thus the effect of time depends on treatment. At the earlier time points, there is no difference between treatment and control.

[Figure: repeated measures results graphed over time]

Graphing repeated measures data is an art, but a good graphic helps you understand and communicate the results. For example, it’s a completely different experiment, but here’s a great plot of another repeated measures experiment with before and after values that are measured on three different animal types.

[Figure: before-and-after repeated measures plot for three animal types]

What if I have three or more factors?

Interpreting three or more factors is very challenging and usually requires advanced training and experience.

Just as two-way ANOVA is more complex than one-way, three-way ANOVA adds much more potential for confusion. Not only are you dealing with three different factors, you will now be testing seven hypotheses at the same time. Two-way interactions still exist here, and you may even run into a significant three-way interaction term.

It takes careful planning and advanced experimental design to be able to untangle the combinations that will be involved ( see more details here ). 

Non-parametric ANOVA alternatives

As with t-tests (or virtually any statistical method), there are alternatives to ANOVA for testing differences among three or more groups. ANOVA is means-focused and evaluated in comparison to an F-distribution.

The two main non-parametric cousins to ANOVA are the Kruskal-Wallis and Friedman tests. As with everything else in ANOVA, which of the two is more appropriate depends on your experimental design.

Kruskal-Wallis tests the difference between medians (rather than means) for 3 or more groups. It is only useful as an “ordinary ANOVA” alternative, without matched subjects like you have in repeated measures. Here are some tips for interpreting Kruskal-Wallis test results. 

Friedman's test, by contrast, is designed as an alternative to repeated measures ANOVA with matched subjects. Here are some tips for interpreting Friedman's test.

What are simple, main, and interaction effects in ANOVA?

Consider the two-way ANOVA model setup that contains two different kinds of effects to evaluate:
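In standard notation, a two-way ANOVA model with an interaction can be written as:

Y ijk = μ + 𝛼 i + 𝛽 j + (𝛼𝛽) ij + ε ijk

where Y ijk is the k-th replicate measured at level i of the first factor and level j of the second factor, μ is the overall mean, and ε ijk is random error.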

The 𝛼 and 𝛽 factors are "main" effects: the overall effect of each factor, averaged across the levels of the other factor. A "simple" effect, by contrast, is the effect of one factor at a single fixed level of the other factor.

The interaction term is denoted as “𝛼𝛽”, and it allows for the effect of a factor to depend on the level of another factor. It can only be tested when you have replicates in your study. Otherwise, the error term is assumed to be the interaction term.

What are multiple comparisons?

When you’re doing multiple statistical tests on the same set of data, there’s a greater propensity to discover statistically significant differences that aren’t true differences. Multiple comparison corrections attempt to control for this, and in general control what is called the familywise error rate. There are a number of multiple comparison testing methods , which all have pros and cons depending on your particular experimental design and research questions.
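For example, base R's p.adjust() can apply several of these corrections to a set of p-values (the values below are made up purely for illustration):

    p_raw <- c(0.012, 0.034, 0.041, 0.220)     # hypothetical raw p-values

    p.adjust(p_raw, method = "bonferroni")     # controls the familywise error rate
    p.adjust(p_raw, method = "holm")           # also familywise, uniformly less conservative
    p.adjust(p_raw, method = "BH")             # controls the false discovery rate instead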

What does the word “way” mean in one-way vs two-way ANOVA?

In statistics overall, it can be hard to keep track of factors, groups, and tails. To the untrained eye “two-way ANOVA” could mean any of these things.

The best way to think about ANOVA is in terms of factors or variables in your experiment. Suppose you have one factor in your analysis (perhaps “treatment”). You will likely see that written as a one-way ANOVA. Even if that factor has several different treatment groups, there is only one factor, and that’s what drives the name. 

Also, “way” has absolutely nothing to do with “tails” like a t-test. ANOVA relies on F tests, which can only test for equal vs unequal because they rely on squared terms. So ANOVA does not have the “one-or-two tails” question .

What is the difference between ANOVA and a t-test?

ANOVA is an extension of the t-test. If you only have two group means to compare, use a t-test. Anything more requires ANOVA.

What is the difference between ANOVA and chi-square?

Chi-square is designed for contingency tables, or counts of items within groups (e.g., type of animal). The goal is to see whether the counts in a particular sample match the counts you would expect by random chance.

ANOVA separates subjects into groups for evaluation, but there is some numeric response variable of interest (e.g., glucose level).

Can ANOVA evaluate effects on multiple response variables at the same time?

Multiple response variables make things much more complicated than multiple factors. ANOVA (as we've discussed it here) can handle multiple factors, but it isn't designed for tracking more than one response at a time.

Technically, there is an extension designed for this, called Multivariate (or Multiple) ANOVA, more commonly written as MANOVA. Things get complicated quickly, and in general it requires advanced training.

Can ANOVA evaluate numeric factors in addition to the usual categorical factors?

It sounds like you are looking for ANCOVA (analysis of covariance). You can treat a continuous (numeric) factor as categorical, in which case you could use ANOVA, but this is a common point of confusion.

What is the definition of ANOVA?

ANOVA stands for analysis of variance, and, true to its name, it is a statistical technique that analyzes how experimental factors influence the variance in the response variable from an experiment.

What is blocking in ANOVA?

Blocking is an incredibly powerful and useful strategy in experimental design when you have a factor that you think will heavily influence the outcome, so you want to control for it in your experiment. Blocking affects how the randomization is done in the experiment. Usually blocking variables are nuisance variables that are important to control for but are not inherently of interest.

A simple example is an experiment evaluating the efficacy of a medical drug, blocking by age of the subject. To do blocking, you must first gather the ages of all of the participants in the study, appropriately bin them into groups (e.g., 10-30, 30-50, etc.), and then randomly assign subjects to treatments within each age group so that each treatment appears equally often in each block.

There’s an entire field of study around blocking. Some examples include having multiple blocking variables, incomplete block designs where not all treatments appear in all blocks, and balanced (or unbalanced) blocking designs where equal (or unequal) numbers of replicates appear in each block and treatment combination.

What is ANOVA in statistics?

For a one-way ANOVA test, the overall ANOVA null hypothesis is that the mean responses are equal for all treatments. The ANOVA p-value comes from an F-test.

Can I do ANOVA in R?

While Prism makes ANOVA much more straightforward, you can use open-source coding languages like R as well. Here are some examples of R code for ANOVA, including repeated measures designs: one-way ANOVA in R and two-way ANOVA in R.
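As a minimal sketch (assuming a data frame dat with a numeric response column and factor columns groupA and groupB, all hypothetical names):

    one_way <- aov(response ~ groupA, data = dat)
    summary(one_way)

    two_way <- aov(response ~ groupA * groupB, data = dat)   # '*' also fits the interaction
    summary(two_way)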

Perform your own ANOVA

Are you ready for your own Analysis of variance? Prism makes choosing the correct ANOVA model simple and transparent .

Start your 30-day free trial of Prism and get access to:

  • A step by step guide on how to perform ANOVA
  • Sample data to save you time
  • More tips on how Prism can help your research

With Prism, in a matter of minutes you learn how to go from entering data to performing statistical analyses and generating high-quality graphs.


One-Way Analysis of Variance: Example

In this lesson, we apply one-way analysis of variance to some fictitious data, and we show how to interpret the results of our analysis.

Note: Computations for analysis of variance are usually handled by a software package. For this example, however, we will do the computations "manually", since the gory details have educational value.

Problem Statement

A pharmaceutical company conducts an experiment to test the effect of a new cholesterol medication. The company selects 15 subjects randomly from a larger population. Each subject is randomly assigned to one of three treatment groups. Within each treatment group, subjects receive a different dose of the new medication. In Group 1, subjects receive 0 mg/day; in Group 2, 50 mg/day; and in Group 3, 100 mg/day.

The treatment levels represent all the levels of interest to the experimenter, so this experiment uses a fixed-effects model.

After 30 days, doctors measure the cholesterol level of each subject. The results for all 15 subjects appear in the table below:

Dosage

Group 1, 0 mg    Group 2, 50 mg    Group 3, 100 mg
210              210               180
240              240               210
270              240               210
270              270               210
300              270               240

In conducting this experiment, the experimenter had two research questions:

  • Does dosage level have a significant effect on cholesterol level?
  • How strong is the effect of dosage level on cholesterol level?

To answer these questions, the experimenter intends to use one-way analysis of variance.

Is One-Way ANOVA the Right Technique?

Before you crunch the first number in one-way analysis of variance, you must be sure that one-way analysis of variance is the correct technique. That means you need to ask two questions:

  • Is the experimental design compatible with one-way analysis of variance?
  • Does the data set satisfy the critical assumptions required for one-way analysis of variance?

Let's address both of those questions.

Experimental Design

As we discussed in the previous lesson (see One-Way Analysis of Variance: Fixed Effects ), one-way analysis of variance is only appropriate with one experimental design - a completely randomized design. That is exactly the design used in our cholesterol study, so we can check the experimental design box.

Critical Assumptions

We also learned in the previous lesson that one-way analysis of variance makes three critical assumptions:

  • Independence . The dependent variable score for each experimental unit is independent of the score for any other unit.
  • Normality . In the population, dependent variable scores are normally distributed within treatment groups.
  • Equality of variance . In the population, the variance of dependent variable scores in each treatment group is equal. (Equality of variance is also known as homogeneity of variance or homoscedasticity.)

Therefore, for the cholesterol study, we need to make sure our data set is consistent with the critical assumptions.

Independence of Scores

The assumption of independence is the most important assumption. When that assumption is violated, the resulting statistical tests can be misleading.

The independence assumption is satisfied by the design of the study, which features random selection of subjects and random assignment to treatment groups. Randomization tends to distribute effects of extraneous variables evenly across groups.

Normal Distributions in Groups

Violations of normality can be a problem when sample size is small, as it is in this cholesterol study. Therefore, it is important to be on the lookout for any indication of non-normality.

There are many different ways to check for normality. On this website, we describe three at: How to Test for Normality: Three Simple Tests . Given the small sample size, our best option for testing normality is to look at the following descriptive statistics:

  • Central tendency. The mean and the median are summary measures used to describe central tendency - the most "typical" value in a set of values. With a normal distribution, the mean is equal to the median.
  • Skewness. Skewness is a measure of the asymmetry of a probability distribution. If observations are equally distributed around the mean, the skewness value is zero; otherwise, the skewness value is positive or negative. As a rule of thumb, skewness between -2 and +2 is consistent with a normal distribution.
  • Kurtosis. Kurtosis is a measure of whether observations cluster around the mean of the distribution or in the tails of the distribution. The normal distribution has a kurtosis value of zero. As a rule of thumb, kurtosis between -2 and +2 is consistent with a normal distribution.

The table below shows the mean, median, skewness, and kurtosis for each group from our study.

            Group 1, 0 mg    Group 2, 50 mg    Group 3, 100 mg
Mean        258              246               210
Median      270              240               210
Range       90               60                60
Skewness    -0.40            -0.51             0.00
Kurtosis    -0.18            -0.61             2.00

In all three groups, the difference between the mean and median looks small (relative to the range). And skewness and kurtosis measures are consistent with a normal distribution (i.e., between -2 and +2). These are crude tests, but they provide some confidence for the assumption of normality in each group.

Note: With Excel, you can easily compute the descriptive statistics in Table 1. To see how, go to: How to Test for Normality: Example 1 .
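If you prefer R, the same descriptive checks might look like the sketch below (skewness() and kurtosis() come from the e1071 package; the exact values depend on which estimator the function uses, so they may differ slightly from the table above):

    library(e1071)   # for skewness() and kurtosis()

    chol  <- c(210, 240, 270, 270, 300,    # Group 1, 0 mg
               210, 240, 240, 270, 270,    # Group 2, 50 mg
               180, 210, 210, 210, 240)    # Group 3, 100 mg
    group <- factor(rep(c("0 mg", "50 mg", "100 mg"), each = 5),
                    levels = c("0 mg", "50 mg", "100 mg"))

    tapply(chol, group, mean)                           # 258, 246, 210
    tapply(chol, group, median)                         # 270, 240, 210
    tapply(chol, group, function(x) diff(range(x)))     # 90, 60, 60
    tapply(chol, group, skewness)
    tapply(chol, group, kurtosis)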

Homogeneity of Variance

When the normality assumption is satisfied, you can use Hartley's Fmax test to test for homogeneity of variance. Here's how to implement the test. First, compute the sample variance ( s²j ) of each group:

s²j = Σi ( Xij − X̄j )² / ( nj − 1 )

where Xij is the score for observation i in Group j, X̄j is the mean of Group j, and nj is the number of observations in Group j; the sum runs over all nj observations in Group j.

Here is the variance ( s²j ) for each group in the cholesterol study.

Group 1, 0 mg    Group 2, 50 mg    Group 3, 100 mg
1170             630               450

Next, compute Hartley's F ratio, the largest group variance divided by the smallest:

F RATIO = s²MAX / s²MIN

F RATIO = 1170 / 450

F RATIO = 2.6

where s²MAX is the largest group variance, and s²MIN is the smallest group variance.

Finally, compare the F ratio to the critical value of the Fmax distribution for k = 3 groups and n − 1 = 4 degrees of freedom, where n is the largest sample size in any group. Based on a significance level of 0.05, the critical Fmax value is 15.5.

Here, the F ratio (2.6) is smaller than the critical Fmax value (15.5), so we conclude that the variances are homogeneous.

Note: Other tests, such as Bartlett's test , can also test for homogeneity of variance. For the record, Bartlett's test yields the same conclusion for the cholesterol study; namely, the variances are homogeneous.
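In R, the group variances, the Fmax ratio, and Bartlett's test can be computed as follows (reusing the chol and group vectors from the sketch above):

    v <- tapply(chol, group, var)   # 1170, 630, 450
    max(v) / min(v)                 # Fmax ratio = 2.6

    bartlett.test(chol ~ group)     # base-R test of homogeneity of variance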

Analysis of Variance

Having confirmed that the critical assumptions are tenable, we can proceed with a one-way analysis of variance. That means taking the following steps:

  • Specify a mathematical model to describe the causal factors that affect the dependent variable.
  • Write statistical hypotheses to be tested by experimental data.
  • Specify a significance level for a hypothesis test.
  • Compute the grand mean and the mean scores for each group.
  • Compute sums of squares for each effect in the model.
  • Find the degrees of freedom associated with each effect in the model.
  • Based on sums of squares and degrees of freedom, compute mean squares for each effect in the model.
  • Compute a test statistic , based on observed mean squares and their expected values.
  • Find the P value for the test statistic.
  • Accept or reject the null hypothesis , based on the P value and the significance level.
  • Assess the magnitude of the effect of the independent variable, based on sums of squares.

Now, let's execute each step, one-by-one, with our cholesterol medication experiment.

Mathematical Model

For every experimental design, there is a mathematical model that accounts for all of the independent and extraneous variables that affect the dependent variable. In our experiment, the dependent variable ( X ) is the cholesterol level of a subject, and the independent variable ( β ) is the dosage level administered to a subject.

For example, here is the fixed-effects model for a completely randomized design:

X i j = μ + β j + ε i ( j )

where X i j is the cholesterol level for subject i in treatment group j , μ is the population mean, β j is the effect of the dosage level administered to subjects in group j ; and ε i ( j ) is the effect of all other extraneous variables on subject i in treatment j .

Statistical Hypotheses

For fixed-effects models, it is common practice to write statistical hypotheses in terms of the treatment effect β j . With that in mind, here is the null hypothesis and the alternative hypothesis for a one-way analysis of variance:

H 0 : β j = 0 for all j

H 1 : β j ≠ 0 for some j

If the null hypothesis is true, the mean score (i.e., mean cholesterol level) in each treatment group should equal the population mean. Thus, if the null hypothesis is true, mean scores in the k treatment groups should be equal. If the null hypothesis is false, at least one pair of mean scores should be unequal.

Significance Level

The significance level (also known as alpha or α) is the probability of rejecting the null hypothesis when it is actually true. The significance level for an experiment is specified by the experimenter, before data collection begins.

Experimenters often choose significance levels of 0.05 or 0.01. For this experiment, let's use a significance level of 0.05.

Mean Scores

Analysis of variance begins by computing a grand mean and group means:

  • Grand mean. The grand mean ( X̄ ) is the mean of all n observations, computed as follows:

X̄ = ( 1 / n ) Σ Xij = ( 1 / 15 ) * ( 210 + 210 + ... + 270 + 240 ) = 238

  • Group means. The mean of group j ( X̄j ) is the mean of all observations in group j, computed as X̄j = ( 1 / nj ) Σi Xij. For our data:

X̄1 = 258

X̄2 = 246

X̄3 = 210

In the equations above, n is the total sample size across all groups, and nj is the sample size in Group j.

Sums of Squares

A sum of squares is the sum of squared deviations from a mean score. One-way analysis of variance makes use of three sums of squares:

  • Between-groups sum of squares. The between-groups sum of squares (SSB) measures variation of the group means around the grand mean: SSB = Σj nj ( X̄j − X̄ )². For our data:

SSB = 5 * [ ( 238 − 258 )² + ( 238 − 246 )² + ( 238 − 210 )² ] = 6240

  • Within-groups sum of squares. The within-groups sum of squares (SSW) measures variation of each score around its own group mean: SSW = Σj Σi ( Xij − X̄j )². For our data:

SSW = 2304 + ... + 900 = 9000

  • Total sum of squares. The total sum of squares (SST) measures variation of all scores around the grand mean. It can be computed from the following formula: SST = Σj Σi ( Xij − X̄ )². For our data:

SST = 784 + 4 + 1024 + ... + 784 + 784 + 4 = 15,240

It turns out that the total sum of squares is equal to the between-groups sum of squares plus the within-groups sum of squares, as shown below:

SST = SSB + SSW

15,240 = 6240 + 9000
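The same decomposition can be reproduced in a few lines of R (again reusing chol and group from the earlier sketch):

    grand_mean  <- mean(chol)                         # 238
    group_means <- tapply(chol, group, mean)          # 258, 246, 210
    n_j         <- tapply(chol, group, length)        # 5, 5, 5

    SSB <- sum(n_j * (group_means - grand_mean)^2)    # 6240
    SSW <- sum((chol - group_means[group])^2)         # 9000
    SST <- sum((chol - grand_mean)^2)                 # 15240 = SSB + SSW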

Degrees of Freedom

The term degrees of freedom (df) refers to the number of independent sample points used to compute a statistic minus the number of parameters estimated from the sample points.

To illustrate what is going on, let's find the degrees of freedom associated with the various sum of squares computations:

Here, the formula uses k independent sample points, the sample means X   j  . And it uses one parameter estimate, the grand mean X , which was estimated from the sample points. So, the between-groups sum of squares has k - 1 degrees of freedom ( df BG  ).

df BG = k - 1 = 3 - 1 = 2

Here, the formula uses n independent sample points, the individual subject scores X  i j  . And it uses k parameter estimates, the group means X   j  , which were estimated from the sample points. So, the within-groups sum of squares has n - k degrees of freedom ( df WG  ).

n = Σ n i = 5 + 5 + 5 = 15

df WG = n - k = 15 - 3 = 12

Here, the formula uses n independent sample points, the individual subject scores X  i j  . And it uses one parameter estimate, the grand mean X , which was estimated from the sample points. So, the total sum of squares has n  - 1 degrees of freedom ( df TOT  ).

df TOT  = n - 1 = 15 - 1 = 14

The degrees of freedom for each sum of squares are summarized in the table below:

Sum of squares     Degrees of freedom
Between-groups     k - 1 = 2
Within-groups      n - k = 12
Total              n - 1 = 14

Mean Squares

A mean square is an estimate of population variance. It is computed by dividing a sum of squares (SS) by its corresponding degrees of freedom (df), as shown below:

MS = SS / df

To conduct a one-way analysis of variance, we are interested in two mean squares:

MS WG = SSW / df WG

MS WG = 9000 / 12 = 750

MS BG = SSB / df BG

MS BG = 6240 / 2 = 3120

Expected Value

The expected value of a mean square is the average value of the mean square over a large number of experiments.

Statisticians have derived formulas for the expected value of the within-groups mean square ( MS WG  ) and for the expected value of the between-groups mean square ( MS BG  ). For one-way analysis of variance, the expected value formulas are:

Fixed- and Random-Effects:

E( MS_WG ) = σ_ε²

Fixed-Effects:

E( MS_BG ) = σ_ε² + [ n Σj β_j² ] / ( k − 1 )

Random-Effects:

E( MS_BG ) = σ_ε² + n σ_β²

In the equations above, E( MS WG  ) is the expected value of the within-groups mean square; E( MS BG  ) is the expected value of the between-groups mean square; n is total sample size; k is the number of treatment groups; β  j is the treatment effect in Group j ; σ ε 2 is the variance attributable to everything except the treatment effect (i.e., all the extraneous variables); and σ β 2 is the variance due to random selection of treatment levels.

Notice that MS BG should equal MS WG when the variation due to treatment effects ( β  j for fixed effects and σ β 2 for random effects) is zero (i.e., when the independent variable does not affect the dependent variable). And MS BG should be bigger than the MS WG when the variation due to treatment effects is not zero (i.e., when the independent variable does affect the dependent variable)

Conclusion: By examining the relative size of the mean squares, we can make a judgment about whether an independent variable affects a dependent variable.

Test Statistic

Suppose we use the mean squares to define a test statistic F as follows:

F(v 1 , v 2 ) = MS BG / MS WG

F(2, 12) = 3120 / 750 = 4.16

where MS BG is the between-groups mean square, MS WG is the within-groups mean square, v 1 is the degrees of freedom for MS BG , and v 2 is the degrees of freedom for MS WG .

Defined in this way, the F ratio measures the size of MS BG relative to MS WG . The F ratio is a convenient measure that we can use to test the null hypothesis. Here's how:

  • When the F ratio is close to one, MS BG is approximately equal to MS WG . This indicates that the independent variable did not affect the dependent variable, so we cannot reject the null hypothesis.
  • When the F ratio is significantly greater than one, MS BG is bigger than MS WG . This indicates that the independent variable did affect the dependent variable, so we must reject the null hypothesis.

What does it mean for the F ratio to be significantly greater than one? To answer that question, we need to talk about the P-value.

In an experiment, a P-value is the probability of obtaining a result more extreme than the observed experimental outcome, assuming the null hypothesis is true.

With analysis of variance, the F ratio is the observed experimental outcome that we are interested in. So, the P-value would be the probability that an F statistic would be more extreme (i.e., bigger) than the actual F ratio computed from experimental data.

We can use Stat Trek's F Distribution Calculator to find the probability that an F statistic will be bigger than the actual F ratio observed in the experiment. Enter the between-groups degrees of freedom (2), the within-groups degrees of freedom (12), and the observed F ratio (4.16) into the calculator; then, click the Calculate button.

From the calculator, we see that the P ( F > 4.16 ) equals about 0.04. Therefore, the P-Value is 0.04.
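The same probability can be obtained in R from the F distribution directly:

    pf(4.16, df1 = 2, df2 = 12, lower.tail = FALSE)   # about 0.04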

Hypothesis Test

Recall that we specified a significance level 0.05 for this experiment. Once you know the significance level and the P-value, the hypothesis test is routine. Here's the decision rule for accepting or rejecting the null hypothesis:

  • If the P-value is bigger than the significance level, accept the null hypothesis.
  • If the P-value is equal to or smaller than the significance level, reject the null hypothesis.

Since the P-value (0.04) in our experiment is smaller than the significance level (0.05), we reject the null hypothesis that drug dosage had no effect on cholesterol level. And we conclude that the mean cholesterol level in at least one treatment group differed significantly from the mean cholesterol level in another group.

Magnitude of Effect

The hypothesis test tells us whether the independent variable in our experiment has a statistically significant effect on the dependent variable, but it does not address the magnitude of the effect. Here's the issue:

  • When the sample size is large, you may find that even small differences in treatment means are statistically significant.
  • When the sample size is small, you may find that even big differences in treatment means are not statistically significant.

With this in mind, it is customary to supplement analysis of variance with an appropriate measure of effect size. Eta squared (η 2 ) is one such measure. Eta squared is the proportion of variance in the dependent variable that is explained by a treatment effect. The eta squared formula for one-way analysis of variance is:

η 2 = SSB / SST

where SSB is the between-groups sum of squares and SST is the total sum of squares.

Given this formula, we can compute eta squared for this drug dosage experiment, as shown below:

η 2 = SSB / SST = 6240 / 15240 = 0.41

Thus, 41 percent of the variance in our dependent variable (cholesterol level) can be explained by variation in our independent variable (dosage level). It appears that the relationship between dosage level and cholesterol level is significant not only in a statistical sense; it is significant in a practical sense as well.

ANOVA Summary Table

It is traditional to summarize ANOVA results in an analysis of variance table. The analysis that we just conducted provides all of the information that we need to produce the following ANOVA summary table:

Analysis of Variance Table

Source    SS        df    MS       F       P
BG        6,240     2     3,120    4.16    0.04
WG        9,000     12    750
Total     15,240    14

This ANOVA table allows any researcher to interpret the results of the experiment, at a glance.

The P-value (shown in the last column of the ANOVA table) is the probability that an F statistic would be more extreme (bigger) than the F ratio shown in the table, assuming the null hypothesis is true. When the P-value is bigger than the significance level, we accept the null hypothesis; when it is smaller, we reject it. Here, the P-value (0.04) is smaller than the significance level (0.05), so we reject the null hypothesis.

To assess the strength of the treatment effect, an experimenter might compute eta squared (η 2 ). The computation is easy, using sum of squares entries from the ANOVA table, as shown below:

η 2 = SSB / SST = 6,240 / 15,240 = 0.41

For this experiment, an eta squared of 0.41 means that 41% of the variance in the dependent variable can be explained by the effect of the independent variable.
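For readers who want to check the whole table in R rather than by hand, a short sketch (reusing the chol and group vectors defined earlier) should reproduce these numbers:

    fit <- aov(chol ~ group)
    summary(fit)                            # F(2, 12) = 4.16, p ≈ 0.04

    ss <- summary(fit)[[1]][["Sum Sq"]]     # between-groups SS and within-groups SS
    ss[1] / sum(ss)                         # eta squared ≈ 0.41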

An Easier Option

In this lesson, we showed all of the hand calculations for a one-way analysis of variance. In the real world, researchers seldom conduct analysis of variance by hand. They use statistical software. In the next lesson, we'll analyze data from this problem with Excel. Hopefully, we'll get the same result.


Two-Way ANOVA | Examples & When To Use It

Published on March 20, 2020 by Rebecca Bevans . Revised on June 22, 2023.

ANOVA (Analysis of Variance) is a statistical test used to analyze the difference between the means of more than two groups.

A two-way ANOVA is used to estimate how the mean of a quantitative variable changes according to the levels of two categorical variables. Use a two-way ANOVA when you want to know how two independent variables, in combination, affect a dependent variable.

Table of contents

  • When to use a two-way ANOVA
  • How does the ANOVA test work
  • Assumptions of the two-way ANOVA
  • How to perform a two-way ANOVA
  • Interpreting the results of a two-way ANOVA
  • How to present the results of a two-way ANOVA
  • Other interesting articles
  • Frequently asked questions about two-way ANOVA

You can use a two-way ANOVA when you have collected data on a quantitative dependent variable at multiple levels of two categorical independent variables.

A quantitative variable represents amounts or counts of things. It can be divided to find a group mean.

A categorical variable represents types or categories of things. A level is an individual category within the categorical variable.

You should have enough observations in your data set to be able to find the mean of the quantitative dependent variable at each combination of levels of the independent variables.

Both of your independent variables should be categorical. If one of your independent variables is categorical and one is quantitative, use an ANCOVA instead.


ANOVA tests for significance using the F test for statistical significance . The F test is a groupwise comparison test, which means it compares the variance in each group mean to the overall variance in the dependent variable.

If the variance within groups is smaller than the variance between groups, the F test will find a higher F value, and therefore a higher likelihood that the difference observed is real and not due to chance.

A two-way ANOVA with interaction tests three null hypotheses at the same time:

  • There is no difference in group means at any level of the first independent variable.
  • There is no difference in group means at any level of the second independent variable.
  • The effect of one independent variable does not depend on the effect of the other independent variable (a.k.a. no interaction effect).

A two-way ANOVA without interaction (a.k.a. an additive two-way ANOVA) only tests the first two of these hypotheses.

Null hypothesis (H0): There is no difference in average yield for any fertilizer type.
Alternate hypothesis (Ha): There is a difference in average yield by fertilizer type.

Null hypothesis (H0): There is no difference in average yield at either planting density.
Alternate hypothesis (Ha): There is a difference in average yield by planting density.

Null hypothesis (H0): The effect of one independent variable on average yield does not depend on the effect of the other independent variable (a.k.a. no interaction effect).
Alternate hypothesis (Ha): There is an interaction effect between planting density and fertilizer type on average yield.

To use a two-way ANOVA your data should meet certain assumptions. Two-way ANOVA makes all of the standard assumptions of a parametric test of difference:

  • Homogeneity of variance (a.k.a. homoscedasticity )

The variation around the mean for each group being compared should be similar among all groups. If your data don’t meet this assumption, you may be able to use a non-parametric alternative , like the Kruskal-Wallis test.

  • Independence of observations

Your independent variables should not be dependent on one another (i.e. one should not cause the other). This is impossible to test with categorical variables – it can only be ensured by good experimental design .

In addition, your dependent variable should represent unique observations – that is, your observations should not be grouped within locations or individuals.

If your data don’t meet this assumption (i.e. if you set up experimental treatments within blocks), you can include a blocking variable and/or use a repeated-measures ANOVA.

  • Normally-distributed dependent variable

The values of the dependent variable should follow a bell curve (they should be normally distributed ). If your data don’t meet this assumption, you can try a data transformation.

The dataset from our imaginary crop yield experiment includes observations of:

  • Final crop yield (bushels per acre)
  • Type of fertilizer used (fertilizer type 1, 2, or 3)
  • Planting density (1=low density, 2=high density)
  • Block in the field (1, 2, 3, 4).

The two-way ANOVA will test whether the independent variables (fertilizer type and planting density) have an effect on the dependent variable (average crop yield). But there are some other possible sources of variation in the data that we want to take into account.

We applied our experimental treatment in blocks, so we want to know if planting block makes a difference to average crop yield. We also want to check if there is an interaction effect between two independent variables – for example, it’s possible that planting density affects the plants’ ability to take up fertilizer.

Because we have a few different possible relationships between our variables, we will compare three models:

  • A two-way ANOVA without any interaction or blocking variable (a.k.a an additive two-way ANOVA).
  • A two-way ANOVA with interaction but with no blocking variable.
  • A two-way ANOVA with interaction and with the blocking variable.

Model 1 assumes there is no interaction between the two independent variables. Model 2 assumes that there is an interaction between the two independent variables. Model 3 assumes there is an interaction between the variables, and that the blocking variable is an important source of variation in the data.

By running all three versions of the two-way ANOVA with our data and then comparing the models, we can efficiently test which variables, and in which combinations, are important for describing the data, and see whether the planting block matters for average crop yield.

This is not the only way to do your analysis, but it is a good method for efficiently comparing models based on what you think are reasonable combinations of variables.

Running a two-way ANOVA in R

We will run our analysis in R. To try it yourself, download the sample dataset.

Sample dataset for a two-way ANOVA

After loading the data into the R environment, we will create each of the three models using the aov() command, and then compare them using the aictab() command. For a full walkthrough, see our guide to ANOVA in R .

This first model does not predict any interaction between the independent variables, so we put them together with a ‘+’.

In the second model, to test whether the interaction of fertilizer type and planting density influences the final yield, use a ‘ * ‘ to specify that you also want to know the interaction effect.

Because our crop treatments were randomized within blocks, we add this variable as a blocking factor in the third model. We can then compare our two-way ANOVAs with and without the blocking variable to see whether the planting location matters.
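A sketch of those three models, assuming the sample dataset has been read into a data frame called crop.data with columns yield, fertilizer, density, and block (the grouping columns converted to factors):

    model_additive    <- aov(yield ~ fertilizer + density, data = crop.data)
    model_interaction <- aov(yield ~ fertilizer * density, data = crop.data)
    model_blocking    <- aov(yield ~ fertilizer * density + block, data = crop.data)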

Model comparison

Now we can find out which model is the best fit for our data using AIC ( Akaike information criterion ) model selection.

AIC calculates the best-fit model by finding the model that explains the largest amount of variation in the response variable while using the fewest parameters. We can perform a model comparison in R using the aictab() function.
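Assuming aictab() comes from the AICcmodavg package and using the three models fitted above, the comparison might look like:

    library(AICcmodavg)

    model_set   <- list(model_additive, model_interaction, model_blocking)
    model_names <- c("two.way", "interaction", "blocking")
    aictab(model_set, modnames = model_names)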

The output looks like this:

[Table: AIC model selection results, with the best-fit model listed first]

The AIC model with the best fit will be listed first, with the second-best listed next, and so on. This comparison reveals that the two-way ANOVA without any interaction or blocking effects is the best fit for the data.


You can view the summary of the two-way model in R using the summary() command. We will take a look at the results of the first model, which we found was the best fit for our data.

[Table: model summary of a two-way ANOVA without interaction in R]

The model summary first lists the independent variables being tested (‘fertilizer’ and ‘density’). Next is the residual variance (‘Residuals’), which is the variation in the dependent variable that isn’t explained by the independent variables.

The following columns provide all of the information needed to interpret the model:

  • Df shows the degrees of freedom for each variable (number of levels in the variable minus 1).
  • Sum sq is the sum of squares (a.k.a. the variation between the group means created by the levels of the independent variable and the overall mean).
  • Mean sq shows the mean sum of squares (the sum of squares divided by the degrees of freedom).
  • F value is the test statistic from the F test (the mean square of the variable divided by the mean square of each parameter).
  • Pr(>F) is the p value of the F statistic, and shows how likely it is that the F value calculated from the F test would have occurred if the null hypothesis of no difference was true.

From this output we can see that both fertilizer type and planting density explain a significant amount of variation in average crop yield ( p values < 0.001).

Post-hoc testing

ANOVA will tell you which parameters are significant, but not which levels are actually different from one another. To test this we can use a post-hoc test. Tukey's Honestly Significant Difference (Tukey HSD) test lets us see which groups are different from one another.
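Using the additive model fitted in the earlier sketch, the Tukey HSD comparisons can be requested directly:

    TukeyHSD(model_additive)
    # $fertilizer - pairwise differences between the three fertilizer types
    # $density    - difference between the two planting densities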

[Table: Tukey HSD post-hoc comparisons for a two-way ANOVA in R]

This output shows the pairwise differences between the three types of fertilizer ($fertilizer) and between the two levels of planting density ($density), with the average difference (‘diff’), the lower and upper bounds of the 95% confidence interval (‘lwr’ and ‘upr’) and the p value of the difference (‘p-adj’).

From the post-hoc test results, we see that there are significant differences ( p < 0.05) between:

  • fertilizer groups 3 and 1,
  • fertilizer types 3 and 2,
  • the two levels of planting density,

but no difference between fertilizer groups 2 and 1.

Once you have your model output, you can report the results in the results section of your thesis , dissertation or research paper .

When reporting the results you should include the F statistic, degrees of freedom, and p value from your model output.

You can discuss what these findings mean in the discussion section of your paper.

You may also want to make a graph of your results to illustrate your findings.

Your graph should include the groupwise comparisons tested in the ANOVA, with the raw data points, summary statistics (represented here as means and standard error bars), and letters or significance values above the groups to show which groups are significantly different from the others.

[Figure: groupwise comparisons graph illustrating the results of a two-way ANOVA]

If you want to know more about statistics , methodology , or research bias , make sure to check out some of our other articles with explanations and examples.

  • Chi square test of independence
  • Statistical power
  • Descriptive statistics
  • Degrees of freedom
  • Pearson correlation
  • Null hypothesis

Methodology

  • Double-blind study
  • Case-control study
  • Research ethics
  • Data collection
  • Hypothesis testing
  • Structured interviews

Research bias

  • Hawthorne effect
  • Unconscious bias
  • Recall bias
  • Halo effect
  • Self-serving bias
  • Information bias

The only difference between one-way and two-way ANOVA is the number of independent variables . A one-way ANOVA has one independent variable, while a two-way ANOVA has two.

  • One-way ANOVA : Testing the relationship between shoe brand (Nike, Adidas, Saucony, Hoka) and race finish times in a marathon.
  • Two-way ANOVA : Testing the relationship between shoe brand (Nike, Adidas, Saucony, Hoka), runner age group (junior, senior, master’s), and race finishing times in a marathon.

All ANOVAs are designed to test for differences among three or more groups. If you are only testing for a difference between two groups, use a t-test instead.

In ANOVA, the null hypothesis is that there is no difference among group means. If any group differs significantly from the overall group mean, then the ANOVA will report a statistically significant result.

Significant differences among group means are calculated using the F statistic, which is the ratio of the mean sum of squares (the variance explained by the independent variable) to the mean square error (the variance left over).

If the F statistic is higher than the critical value (the value of F that corresponds with your alpha value, usually 0.05), then the difference among groups is deemed statistically significant.

A factorial ANOVA is any ANOVA that uses more than one categorical independent variable . A two-way ANOVA is a type of factorial ANOVA.

Some examples of factorial ANOVAs include:

  • Testing the combined effects of vaccination (vaccinated or not vaccinated) and health status (healthy or pre-existing condition) on the rate of flu infection in a population.
  • Testing the effects of marital status (married, single, divorced, widowed), job status (employed, self-employed, unemployed, retired), and family history (no family history, some family history) on the incidence of depression in a population.
  • Testing the effects of feed type (type A, B, or C) and barn crowding (not crowded, somewhat crowded, very crowded) on the final weight of chickens in a commercial farming operation.

Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).

Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).

You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results .


Statistics By Jim

One Way ANOVA Overview & Example

By Jim Frost

What is One Way ANOVA?

Use one way ANOVA to compare the means of three or more groups. This analysis is an inferential hypothesis test that uses samples to draw conclusions about populations. Specifically, it tells you whether your sample provides sufficient evidence to conclude that the groups’ population means are different. ANOVA stands for analysis of variance.

To perform one-way ANOVA, you’ll need a continuous dependent (outcome) variable and a categorical independent variable to form the groups.

For example, one-way ANOVA can determine whether parts made from four materials have different mean strengths.

In this post, learn about the hypotheses, assumptions, and interpreting the results for one-way ANOVA.

Related post : Descriptive vs. Inferential Statistics and Independent and Dependent Variables .

One Way ANOVA Hypotheses

One-way ANOVA has the following hypotheses:

  • Null hypothesis: All population group means are equal.
  • Alternative hypothesis : Not all population group means are equal.

Reject the null when your p-value is less than your significance level (e.g., 0.05). The differences between the means are statistically significant. Your sample provides sufficiently strong evidence to conclude that the population means are not all equal.

Note that one-way ANOVA is an omnibus test, providing overall results for your data. It tells you whether any group means are different—Yes or No. However, it doesn’t specify which pairs of means are different. To make that determination, follow up a statistically significant one-way ANOVA with a post hoc test that can identify specific group differences that are significant.

Related posts : Interpreting P Values and Null Hypothesis Definition .

One Way ANOVA Assumptions

For reliable one-way ANOVA results, your data should satisfy the following assumptions:

Use valid sampling methods

Use random sampling to help ensure your sample represents your target population. If your data do not reflect the population, your one-way ANOVA results will not be valid.

Additionally, the method assumes your sampling method obtains independent observations. Selecting one subject does not affect the chances of choosing any others.

Finally, the procedure uses independent samples. Each group contains a unique set of items.

Related posts : Representative Samples: Definition, Uses & Examples and Independent and Dependent Samples

Continuous data

One-way ANOVA requires continuous data . Typically, you quantify continuous variables using a scale that can be meaningfully divided into smaller fractions. For example, temperature, mass, length, and duration are continuous data.

Learn more about Hypothesis Tests for Continuous, Binary, and Count Data .

Data follows a normal distribution or each group has at least 15-20 observations

One-way ANOVA assumes your group data follow the normal distribution . However, your groups can be skewed if your sample size is large enough because of the central limit theorem.

Here are the sample size guidelines:

  • 2 – 9 groups: At least 15 in each group.
  • 10 – 12 groups: At least 20 per group.

For one-way ANOVA, unimodal data can be mildly skewed and the results will still be valid when all groups exceed the guidelines. Read here for more information about the simulation studies that support these sample size guidelines.

However, if your sample size is smaller, graph your data and determine whether the groups are skewed. If they are, you might need to use a nonparametric test . The Kruskal-Wallis test is the nonparametric test corresponding to one-way ANOVA.

Be sure to look for outliers because they can produce misleading results.

Related posts : Central Limit Theorem & Skewed Distributions

Groups can have equal or unequal variances but use the correct form of the test

One-way ANOVA has two methods for handling group variances. The traditional F-test ANOVA assumes that all groups have equal variances. On the other hand, Welch’s ANOVA does not assume they are equal. If in doubt, just use Welch’s ANOVA because it works well for either case.
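In base R, both forms are available through oneway.test(); the data frame and column names below (parts, strength, material) are hypothetical:

    oneway.test(strength ~ material, data = parts, var.equal = TRUE)    # classic F-test ANOVA
    oneway.test(strength ~ material, data = parts, var.equal = FALSE)   # Welch's ANOVA (the default)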

Related posts : Variances and Standard Deviations

One Way ANOVA Example

Suppose we are a manufacturer testing four materials to make a part. We collect a random sample of parts made using the four materials and measure their strengths. Download the CSV dataset for this example: PostHocTests .

First, I’ll graph the data to see what we’re working with.

[Figure: bar chart of group means for the one-way ANOVA example]

The bar chart shows differences between the group means. However, a graph doesn’t indicate whether those differences are due to chance during random sampling or reflect underlying population differences. One-way ANOVA can help us out with that!

Let’s use one-way ANOVA to determine whether the mean differences between these groups are statistically significant. Below are the statistical results.

[Figure: statistical output for the one-way ANOVA example]

The p-value of 0.004 is less than our significance level of 0.05. We reject the null and conclude that the four population means are not all equal. While the Means table shows the group means at the bottom, we don't know which differences between pairs of groups are statistically significant.

To perform pairwise comparisons between these four groups, we need to use a post hoc test, also known as multiple comparisons. To continue with this example and find the significant group differences, read my post Using Post Hoc Tests with ANOVA .

Related posts : How to do One-Way ANOVA in Excel



Lesson 10: Introduction to ANOVA

Overview

In the previous lessons, we learned how to perform inference for a population mean from one sample and also how to compare population means from two samples (independent and paired). In this lesson, we introduce Analysis of Variance, or ANOVA. ANOVA is a statistical method that analyzes variances to determine whether the means from more than two populations are the same. In other words, we have a quantitative response variable and a categorical explanatory variable with more than two levels. In ANOVA, the categorical explanatory variable is typically referred to as the factor. By the end of this lesson, you should be able to:

  • Describe the logic behind analysis of variance.
  • Set up and perform one-way ANOVA.
  • Identify the information in the ANOVA table.
  • Interpret the results from ANOVA output.
  • Perform multiple comparisons and interpret the results, when appropriate.

What Is An ANOVA Test In Statistics: Analysis Of Variance

Julia Simkus


Saul Mcleod, PhD


An ANOVA test is a statistical test used to determine whether there is a statistically significant difference between two or more categorical groups by testing for differences of means using variance.

Another key part of ANOVA is that it splits the independent variable into two or more groups.

For example, one or more groups might be expected to influence the dependent variable, while the other group is used as a control group and is not expected to influence the dependent variable.

Assumptions of ANOVA

The assumptions of the ANOVA test are the same as the general assumptions for any parametric test:

  • An ANOVA can only be conducted if there is no relationship between the subjects in each sample. This means that subjects in the first group cannot also be in the second group (e.g., independent samples/between groups).
  • The different groups/levels must have equal sample sizes .
  • An ANOVA can only be conducted if the dependent variable is normally distributed so that the middle scores are the most frequent and the extreme scores are the least frequent.
  • Population variances must be equal (i.e., homoscedastic). Homogeneity of variance means that the deviation of scores (measured by the range or standard deviation, for example) is similar between populations.

Types of ANOVA Tests

There are different types of ANOVA tests. The two most common are a “One-Way” and a “Two-Way.”

The difference between these two types depends on the number of independent variables in your test.

One-way ANOVA

A one-way ANOVA (analysis of variance) has one categorical independent variable (also known as a factor) and a normally distributed continuous (i.e., interval or ratio level) dependent variable.

The independent variable divides cases into two or more mutually exclusive levels, categories, or groups.

The one-way ANOVA test for differences in the means of the dependent variable is broken down by the levels of the independent variable.

An example of a one-way ANOVA includes testing a therapeutic intervention (CBT, medication, placebo) on the incidence of depression in a clinical sample.

Note : Both the One-Way ANOVA and the Independent Samples t-Test can compare the means for two groups. However, only the One-Way ANOVA can compare the means across three or more groups.

Two-way (factorial) ANOVA

A two-way ANOVA (analysis of variance) has two or more categorical independent variables (also known as a factor) and a normally distributed continuous (i.e., interval or ratio level) dependent variable.

The independent variables divide cases into two or more mutually exclusive levels, categories, or groups. A two-way ANOVA is also called a factorial ANOVA.

An example of a factorial ANOVA is testing the effects of social contact (high, medium, low), job status (employed, self-employed, unemployed, retired), and family history (no family history, some family history) on the incidence of depression in a population.

What are “Groups” or “Levels”?

In ANOVA, “groups” or “levels” refer to the different categories of the independent variable being compared.

For example, if the independent variable is “eggs,” the levels might be Non-Organic, Organic, and Free Range Organic. The dependent variable could then be the price per dozen eggs.

ANOVA F -value

The test statistic for an ANOVA is denoted as F . The formula for ANOVA is F = variance caused by treatment/variance due to random chance.

The ANOVA F value can tell you if there is a significant difference between the levels of the independent variable, when p < .05. A higher F value indicates that more of the variability in the response is attributable to the treatment rather than to chance.

Note that the ANOVA alone does not tell us specifically which means were different from one another. To determine that, we would need to follow up with multiple comparisons (or post-hoc) tests.

When the initial F test indicates that significant differences exist between group means, post hoc tests are useful for determining which specific means are significantly different when you do not have specific hypotheses that you wish to test.

Post hoc tests compare each pair of means (like t-tests), but unlike t-tests, they correct the significance estimate to account for the multiple comparisons.

What Does “Replication” Mean?

Replication requires a study to be repeated with different subjects and experimenters. This would enable a statistical analyzer to confirm a prior study by testing the same hypothesis with a new sample.

How to run an ANOVA?

For large datasets, it is best to run an ANOVA in statistical software such as R or Stata. Let’s refer to our Egg example above.

Non-Organic, Organic, and Free-Range Organic eggs would be coded as the three levels (e.g., 1, 2, 3) of a categorical treatment variable. Egg type would serve as our independent treatment variable, while the price per dozen eggs would serve as the dependent variable. Other extraneous variables may include "Brand Name" or "Laid Egg Date."

Using data and the aov() command in R, we could then determine the impact Egg Type has on the price per dozen eggs.
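
A minimal sketch of this analysis in R might look like the following. The prices are made up for illustration, and the variable names (egg_type, price) are our own.

# Hypothetical egg-price data (values invented for illustration)
eggs <- data.frame(
  egg_type = factor(rep(c("Non-Organic", "Organic", "Free-Range Organic"), each = 5)),
  price    = c(2.5, 2.7, 2.6, 2.4, 2.8,   # Non-Organic
               3.9, 4.1, 4.0, 3.8, 4.2,   # Organic
               5.5, 5.3, 5.6, 5.4, 5.7)   # Free-Range Organic
)
fit <- aov(price ~ egg_type, data = eggs)  # does Egg Type affect price per dozen?
summary(fit)                               # ANOVA table with the F-value and p-value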

ANOVA vs. t-test?

T-tests and ANOVA tests are both statistical techniques used to compare differences in means across groups or populations.

The t-test determines whether two populations are statistically different from each other, whereas ANOVA tests are used when an individual wants to test more than two levels within an independent variable.

Referring back to our egg example, testing Non-Organic vs. Organic would require a t-test while adding in Free Range as a third option demands ANOVA.

Rather than generate a t-statistic, ANOVA results in an F-statistic to determine statistical significance.

What does ANOVA stand for?

ANOVA stands for Analysis of Variance. It’s a statistical method to analyze differences among group means in a sample. ANOVA tests the hypothesis that the means of two or more populations are equal, generalizing the t-test to more than two groups.

It’s commonly used in experiments where various factors’ effects are compared. It can also handle complex experiments with factors that have different numbers of levels.

When to use ANOVA?

ANOVA should be used when one independent variable has three or more levels (categories or groups). It’s designed to compare the means of these multiple groups.

What does an ANOVA test tell you?

An ANOVA test tells you if there are significant differences between the means of three or more groups. If the test result is significant, it suggests that at least one group’s mean differs from the others. It does not, however, specify which groups are different from each other.

Why do you use chi-square instead of ANOVA?

You use the chi-square test instead of ANOVA when dealing with categorical data to test associations or independence between two categorical variables. In contrast, ANOVA is used for continuous data to compare the means of three or more groups.


Hypothesis Testing - Analysis of Variance (ANOVA)

Lisa Sullivan, PhD

Professor of Biostatistics

Boston University School of Public Health


Introduction

This module will continue the discussion of hypothesis testing, where a specific statement or hypothesis is generated about a population parameter, and sample statistics are used to assess the likelihood that the hypothesis is true. The hypothesis is based on available information and the investigator's belief about the population parameters. The specific test considered here is called analysis of variance (ANOVA) and is a test of hypothesis that is appropriate to compare means of a continuous variable in two or more independent comparison groups. For example, in some clinical trials there are more than two comparison groups. In a clinical trial to evaluate a new medication for asthma, investigators might compare an experimental medication to a placebo and to a standard treatment (i.e., a medication currently being used). In an observational study such as the Framingham Heart Study, it might be of interest to compare mean blood pressure or mean cholesterol levels in persons who are underweight, normal weight, overweight and obese.  

The technique to test for a difference in more than two independent means is an extension of the two independent samples procedure discussed previously, which applies when there are exactly two independent comparison groups. The ANOVA technique applies when there are two or more independent groups. The ANOVA procedure is used to compare the means of the comparison groups and is conducted using the same five-step approach used in the scenarios discussed in previous sections. Because there are more than two groups, however, the computation of the test statistic is more involved. The test statistic must take into account the sample sizes, sample means and sample standard deviations in each of the comparison groups.

If one is examining the means observed among, say, three groups, it might be tempting to perform three separate group-to-group comparisons, but this approach is incorrect because each of these comparisons fails to take into account the total data, and it increases the likelihood of incorrectly concluding that there are statistically significant differences, since each comparison adds to the probability of a type I error. Analysis of variance avoids these problems by asking a more global question, i.e., whether there are significant differences among the groups, without addressing differences between any two groups in particular (although there are additional tests that can do this if the analysis of variance indicates that there are differences among the groups).

The fundamental strategy of ANOVA is to systematically examine variability within groups being compared and also examine variability among the groups being compared.

Learning Objectives

After completing this module, the student will be able to:

  • Perform analysis of variance by hand
  • Appropriately interpret results of analysis of variance tests
  • Distinguish between one and two factor analysis of variance tests
  • Identify the appropriate hypothesis testing procedure based on type of outcome variable and number of samples

The ANOVA Approach

Consider an example with four independent groups and a continuous outcome measure. The independent groups might be defined by a particular characteristic of the participants such as BMI (e.g., underweight, normal weight, overweight, obese) or by the investigator (e.g., randomizing participants to one of four competing treatments, call them A, B, C and D). Suppose that the outcome is systolic blood pressure, and we wish to test whether there is a statistically significant difference in mean systolic blood pressures among the four groups. The sample data are organized as follows:

 

For each of the four comparison groups, the sample data consist of the sample size (n), the sample mean, and the sample standard deviation (s).

The hypotheses of interest in an ANOVA are as follows:

  • H0: μ1 = μ2 = μ3 = ... = μk
  • H1: Means are not all equal.

where k = the number of independent comparison groups.

In this example, the hypotheses are:

  • H0: μ1 = μ2 = μ3 = μ4
  • H1: The means are not all equal.

The null hypothesis in ANOVA is always that there is no difference in means. The research or alternative hypothesis is always that the means are not all equal and is usually written in words rather than in mathematical symbols. The research hypothesis captures any difference in means and includes, for example, the situation where all four means are unequal, where one is different from the other three, where two are different, and so on. The alternative hypothesis, as shown above, captures all possible situations other than equality of all means specified in the null hypothesis.

Test Statistic for ANOVA

The test statistic for testing H0: μ1 = μ2 = ... = μk is

F = MSB / MSE,

the ratio of the mean square between treatments (MSB) to the mean square error (MSE). The critical value is found in a table of probability values for the F distribution with degrees of freedom df1 = k-1 and df2 = N-k (see "Other Resources").

NOTE: The test statistic F assumes equal variability in the k populations (i.e., the population variances are equal, or σ1² = σ2² = ... = σk²). This means that the outcome is equally variable in each of the comparison populations. This assumption is the same as that assumed for appropriate use of the test statistic to test equality of two independent means. It is possible to assess the likelihood that the assumption of equal variances is true, and the test can be conducted in most statistical computing packages. If the variability in the k comparison groups is not similar, then alternative techniques must be used.

The F statistic is computed by taking the ratio of what is called the "between treatment" variability to the "residual or error" variability. This is where the name of the procedure originates. In analysis of variance we are testing for a difference in means (H 0 : means are all equal versus H 1 : means are not all equal) by evaluating variability in the data. The numerator captures between treatment variability (i.e., differences among the sample means) and the denominator contains an estimate of the variability in the outcome. The test statistic is a measure that allows us to assess whether the differences among the sample means (numerator) are more than would be expected by chance if the null hypothesis is true. Recall in the two independent sample test, the test statistic was computed by taking the ratio of the difference in sample means (numerator) to the variability in the outcome (estimated by Sp).  

The decision rule for the F test in ANOVA is set up in a similar way to decision rules we established for t tests. The decision rule again depends on the level of significance and the degrees of freedom. The F statistic has two degrees of freedom. These are denoted df1 and df2, and called the numerator and denominator degrees of freedom, respectively. The degrees of freedom are defined as follows:

df1 = k-1 and df2 = N-k,

where k is the number of comparison groups and N is the total number of observations in the analysis. If the null hypothesis is true, the between treatment variation (numerator) will not exceed the residual or error variation (denominator) and the F statistic will be small. If the null hypothesis is false, then the F statistic will be large. The rejection region for the F test is always in the upper (right-hand) tail of the distribution as shown below.

Rejection region for the F test with α = 0.05, df1 = 3 and df2 = 36 (k = 4, N = 40)

For the scenario depicted here, the decision rule is: Reject H0 if F > 2.87.
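
Instead of looking the value up in a printed table, the critical value can also be obtained in R with the qf() function; for the degrees of freedom in this scenario:

# Critical value of F at alpha = 0.05 with df1 = 3 and df2 = 36
qf(0.95, df1 = 3, df2 = 36)   # approximately 2.87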

The ANOVA Procedure

We will next illustrate the ANOVA procedure using the five step approach. Because the computation of the test statistic is involved, the computations are often organized in an ANOVA table. The ANOVA table breaks down the components of variation in the data into variation between treatments and error or residual variation. Statistical computing packages also produce ANOVA tables as part of their standard output for ANOVA, and the ANOVA table is set up as follows: 

Source of Variation    Sums of Squares (SS)    Degrees of Freedom (df)    Mean Squares (MS)    F
Between Treatments     SSB = Σ nj(X̄j − X̄)²      k − 1                      MSB = SSB/(k − 1)    F = MSB/MSE
Error (or Residual)    SSE = ΣΣ(X − X̄j)²        N − k                      MSE = SSE/(N − k)
Total                  SST = Σ(X − X̄)²          N − 1

where  

  • X = individual observation,
  • X̄j = sample mean of the jth treatment group,
  • X̄ = overall (grand) mean,
  • k = the number of treatments or independent comparison groups, and
  • N = total number of observations or total sample size.

The ANOVA table above is organized as follows.

  • The first column is entitled "Source of Variation" and delineates the between treatment and error or residual variation. The total variation is the sum of the between treatment and error variation.
  • The second column is entitled "Sums of Squares (SS)". The between treatment sums of squares is

SSB = Σ nj(X̄j − X̄)²

and is computed by summing the squared differences between each treatment (or group) mean and the overall mean. The squared differences are weighted by the sample sizes per group (nj). The error sums of squares is

SSE = ΣΣ(X − X̄j)²

and is computed by summing the squared differences between each observation and its group mean (i.e., the squared differences between each observation in group 1 and the group 1 mean, the squared differences between each observation in group 2 and the group 2 mean, and so on). The double summation (ΣΣ) indicates summation of the squared differences within each treatment and then summation of these totals across treatments to produce a single value. (This will be illustrated in the following examples). The total sums of squares is

SST = Σ(X − X̄)²

and is computed by summing the squared differences between each observation and the overall sample mean. In an ANOVA, data are organized by comparison or treatment groups. If all of the data were pooled into a single sample, SST would reflect the numerator of the sample variance computed on the pooled or total sample. SST does not figure into the F statistic directly. However, SST = SSB + SSE, thus if two sums of squares are known, the third can be computed from the other two.

  • The third column contains degrees of freedom. The between treatment degrees of freedom is df1 = k-1. The error degrees of freedom is df2 = N-k. The total degrees of freedom is N-1 (and it is also true that (k-1) + (N-k) = N-1).
  • The fourth column contains "Mean Squares (MS)" which are computed by dividing sums of squares (SS) by degrees of freedom (df), row by row. Specifically, MSB = SSB/(k-1) and MSE = SSE/(N-k). Dividing SST by (N-1) produces the variance of the total sample. The F statistic is in the rightmost column of the ANOVA table and is computed by taking the ratio of MSB/MSE.

A clinical trial is run to compare weight loss programs and participants are randomly assigned to one of the comparison programs and are counseled on the details of the assigned program. Participants follow the assigned program for 8 weeks. The outcome of interest is weight loss, defined as the difference in weight measured at the start of the study (baseline) and weight measured at the end of the study (8 weeks), measured in pounds.  

Three popular weight loss programs are considered. The first is a low calorie diet. The second is a low fat diet and the third is a low carbohydrate diet. For comparison purposes, a fourth group is considered as a control group. Participants in the fourth group are told that they are participating in a study of healthy behaviors with weight loss only one component of interest. The control group is included here to assess the placebo effect (i.e., weight loss due to simply participating in the study). A total of twenty patients agree to participate in the study and are randomly assigned to one of the four diet groups. Weights are measured at baseline and patients are counseled on the proper implementation of the assigned diet (with the exception of the control group). After 8 weeks, each patient's weight is again measured and the difference in weights is computed by subtracting the 8 week weight from the baseline weight. Positive differences indicate weight losses and negative differences indicate weight gains. For interpretation purposes, we refer to the differences in weights as weight losses and the observed weight losses are shown below.

Low Calorie   Low Fat   Low Carbohydrate   Control
8             2         3                  2
9             4         5                  2
6             3         4                  -1
7             5         2                  0
3             1         3                  3

Is there a statistically significant difference in the mean weight loss among the four diets?  We will run the ANOVA using the five-step approach.

  • Step 1. Set up hypotheses and determine level of significance

H0: μ1 = μ2 = μ3 = μ4;  H1: Means are not all equal;  α = 0.05

  • Step 2. Select the appropriate test statistic.  

The test statistic is the F statistic for ANOVA, F=MSB/MSE.

  • Step 3. Set up decision rule.  

The appropriate critical value can be found in a table of probabilities for the F distribution (see "Other Resources"). In order to determine the critical value of F we need degrees of freedom, df1 = k-1 and df2 = N-k. In this example, df1 = k-1 = 4-1 = 3 and df2 = N-k = 20-4 = 16. The critical value is 3.24 and the decision rule is as follows: Reject H0 if F > 3.24.

  • Step 4. Compute the test statistic.  

To organize our computations we complete the ANOVA table. In order to compute the sums of squares we must first compute the sample means for each group and the overall mean based on the total sample.  

 

              Low Calorie   Low Fat   Low Carbohydrate   Control
n             5             5         5                  5
Group mean    6.6           3.0       3.4                1.2

We can now compute SSB. If we pool all N = 20 observations, the overall mean is 3.6 (71/20 = 3.55, rounded).

So, in this case:

SSB = 5(6.6 − 3.6)² + 5(3.0 − 3.6)² + 5(3.4 − 3.6)² + 5(1.2 − 3.6)² = 45.0 + 1.8 + 0.2 + 28.8 = 75.8

Next we compute SSE.

SSE requires computing the squared differences between each observation and its group mean. We will compute SSE in parts. For the participants in the low calorie diet:  

X        X − 6.6    (X − 6.6)²
8        1.4        2.0
9        2.4        5.8
6        -0.6       0.4
7        0.4        0.2
3        -3.6       13.0
Totals   0          21.4

For the participants in the low fat diet:  

X        X − 3.0    (X − 3.0)²
2        -1.0       1.0
4        1.0        1.0
3        0.0        0.0
5        2.0        4.0
1        -2.0       4.0
Totals   0          10.0

For the participants in the low carbohydrate diet:  

X        X − 3.4    (X − 3.4)²
3        -0.4       0.2
5        1.6        2.6
4        0.6        0.4
2        -1.4       2.0
3        -0.4       0.2
Totals   0          5.4

For the participants in the control group:

X        X − 1.2    (X − 1.2)²
2        0.8        0.6
2        0.8        0.6
-1       -2.2       4.8
0        -1.2       1.4
3        1.8        3.2
Totals   0          10.6

Therefore, SSE = 21.4 + 10.0 + 5.4 + 10.6 = 47.4.

We can now construct the ANOVA table.

Source of Variation    Sums of Squares (SS)    Degrees of Freedom (df)    Mean Squares (MS)    F
Between Treatments     75.8                    4 - 1 = 3                  75.8/3 = 25.3        25.3/3.0 = 8.43
Error (or Residual)    47.4                    20 - 4 = 16                47.4/16 = 3.0
Total                  123.2                   20 - 1 = 19

  • Step 5. Conclusion.  

We reject H0 because 8.43 > 3.24. We have statistically significant evidence at α = 0.05 to show that there is a difference in mean weight loss among the four diets.

ANOVA is a test that provides a global assessment of a statistical difference in more than two independent means. In this example, we find that there is a statistically significant difference in mean weight loss among the four diets considered. In addition to reporting the results of the statistical test of hypothesis (i.e., that there is a statistically significant difference in mean weight losses at α=0.05), investigators should also report the observed sample means to facilitate interpretation of the results. In this example, participants in the low calorie diet lost an average of 6.6 pounds over 8 weeks, as compared to 3.0 and 3.4 pounds in the low fat and low carbohydrate groups, respectively. Participants in the control group lost an average of 1.2 pounds which could be called the placebo effect because these participants were not participating in an active arm of the trial specifically targeted for weight loss. Are the observed weight losses clinically meaningful?
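
For readers who prefer to check the computations in software, the same analysis can be reproduced in R with aov(), using the weight losses from the table above; small differences from the hand computation are due to rounding in the intermediate steps.

# Weight loss (pounds over 8 weeks) for the four diet programs above
diet <- data.frame(
  program = factor(rep(c("Low Calorie", "Low Fat", "Low Carbohydrate", "Control"), each = 5)),
  loss    = c(8, 9, 6, 7, 3,     # Low Calorie
              2, 4, 3, 5, 1,     # Low Fat
              3, 5, 4, 2, 3,     # Low Carbohydrate
              2, 2, -1, 0, 3)    # Control
)
fit <- aov(loss ~ program, data = diet)
summary(fit)   # F is approximately 8.5; the hand calculation above gives 8.43 after rounding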

Another ANOVA Example

Calcium is an essential mineral that regulates the heart, is important for blood clotting and for building healthy bones. The National Osteoporosis Foundation recommends a daily calcium intake of 1000-1200 mg/day for adult men and women. While calcium is contained in some foods, most adults do not get enough calcium in their diets and take supplements. Unfortunately some of the supplements have side effects such as gastric distress, making them difficult for some patients to take on a regular basis.  

 A study is designed to test whether there is a difference in mean daily calcium intake in adults with normal bone density, adults with osteopenia (a low bone density which may lead to osteoporosis) and adults with osteoporosis. Adults 60 years of age with normal bone density, osteopenia and osteoporosis are selected at random from hospital records and invited to participate in the study. Each participant's daily calcium intake is measured based on reported food intake and supplements. The data are shown below.   

Normal Bone Density   Osteopenia   Osteoporosis
1200                  1000         890
1000                  1100         650
980                   700          1100
900                   800          900
750                   500          400
800                   700          350

Is there a statistically significant difference in mean calcium intake in patients with normal bone density as compared to patients with osteopenia and osteoporosis? We will run the ANOVA using the five-step approach.

H0: μ1 = μ2 = μ3;  H1: Means are not all equal;  α = 0.05

In order to determine the critical value of F we need degrees of freedom, df1 = k-1 and df2 = N-k. In this example, df1 = k-1 = 3-1 = 2 and df2 = N-k = 18-3 = 15. The critical value is 3.68 and the decision rule is as follows: Reject H0 if F > 3.68.

To organize our computations we will complete the ANOVA table. In order to compute the sums of squares we must first compute the sample means for each group and the overall mean.  

              Normal Bone Density   Osteopenia   Osteoporosis
n             6                     6            6
Group mean    938.3                 800.0        715.0

If we pool all N=18 observations, the overall mean is 817.8.

We can now compute SSB = Σ nj(X̄j − X̄)².

Substituting:

SSB = 6(938.3 − 817.8)² + 6(800.0 − 817.8)² + 6(715.0 − 817.8)² ≈ 152,477.7 (carrying the unrounded group means in the computation).

SSE requires computing the squared differences between each observation and its group mean. We will compute SSE in parts. For the participants with normal bone density:

X        X − 938.33   (X − 938.33)²
1200     261.6667     68,486.9
1000     61.6667      3,806.9
980      41.6667      1,738.9
900      -38.3333     1,466.9
750      -188.333     35,456.9
800      -138.333     19,126.9
Totals   0            130,083.3

For participants with osteopenia:

X        X − 800   (X − 800)²
1000     200       40,000
1100     300       90,000
700      -100      10,000
800      0         0
500      -300      90,000
700      -100      10,000
Totals   0         240,000

For participants with osteoporosis:

X        X − 715   (X − 715)²
890      175       30,625
650      -65       4,225
1100     385       148,225
900      185       34,225
400      -315      99,225
350      -365      133,225
Totals   0         449,750

Therefore, SSE = 130,083.3 + 240,000 + 449,750 = 819,833.3.

We can now construct the ANOVA table.

Source of Variation    Sums of Squares (SS)    Degrees of Freedom (df)    Mean Squares (MS)    F
Between Treatments     152,477.7               2                          76,238.6             1.395
Error (or Residual)    819,833.3               15                         54,655.5
Total                  972,311.0               17

We do not reject H0 because 1.395 < 3.68. We do not have statistically significant evidence at α = 0.05 to show that there is a difference in mean calcium intake in patients with normal bone density as compared to patients with osteopenia and osteoporosis. Are the differences in mean calcium intake clinically meaningful? If so, what might account for the lack of statistical significance?
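
As with the weight loss example, this analysis can be reproduced in R with aov() using the calcium intakes from the table above; the resulting F statistic matches the hand computation up to rounding.

# Daily calcium intake (mg/day) by bone density group
calcium <- data.frame(
  group  = factor(rep(c("Normal", "Osteopenia", "Osteoporosis"), each = 6)),
  intake = c(1200, 1000, 980, 900, 750, 800,   # normal bone density
             1000, 1100, 700, 800, 500, 700,   # osteopenia
             890, 650, 1100, 900, 400, 350)    # osteoporosis
)
fit <- aov(intake ~ group, data = calcium)
summary(fit)   # F is approximately 1.39, well below the critical value of 3.68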

One-Way ANOVA in R

The video below by Mike Marin demonstrates how to perform analysis of variance in R. It also covers some other statistical issues, but the initial part of the video will be useful to you.

Two-Factor ANOVA

The ANOVA tests described above are called one-factor ANOVAs. There is one treatment or grouping factor with k > 2 levels and we wish to compare the means across the different categories of this factor. The factor might represent different diets, different classifications of risk for disease (e.g., osteoporosis), different medical treatments, different age groups, or different racial/ethnic groups. There are situations where it may be of interest to compare means of a continuous outcome across two or more factors. For example, suppose a clinical trial is designed to compare five different treatments for joint pain in patients with osteoarthritis. Investigators might also hypothesize that there are differences in the outcome by sex. This is an example of a two-factor ANOVA where the factors are treatment (with 5 levels) and sex (with 2 levels). In the two-factor ANOVA, investigators can assess whether there are differences in means due to the treatment, by sex or whether there is a difference in outcomes by the combination or interaction of treatment and sex. Higher order ANOVAs are conducted in the same way as one-factor ANOVAs presented here and the computations are again organized in ANOVA tables with more rows to distinguish the different sources of variation (e.g., between treatments, between men and women). The following example illustrates the approach.

Consider the clinical trial outlined above in which three competing treatments for joint pain are compared in terms of their mean time to pain relief in patients with osteoarthritis. Because investigators hypothesize that there may be a difference in time to pain relief in men versus women, they randomly assign 15 participating men to one of the three competing treatments and randomly assign 15 participating women to one of the three competing treatments (i.e., stratified randomization). Participating men and women do not know to which treatment they are assigned. They are instructed to take the assigned medication when they experience joint pain and to record the time, in minutes, until the pain subsides. The data (times to pain relief) are shown below and are organized by the assigned treatment and sex of the participant.

Table of Time to Pain Relief by Treatment and Sex

Treatment   Male   Female
A           12     21
A           15     19
A           16     18
A           17     24
A           14     25
B           14     21
B           17     20
B           19     23
B           20     27
B           17     25
C           25     37
C           27     34
C           29     36
C           24     26
C           22     29

The analysis in two-factor ANOVA is similar to that illustrated above for one-factor ANOVA. The computations are again organized in an ANOVA table, but the total variation is partitioned into that due to the main effect of treatment, the main effect of sex and the interaction effect. The results of the analysis are shown below (and were generated with a statistical computing package - here we focus on interpretation). 
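
In R, this kind of model can be fit with aov() using a formula that includes both factors and their interaction. The sketch below uses the times from the table above (coded here in long format; the data frame and column names are our own), and its output matches the ANOVA table that follows up to rounding.

# Time to pain relief (minutes): 3 treatments x 2 sexes, 5 participants per cell
relief <- data.frame(
  treatment = factor(rep(c("A", "B", "C"), each = 10)),
  sex       = factor(rep(rep(c("Male", "Female"), each = 5), times = 3)),
  time      = c(12, 15, 16, 17, 14,  21, 19, 18, 24, 25,   # Treatment A: men, then women
                14, 17, 19, 20, 17,  21, 20, 23, 27, 25,   # Treatment B: men, then women
                25, 27, 29, 24, 22,  37, 34, 36, 26, 29)   # Treatment C: men, then women
)
fit <- aov(time ~ treatment * sex, data = relief)  # main effects plus the interaction
summary(fit)   # treatment F ≈ 34.8, sex F ≈ 33.5, treatment:sex F ≈ 0.1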

 ANOVA Table for Two-Factor ANOVA

Source of Variation    Sums of Squares (SS)    Degrees of Freedom (df)    Mean Squares (MS)    F      P-Value
Model                  967.0                   5                          193.4                20.7   0.0001
Treatment              651.5                   2                          325.7                34.8   0.0001
Sex                    313.6                   1                          313.6                33.5   0.0001
Treatment * Sex        1.9                     2                          0.9                  0.1    0.9054
Error or Residual      224.4                   24                         9.4
Total                  1191.4                  29

There are 4 statistical tests in the ANOVA table above. The first test is an overall test to assess whether there is a difference among the 6 cell means (cells are defined by treatment and sex). The F statistic is 20.7 and is highly statistically significant with p=0.0001. When the overall test is significant, focus then turns to the factors that may be driving the significance (in this example, treatment, sex or the interaction between the two). The next three statistical tests assess the significance of the main effect of treatment, the main effect of sex and the interaction effect. In this example, there is a highly significant main effect of treatment (p=0.0001) and a highly significant main effect of sex (p=0.0001). The interaction between the two does not reach statistical significance (p=0.91). The table below contains the mean times to pain relief in each of the treatments for men and women (Note that each sample mean is computed on the 5 observations measured under that experimental condition).  

Mean Time to Pain Relief by Treatment and Gender

Treatment   Men    Women
A           14.8   21.4
B           17.4   23.2
C           25.4   32.4

Treatment A appears to be the most efficacious treatment for both men and women. The mean times to relief are lower in Treatment A for both men and women and highest in Treatment C for both men and women. Across all treatments, women report longer times to pain relief (See below).  

Graph of two-factor ANOVA

Notice that there is the same pattern of time to pain relief across treatments in both men and women (treatment effect). There is also a sex effect - specifically, time to pain relief is longer in women in every treatment.  
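
A plot like the one described here can be produced in R with the base interaction.plot() function, reusing the relief data frame from the earlier sketch:

# Mean time to pain relief by treatment, with separate lines for men and women
interaction.plot(x.factor = relief$treatment,
                 trace.factor = relief$sex,
                 response = relief$time,
                 xlab = "Treatment",
                 ylab = "Mean time to pain relief (minutes)",
                 trace.label = "Sex")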

Suppose that the same clinical trial is replicated in a second clinical site and the following data are observed.

Table - Time to Pain Relief by Treatment and Sex - Clinical Site 2

Treatment   Male   Female
A           22     21
A           25     19
A           26     18
A           27     24
A           24     25
B           14     21
B           17     20
B           19     23
B           20     27
B           17     25
C           15     37
C           17     34
C           19     36
C           14     26
C           12     29

The ANOVA table for the data measured in clinical site 2 is shown below.

Table - Summary of Two-Factor ANOVA - Clinical Site 2

Source of Variation    Sums of Squares (SS)    Degrees of Freedom (df)    Mean Squares (MS)    F      P-Value
Model                  907.0                   5                          181.4                19.4   0.0001
Treatment              71.5                    2                          35.7                 3.8    0.0362
Sex                    313.6                   1                          313.6                33.5   0.0001
Treatment * Sex        521.9                   2                          260.9                27.9   0.0001
Error or Residual      224.4                   24                         9.4
Total                  1131.4                  29

Notice that the overall test is significant (F=19.4, p=0.0001), there is a significant treatment effect, sex effect and a highly significant interaction effect. The table below contains the mean times to relief in each of the treatments for men and women.  

Table - Mean Time to Pain Relief by Treatment and Gender - Clinical Site 2

Treatment   Men    Women
A           24.8   21.4
B           17.4   23.2
C           15.4   32.4

Notice that now the differences in mean time to pain relief among the treatments depend on sex. Among men, the mean time to pain relief is highest in Treatment A and lowest in Treatment C. Among women, the reverse is true. This is an interaction effect (see below).  

Graphic display of the results in the preceding table

Notice above that the treatment effect varies depending on sex. Thus, we cannot summarize an overall treatment effect (in men, treatment C is best, in women, treatment A is best).    

When interaction effects are present, some investigators do not examine main effects (i.e., do not test for treatment effect because the effect of treatment depends on sex). This issue is complex and is discussed in more detail in a later module. 

Conduct and Interpret a One-Way ANOVA

What is the One-Way ANOVA?

One-Way ANOVA (Analysis Of Variance) is a statistical method used to determine whether there are significant differences between the averages of two or more unrelated groups. This technique is particularly useful when you want to compare the effect of one single factor (independent variable) across different groups on a specific outcome (dependent variable).

Beyond Basic Comparison

While ANOVA is primarily used to compare differences, it often goes a step further by exploring cause-and-effect relationships. It suggests that the differences observed among groups are due to one or more controlled factors. Essentially, these factors categorize the data points into groups, leading to variations in the average outcomes of these groups.

Example Simplified

Imagine we’re curious about whether there’s a difference in hair length between genders. We gather a group of twenty undergraduate students, half identified as female and half as male, and measure their hair length.

  • Conservative Approach: A cautious statistician might say, “After measuring the hair length of ten female and ten male students, our analysis shows that, on average, female students have significantly longer hair than male students.”
  • Assertive Approach: A more assertive statistician might interpret the results to mean that gender directly influences hair length, suggesting a cause-and-effect relationship.

Understanding ANOVA’s Role

Most statisticians lean towards the assertive approach, viewing ANOVA as a tool for analyzing dependencies. This perspective sees ANOVA as not just comparing averages but testing the influence of one or more factors on an outcome. In statistical language, it’s about examining how independent variables (like gender in our example) affect dependent variables (such as hair length), assuming a functional relationship (Y = f(x1, x2, x3, … xn)).

In essence, One-Way ANOVA is a powerful method for not only identifying significant differences between groups but also for hinting at potential underlying causes for these differences. It’s a foundational tool in the statistical analysis of data, enabling researchers to draw meaningful conclusions about the effects of various factors on specific outcomes.


The ANOVA is a popular test; it is the test to use when conducting experiments.  This is due to the fact that it only requires a nominal scale for the independent variables – other multivariate tests (e.g., regression analysis) require a continuous-level scale.  The following table shows the required scales for some selected tests.

                                   Independent Variable
                                   Metric                  Non-metric
Dependent Variable   Metric        Regression              ANOVA
                     Non-metric    Discriminant Analysis   χ² (Chi-Square)

The F-test, the T-test, and the MANOVA are all similar to the ANOVA.  The F-test is another name for an ANOVA that only compares the statistical means in two groups.  This happens if the independent variable for the ANOVA has only two factor steps, for example male or female as a gender.

The T-test compares the means of two (and only two) groups when the variances are not equal.  The equality of variances (also called homoscedasticity or homogeneity) is one of the main assumptions of the ANOVA (see assumptions, Levene Test, Bartlett Test).  MANOVA stands for Multivariate Analysis of Variance.  Whereas the ANOVA can have one or more independent variables, it always has only one dependent variable.  On the other hand, the MANOVA can have two or more dependent variables.

Examples for typical questions the ANOVA answers are as follows:

  • Medicine – Does a drug work? Does the average life expectancy significantly differ between the three groups that received the drug versus the established product versus the control?
  • Sociology – Are rich people happier? Do different income classes report a significantly different satisfaction with life?
  • Management Studies – What makes a company more profitable? A one, three or five-year strategy cycle?

The One-Way ANOVA in SPSS

Let’s consider our research question from the Education studies example.  Do the standardized math test scores differ between students that passed the exam and students that failed the final exam? This question indicates that our independent variable is the exam result (fail vs.  pass) and our dependent variable is the score from the math test.  We must now check the assumptions.

First we examine the normality of the dependent variable.  We can check graphically either with a histogram (Analyze/Descriptive Statistics/Frequencies… and then in the menu Charts…) or with a Q-Q-Plot (Analyze/Descriptive Statistics/Q-Q-Plot…).  Both plots show a somewhat normal distribution, with a skew around the mean.


Secondly, we can test for normality with the Kolmogorov-Smirnov goodness of fit test (Analyze/Nonparametric Tests/Legacy Dialogs/1-Sample K-S…).  An alternative to the K-S test is the Chi-Square goodness of fit test, but the K-S test is more robust for continuous-level variables.

The K-S test is not significant (p = 0.075), thus we cannot reject the null hypothesis that the sample distribution is normal.  The K-S test is one of the few tests where a non-significant result (p > 0.05) is the desired outcome.

If normality is not present, we could exclude the outliers to fix the problem, center the variable by deducting the mean, or apply a non-linear transformation to the variable creating an index.

The ANOVA can be found in SPSS in Analyze/Compare Means/One Way ANOVA .


In the ANOVA dialog we need to specify our model.  As described in the research question we want to test, the math test score is our dependent variable and the exam result is our independent variable.  This would be enough for a basic analysis.  But the dialog box has a couple more options around Contrasts, post hoc tests (also called multiple comparisons), and Options.


In the dialog box options we can specify additional statistics.  If you find it useful you might include standard descriptive statistics.  Generally you should select the Homogeneity of variance test (which is the Levene test of homoscedasticity), because as we find in our decision tree the outcome of this test is the criterion that decides between the t-test and the ANOVA.


Post Hoc Tests

Post Hoc tests are useful if your independent variable includes more than two groups.  In our example the independent variable just specifies the outcome of the final exam on two factor levels – pass or fail.  If more than two factor levels are given it might be useful to run pairwise tests to test which differences between groups are significant.  Because executing several pairwise tests in one analysis inflates the risk of a Type I error, the Bonferroni adjustment should be selected, which corrects for multiple pairwise comparisons.  Another test method commonly employed is the Student-Newman-Keuls test (or S-N-K for short), which pools the groups that do not differ significantly from each other.  This improves the reliability of the post hoc comparison because it increases the sample size used in the comparison.


The last dialog box is contrasts.  Contrasts are differences in mean scores.  It allows you to group multiple groups into one and test the average mean of the two groups against our third group.  Please note that the contrast is not always the mean of the pooled groups! Contrast = (mean first group + mean second group)/2.  It is only equal to the pooled mean, if the groups are of equal size.  It is also possible to specify weights for the contrasts, e.g., 0.7 for group 1 and 0.3 for group 2.  We do not specify contrasts for this demonstration.



7   ANOVA

Matthew J. C. Crump

A fun bit of stats history (Salsburg 2001). Sir Ronald Fisher invented the ANOVA, which we learn about in this section. He wanted to publish his new test in the journal Biometrika. The editor at the time was Karl Pearson (remember Pearson’s \(r\) for correlation?). Pearson and Fisher were apparently not on good terms; they didn’t like each other. Pearson refused to publish Fisher’s new test. So, Fisher eventually published his work in the Journal of Agricultural Science. Funnily enough, the feud continued onto the next generation. Years after Fisher published his ANOVA, Karl Pearson’s son Egon Pearson and Jerzy Neyman revamped Fisher’s ideas, and re-cast them into what is commonly known as null vs. alternative hypothesis testing. Fisher didn’t like this very much.

We present the ANOVA in the Fisherian sense, and at the end describe the Neyman-Pearson approach that invokes the concept of null vs. alternative hypotheses.

7.1 ANOVA is Analysis of Variance

ANOVA stands for Analysis Of Variance. It is a widely used technique for assessing the likelihood that differences found between means in sample data could be produced by chance. You might be thinking, well don’t we have \(t\) -tests for that? Why do we need the ANOVA, what do we get that’s new that we didn’t have before?

What’s new with the ANOVA, is the ability to test a wider range of means beyond just two. In all of the \(t\) -test examples we were always comparing two things. For example, we might ask whether the difference between two sample means could have been produced by chance. What if our experiment had more than two conditions or groups? We would have more than 2 means. We would have one mean for each group or condition. That could be a lot depending on the experiment. How would we compare all of those means? What should we do, run a lot of \(t\) -tests, comparing every possible combination of means? Actually, you could do that. Or, you could do an ANOVA.

In practice, we will combine both the ANOVA test and \(t\) -tests when analyzing data with many sample means (from more than two groups or conditions). Just like the \(t\) -test, there are different kinds of ANOVAs for different research designs. There is one for between-subjects designs, and a slightly different one for repeated measures designs. We talk about both, beginning with the ANOVA for between-subjects designs.

7.2 One-factor ANOVA

The one-factor ANOVA is sometimes also called a between-subjects ANOVA, an independent factor ANOVA, or a one-way ANOVA (which is a bit of a misnomer as we discuss later). The critical ingredient for a one-factor, between-subjects ANOVA, is that you have one independent variable, with at least two-levels. When you have one IV with two levels, you can run a \(t\) -test. You can also run an ANOVA. Interestingly, they give you almost the exact same results. You will get a \(p\) -value from both tests that is identical (they are really doing the same thing under the hood). The \(t\) -test gives a \(t\) -value as the important sample statistic. The ANOVA gives you the \(F\) -value (for Fisher, the inventor of the test) as the important sample statistic. It turns out that \(t^2\) equals \(F\) , when there are only two groups in the design. They are the same test. Side-note, it turns out they are all related to Pearson’s r too (but we haven’t written about this relationship yet in this textbook).

Remember that \(t\) is computed directly from the data. It’s like a mean and standard error that we measure from the sample. In fact it’s the mean difference divided by the standard error of the sample. It’s just another descriptive statistic isn’t it.

The same thing is true about \(F\) . \(F\) is computed directly from the data. In fact, the idea behind \(F\) is the same basic idea that goes into making \(t\) . Here is the general idea behind the formula, it is again a ratio of the effect we are measuring (in the numerator), and the variation associated with the effect (in the denominator).

\(\text{name of statistic} = \frac{\text{measure of effect}}{\text{measure of error}}\)

\(\text{F} = \frac{\text{measure of effect}}{\text{measure of error}}\)

The difference with \(F\) , is that we use variances to describe both the measure of the effect and the measure of error. So, \(F\) is a ratio of two variances.

Remember what we said about how these ratios work. When the variance associated with the effect is the same size as the variance associated with sampling error, we will get two of the same numbers, this will result in an \(F\) -value of 1. When the variance due to the effect is larger than the variance associated with sampling error, then \(F\) will be greater than 1. When the variance associated with the effect is smaller than the variance associated with sampling error, \(F\) will be less than one.

Let’s rewrite in plainer English. We are talking about two concepts that we would like to measure from our data. 1) A measure of what we can explain, and 2) a measure of error, or stuff about our data we can’t explain. So, the \(F\) formula looks like this:

\(\text{F} = \frac{\text{Can Explain}}{\text{Can't Explain}}\)

When we can explain as much as we can’t explain, \(F\) = 1. This isn’t that great of a situation for us to be in. It means we have a lot of uncertainty. When we can explain much more than we can’t we are doing a good job, \(F\) will be greater than 1. When we can explain less than what we can’t, we really can’t explain very much, \(F\) will be less than 1. That’s the concept behind making \(F\) .

If you saw an \(F\) in the wild, and it was .6, then you would automatically know the researchers couldn’t explain much of their data. If you saw an \(F\) of 5, then you would know the researchers could explain 5 times more than they couldn’t; that’s pretty good. And the point of this is to give you an intuition about the meaning of an \(F\)-value, even before you know how to compute it.

7.2.1 Computing the \(F\) -value

Fisher’s ANOVA is very elegant in my opinion. It starts us off with a big problem we always have with data. We have a lot of numbers, and there is a lot of variation in the numbers, what to do? Wouldn’t it be nice to split up the variation into two kinds, or sources? If we could know what parts of the variation were being caused by our experimental manipulation, and what parts were being caused by sampling error, we would be making really good progress. We would be able to know if our experimental manipulation was causing more change in the data than sampling error, or chance alone. If we could measure those two parts of the total variation, we could make a ratio, and then we would have an \(F\) value. This is what the ANOVA does. It splits the total variation in the data into two parts. The formula is:

Total Variation = Variation due to Manipulation + Variation due to sampling error

This is a nice idea, but it is also vague. We haven’t specified our measure of variation. What should we use?

Remember the sums of squares that we used to make the variance and the standard deviation? That’s what we’ll use. Let’s take another look at the formula, using sums of squares for the measure of variation:

\(SS_\text{total} = SS_\text{Effect} + SS_\text{Error}\)

7.2.2 SS Total

The total sums of squares, or \(SS\text{Total}\) is a way of thinking about all of the variation in a set of data. It’s pretty straightforward to measure. No tricky business. All we do is find the difference between each score and the grand mean, then we square the differences and add them all up.

Let’s imagine we had some data in three groups, A, B, and C. For example, we might have 3 scores in each group. The data could look like this:

groups scores diff diff_squared
A 20 13 169
A 11 4 16
A 2 -5 25
B 6 -1 1
B 2 -5 25
B 7 0 0
C 2 -5 25
C 11 4 16
C 2 -5 25
Sums 63 0 302
Means 7 0 33.5555555555556

The data is organized in long format, so that each row is a single score. There are three scores for the A, B, and C groups. The mean of all of the scores is called the Grand Mean . It’s calculated in the table, the Grand Mean = 7.

We also calculated all of the difference scores from the Grand Mean . The difference scores are in the column titled diff . Next, we squared the difference scores, and those are in the next column called diff_squared .

Remember, the difference scores are a way of measuring variation. They represent how far each number is from the Grand Mean. If the Grand Mean represents our best guess at summarizing the data, the difference scores represent the error between the guess and each actual data point. The only problem with the difference scores is that they sum to zero (because the mean is the balancing point in the data). So, it is convenient to square the difference scores, this turns all of them into positive numbers. The size of the squared difference scores still represents error between the mean and each score. And, the squaring operation exacerbates the differences as the error grows larger (squaring a big number makes a really big number, squaring a small number still makes a smallish number).

OK fine! We have the squared deviations from the grand mean, we know that they represent the error between the grand mean and each score. What next? SUM THEM UP!

When you add up all of the individual squared deviations (difference scores) you get the sums of squares. That’s why it’s called the sums of squares (SS).

Now, we have the first part of our answer:

\(SS_\text{total} = 302\) and

\(302 = SS_\text{Effect} + SS_\text{Error}\)

What next? If you think back to what you learned about algebra, and solving for X, you might notice that we don’t really need to find the answers to both missing parts of the equation. We only need one, and we can solve for the other. For example, if we found \(SS_\text{Effect}\) , then we could solve for \(SS_\text{Error}\) .
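
A quick way to check this number is to compute it directly in R from the scores in the table above:

# All nine scores from groups A, B, and C
scores <- c(20, 11, 2,  6, 2, 7,  2, 11, 2)
grand_mean <- mean(scores)                # 7, the Grand Mean
ss_total <- sum((scores - grand_mean)^2)  # 302
ss_total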

7.2.3 SS Effect

\(SS_\text{Total}\) gave us a number representing all of the change in our data, how all the scores are different from the grand mean.

What we want to do next is estimate how much of the total change in the data might be due to the experimental manipulation. For example, if we ran an experiment that causes change in the measurement, then the means for each group will be different from each other. As a result, the manipulation forces change onto the numbers, and this will naturally mean that some part of the total variation in the numbers is caused by the manipulation.

The way to isolate the variation due to the manipulation (also called effect) is to look at the means in each group, and calculate the difference scores between each group mean and the grand mean, and then sum the squared deviations to find \(SS_\text{Effect}\) .

Consider this table, showing the calculations for \(SS_\text{Effect}\) .

groups scores means diff diff_squared
A 20 11 4 16
A 11 11 4 16
A 2 11 4 16
B 6 5 -2 4
B 2 5 -2 4
B 7 5 -2 4
C 2 5 -2 4
C 11 5 -2 4
C 2 5 -2 4
Sums 63 63 0 72
Means 7 7 0 8

Notice we created a new column called means . For example, the mean for group A was 11. You can see there are three 11s, one for each observation in row A. The means for group B and C happen to both be 5. So, the rest of the numbers in the means column are 5s.

What we are doing here is thinking of each score in the data from the viewpoint of the group means. The group means are our best attempt to summarize the data in those groups. From the point of view of the mean, all of the numbers are treated as the same. The mean doesn’t know how far off it is from each score, it just knows that all of the scores are centered on the mean.

Let’s pretend you are the mean for group A. That means you are an 11. Someone asks you “hey, what’s the score for the first data point in group A?”. Because you are the mean, you say, I know that, it’s 11. “What about the second score?”…it’s 11… they’re all 11, so far as I can tell…“Am I missing something…”, asked the mean.

Now that we have converted each score to its mean value we can find the differences between each mean score and the grand mean, then square them, then sum them up. We did that, and found that the \(SS_\text{Effect} = 72\).

\(SS_\text{Effect}\) represents the amount of variation that is caused by differences between the means. I also refer to this as the amount of variation that the researcher can explain (by the means, which represent differences between groups or conditions that were manipulated by the researcher).

Notice also that \(SS_\text{Effect} = 72\) , and that 72 is smaller than \(SS_\text{total} = 302\) . That is very important. \(SS_\text{Effect}\) by definition can never be larger than \(SS_\text{total}\) .

7.2.4 SS Error

Great, we made it to SS Error. We already found SS Total, and SS Effect, so now we can solve for SS Error just like this:

Switching the equation around:

\(SS_\text{Error} = SS_\text{total} - SS_\text{Effect}\)

\(SS_\text{Error} = 302 - 72 = 230\)

We could stop here and show you the rest of the ANOVA, we’re almost there. But, the next step might not make sense unless we show you how to calculate \(SS_\text{Error}\) directly from the data, rather than just solving for it. We should do this just to double-check our work anyway.

groups scores means diff diff_squared
A 20 11 -9 81
A 11 11 0 0
A 2 11 9 81
B 6 5 -1 1
B 2 5 3 9
B 7 5 -2 4
C 2 5 3 9
C 11 5 -6 36
C 2 5 3 9
Sums 63 63 0 230
Means 7 7 0 25.5555555555556

Alright, we did almost the same thing as we did to find \(SS_\text{Effect}\). Can you spot the difference? This time for each score we first found the group mean, then we found the error in the group mean estimate for each score. In other words, the values in the \(diff\) column are the differences between each score and its group mean. The values in the diff_squared column are the squared deviations. When we sum up the squared deviations, we get another Sums of Squares, this time it’s the \(SS_\text{Error}\). This is an appropriate name, because these deviations are the ones that the group means can’t explain!
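
The same bookkeeping can be done in a few lines of R, which also confirms that the two pieces add back up to the total:

# Scores labeled by group
groups <- rep(c("A", "B", "C"), each = 3)
scores <- c(20, 11, 2,  6, 2, 7,  2, 11, 2)
group_means <- ave(scores, groups)                  # each score replaced by its group mean
ss_effect <- sum((group_means - mean(scores))^2)    # 72
ss_error  <- sum((scores - group_means)^2)          # 230
ss_effect + ss_error                                # 302, the SS total from before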

7.2.5 Degrees of freedom

Degrees of freedom come into play again with ANOVA. This time, their purpose is a little bit more clear. \(Df\) s can be fairly simple when we are doing a relatively simple ANOVA like this one, but they can become complicated when designs get more complicated.

Let’s talk about the degrees of freedom for the \(SS_\text{Effect}\) and \(SS_\text{Error}\) .

The formula for the degrees of freedom for \(SS_\text{Effect}\) is

\(df_\text{Effect} = \text{Groups} -1\) , where Groups is the number of groups in the design.

In our example, there are 3 groups, so the df is 3-1 = 2. You can think of the df for the effect this way. When we estimate the grand mean (the overall mean), we are taking away a degree of freedom for the group means. Two of the group means can be anything they want (they have complete freedom), but in order for all three to be consistent with the Grand Mean, the last group mean has to be fixed.

The formula for the degrees of freedom for \(SS_\text{Error}\) is

\(df_\text{Error} = \text{scores} - \text{groups}\) , or the number of scores minus the number of groups. We have 9 scores and 3 groups, so our \(df\) for the error term is 9-3 = 6. Remember, when we computed the difference score between each score and its group mean, we had to compute three means (one for each group) to do that. So, that reduces the degrees of freedom by 3. 6 of the difference scores could be anything they want, but the last 3 have to be fixed to match the means from the groups.

7.2.6 Mean Squared Error

OK, so we have the degrees of freedom. What’s next? There are two steps left. First we divide the \(SS\) es by their respective degrees of freedom to create something new called Mean Squared Error. Let’s talk about why we do this.

First of all, remember we are trying to accomplish this goal:

We want to build a ratio that divides a measure of an effect by a measure of error. Perhaps you noticed that we already have a measure of an effect and error! How about the \(SS_\text{Effect}\) and \(SS_\text{Error}\) . They both represent the variation due to the effect, and the leftover variation that is unexplained. Why don’t we just do this?

\(\frac{SS_\text{Effect}}{SS_\text{Error}}\)

Well, of course you could do that. What would happen is you can get some really big and small numbers for your inferential statistic. And, the kind of number you would get wouldn’t be readily interpretable like a \(t\) value or a \(z\) score.

The solution is to normalize the \(SS\) terms. Don’t worry, normalize is just a fancy word for taking the average, or finding the mean. Remember, the SS terms are all sums. And, each sum represents a different number of underlying properties.

For example, the \(SS_\text{Effect}\) represents the sum of variation for three means in our study. We might ask the question, well, what is the average amount of variation for each mean…You might think to divide \(SS_\text{Effect}\) by 3, because there are three means, but because we are estimating this property, we divide by the degrees of freedom instead (# groups - 1 = 3-1 = 2). Now we have created something new, it’s called the \(MSE_\text{Effect}\) .

\(MSE_\text{Effect} = \frac{SS_\text{Effect}}{df_\text{Effect}}\)

\(MSE_\text{Effect} = \frac{72}{2} = 36\)

This might look alien and seem a bit complicated. But, it’s just another mean. It’s the mean of the sums of squares for the effect. If this reminds you of the formula for the variance, good memory. The \(MSE_\text{Effect}\) is a measure of variance for the change in the data due to changes in the means (which are tied to the experimental conditions).

The \(SS_\text{Error}\) represents the sum of variation for nine scores in our study. That’s a lot more scores, so the \(SS_\text{Error}\) is often way bigger than \(SS_\text{Effect}\). If we left our SSes this way and divided them, we would almost always get numbers less than one, because the \(SS_\text{Error}\) is so big. What we need to do is bring it down to the average size. So, we might want to divide our \(SS_\text{Error}\) by 9, after all there were nine scores. However, because we are estimating this property, we divide by the degrees of freedom instead (scores - groups = 9-3 = 6). Now we have created something new, it’s called the \(MSE_\text{Error}\).

\(MSE_\text{Error} = \frac{SS_\text{Error}}{df_\text{Error}}\)

\(MSE_\text{Error} = \frac{230}{6} = 38.33\)

7.2.7 Calculate F

Now that we have done all of the hard work, calculating \(F\) is easy:

\(\text{F} = \frac{MSE_\text{Effect}}{MSE_\text{Error}}\)

\(\text{F} = \frac{36}{38.33} = .939\)

7.2.8 The ANOVA TABLE

You might suspect we aren’t totally done here. We’ve walked through the steps of computing \(F\) . Remember, \(F\) is a sample statistic, we computed \(F\) directly from the data. There were a whole bunch of pieces we needed, the dfs, the SSes, the MSEs, and then finally the F.

All of these little pieces are conveniently organized by ANOVA tables. ANOVA tables look like this:

Df Sum Sq Mean Sq F value Pr(>F)
groups 2 72 36.00000 0.9391304 0.4417359
Residuals 6 230 38.33333 NA NA

You are looking at the print-out of an ANOVA summary table from R. Notice, it has columns for \(Df\) , \(SS\) (Sum Sq), \(MSE\) (Mean Sq), \(F\) , and a \(p\) -value. There are two rows. The groups row is for the Effect (what our means can explain). The Residuals row is for the Error (what our means can’t explain). Different programs give slightly different labels, but they are all attempting to present the same information in the ANOVA table. There isn’t anything special about the ANOVA table, it’s just a way of organizing all the pieces. Notice, the MSE for the effect (36) is placed above the MSE for the error (38.333), and this seems natural because we divide 36/38.33 in order to get the \(F\) -value!
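If you wanted to reproduce this table yourself, a sketch like the one below would do it. The nine scores here are illustrative values chosen so that they give the same sums of squares as the chapter example (\(SS_\text{Effect}\) = 72, \(SS_\text{Error}\) = 230); any data with those properties would produce the same table.

scores <- c(20, 11, 2,   6, 2, 7,   2, 11, 2)      # three illustrative scores per group
groups <- factor(rep(c("A", "B", "C"), each = 3))  # group labels
summary(aov(scores ~ groups))                      # prints the ANOVA table shown above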

7.3 What does F mean?

We’ve just noted that the ANOVA has a bunch of numbers that we calculated straight from the data. All except one, the \(p\)-value. We did not calculate the \(p\)-value from the data. Where did it come from, and what does it mean? How do we use it for statistical inference? Just so you don’t get too worried, the \(p\)-value for the ANOVA has the very same general meaning as the \(p\)-value for the \(t\)-test, or the \(p\)-value for any sample statistic. It tells us the probability that we would observe our test statistic or a larger one, under the distribution of no differences (the null).

As we keep saying, \(F\) is a sample statistic. Can you guess what we do with sample statistics in this textbook? We did it for the Crump Test, the Randomization Test, and the \(t\) -test… We make fake data, we simulate it, we compute the sample statistic we are interested in, then we see how it behaves over many replications or simulations.

Let’s do that for \(F\) . This will help you understand what \(F\) really is, and how it behaves. We are going to create the sampling distribution of \(F\) . Once we have that you will be able to see where the \(p\) -values come from. It’s the same basic process that we followed for the \(t\) tests, except we are measuring \(F\) instead of \(t\) .

Here is the set-up, we are going to run an experiment with three levels. In our imaginary experiment we are going to test whether a new magic pill can make you smarter. The independent variable is the number of magic pills you take: 1, 2, or 3. We will measure your smartness using a smartness test. We will assume the smartness test has some known properties, the mean score on the test is 100, with a standard deviation of 10 (and the distribution is normal).

The only catch is that our magic pill does NOTHING AT ALL. The fake people in our fake experiment will all take sugar pills that do absolutely nothing to their smartness. Why would we want to simulate such a bunch of nonsense? The answer is that this kind of simulation is critical for making inferences about chance if you were to conduct a real experiment.

Here are some more details for the experiment. Each group will have 10 different subjects, so there will be a total of 30 subjects. We are going to run this experiment 10,000 times. Each time drawing numbers randomly from the very same normal distribution. We are going to calculate \(F\) from our sample data every time, and then we are going to draw the histogram of \(F\) -values. Figure  7.1 shows the sampling distribution of \(F\) for our situation.

[Figure 7.1: histogram of 10,000 simulated \(F\)-values, showing the sampling distribution of \(F\) under the null]
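Here is one way the simulation could be sketched in R: 10,000 fake experiments, each with 3 groups of 10 scores drawn from the very same normal distribution (mean = 100, sd = 10), saving the \(F\)-value from each one.

set.seed(1)                                    # makes the sketch reproducible
groups <- factor(rep(1:3, each = 10))          # 3 groups of 10 subjects
sim_F <- replicate(10000, {
  smartness <- rnorm(30, mean = 100, sd = 10)  # everyone gets a do-nothing pill
  summary(aov(smartness ~ groups))[[1]][["F value"]][1]
})
hist(sim_F, breaks = 100, xlab = "F", main = "Sampling distribution of F under the null")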

Let’s note a couple things about the \(F\) distribution. 1) The smallest value is 0, and there are no negative values. Does this make sense? \(F\) can never be negative because it is the ratio of two variances, and variances are always positive because of the squaring operation. So, yes, it makes sense that the sampling distribution of \(F\) is always 0 or greater. 2) It does not look normal. No it does not. \(F\) can have many different looking shapes, depending on the degrees of freedom in the numerator and denominator. However, these aspects aren’t too important for now.

Remember, before we talked about some intuitive ideas for understanding \(F\) , based on the idea that \(F\) is a ratio of what we can explain (variance due to mean differences), divided by what we can’t explain (the error variance). When the error variance is higher than the effect variance, then we will always get an \(F\) -value less than one. You can see that we often got \(F\) -values less than one in the simulation. This is sensible, after all we were simulating samples coming from the very same distribution. On average there should be no differences between the means. So, on average, the variance explained by the means should be roughly the same as (or smaller than) the error variance, and the \(F\) -value should hover around one or below it (remember, we are simulating no differences).

At the same time, we do see that some \(F\) -values are larger than 1. There are little bars that we can see going all the way up to about 5. If you were to get an \(F\) -value of 5, you might automatically think, that’s a pretty big \(F\) -value. Indeed it kind of is, it means that you can explain 5 times more variance than you can’t explain. That seems like a lot. You can also see that larger \(F\) -values don’t occur very often. As a final reminder, what you are looking at is how the \(F\) -statistic (measured from each of 10,000 simulated experiments) behaves when the only thing that can cause differences in the means is random sampling error. Just by chance sometimes the means will be different. You are looking at another chance window. These are the \(F\) s that chance can produce.

7.3.1 Making Decisions

We can use the sampling distribution of \(F\) (for the null) to make decisions about the role of chance in a real experiment. For example, we could do the following.

  • Set an alpha criterion of \(p\) = 0.05
  • Find out the critical value for \(F\) , for our particular situation (with our \(df\) s for the numerator and denominator).

Let’s do that. I’ve drawn the line for the critical value onto the histogram in Figure  7.2 :

[Figure 7.2: the same sampling distribution of \(F\), with a line marking the critical value of 3.35]

Alright, now we can see that only 5% of all \(F\) -values from this sampling distribution will be 3.35 or larger. We can use this information.
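If you didn’t want to read the critical value off a simulated histogram, you could also get (essentially) the same number straight from the \(F\) distribution, or from the simulated \(F\)-values in the earlier sketch.

qf(0.95, df1 = 2, df2 = 27)   # critical value of F for alpha = .05, about 3.35
quantile(sim_F, 0.95)         # the same cut-off, estimated from the simulation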

How would we use it? Imagine we ran a real version of this experiment. And, we really used some pills that just might change smartness. If we ran the exact same design, with 30 people in total (10 in each group), we could set an \(F\) criterion of 3.35 for determining whether any of our results reflected a causal change in smartness due to the pills, and not due to random chance. For example, if we found an \(F\)-value larger than 3.35, which happens just less than 5% of the time, we might conclude that random sampling error did not produce the differences between our means. Instead, we might be more confident that the pills actually did something; after all, an \(F\)-value that big doesn’t happen very often, it is unlikely (fewer than 5 times out of 100) to occur by chance.

7.3.2 Fs and means

Up to here we have been building your intuition for understanding \(F\) . We went through the calculation of \(F\) from sample data. We went through the process of simulating thousands of \(F\) s to show you the null distribution. We have not talked so much about what researchers really care about…The MEANS! The actual results from the experiment. Were the means different? That’s often what people want to know. So, now we will talk about the means, and \(F\) , together.

Notice, if I told you I ran an experiment with three groups, testing whether some manipulation changes the behavior of the groups, and I told you that I found a big \(F\), say an \(F\) of 6, and that the \(F\) of 6 had a \(p\)-value of .001, what would you know based on that information alone? You would only know that \(F\)s of 6 don’t happen very often by chance. In fact they only happen 0.1% of the time, that’s hardly at all. If someone told me those values, I would believe that the results they found in their experiment were not likely due to chance. However, I still would not know what the results of the experiment were! Nobody told us what the means were in the different groups, we don’t know what happened!

IMPORTANT: even though we don’t know what the means were, we do know something about them, whenever we get \(F\) -values and \(p\) -values like that (big \(F\) s, and very small associated \(p\) s)… Can you guess what we know? I’ll tell you. We automatically know that there must have been some differences between the means. If there were no differences between the means, then the variance explained by the means (the numerator for \(F\) ) would not be very large. So, we know that there must be some differences, we just don’t know what they are. Of course, if we had the data, all we would need to do is look at the means for the groups (the ANOVA table doesn’t report this, we need to do it as a separate step).

7.3.2.1 ANOVA is an omnibus test

This property of the ANOVA is why the ANOVA is sometimes called the omnibus test . Omnibus is a fun word, it sounds like a bus I’d like to ride. The meaning of omnibus, according to the dictionary, is “comprising several items”. The ANOVA is, in a way, one omnibus test, comprising several little tests.

For example, suppose you had three groups: A, B, and C. You could get differences between A and B, between B and C, and between A and C.

That’s three possible differences you could get. You could run separate \(t\) -tests, to test whether each of those differences you might have found could have been produced by chance. Or, you could run an ANOVA, like what we have been doing, to ask one more general question about the differences. Here is one way to think about what the omnibus test is testing:

Hypothesis of no differences anywhere: \(A = B = C\)

Any differences anywhere:

  • \(A \neq B = C\)
  • \(A = B \neq C\)
  • \(A \neq C = B\)

The \(\neq\) symbol means “does not equal”, it’s an equal sign with a cross through it (no equals allowed!).

How do we put all of this together? Generally, when we get a small \(F\) -value, with a large \(p\) -value, we will not reject the hypothesis of no differences. We will say that we do not have evidence that the means of the three groups are in any way different, and the differences that are there could easily have been produced by chance. When we get a large \(F\) with a small \(p\) -value (one that is below our alpha criterion), we will generally reject the hypothesis of no differences. We would then assume that at least one group mean is not equal to one of the others. That is the omnibus test. Rejecting the null in this way is rejecting the idea there are no differences. But, the \(F\) test still does not tell you which of the possible group differences are the ones that are different.

7.3.2.2 Looking at a bunch of group means

We just ran 10,000 experiments and we didn’t even once look at the group means for any of the experiments. Different patterns of group means under the null are shown in Figure  7.3 for a subset of 10 random simulations.

[Figure 7.3: group means (dots) with standard error bars for 10 simulated experiments under the null]

Whoa, that’s a lot to look at. What is going on here? Each little box represents the outcome of a simulated experiment. The dots are the means for each group (whether subjects took 1, 2, or 3 magic pills). The y-axis shows the mean smartness for each group. The error bars are standard errors of the mean.

You can see that each of the 10 experiments turns out differently. Remember, we sampled 10 numbers for each group from the same normal distribution with mean = 100, and sd = 10. So, we know that the true mean for each sample should actually be 100 every single time. However, the sample means are not 100 every single time because of?… sampling error (our good friend that we talk about all the time).

For most of the simulations the error bars are all overlapping, this suggests visually that the means are not different. However, some of them look like they are not overlapping so much, and this would suggest that they are different. This is the siren song of chance (sirens lured sailors to their deaths at sea…beware of the siren call of chance). If we concluded that any of these sets of means had a true difference, we would be committing a type I error. Because we made the simulation, we know that none of these means are actually different. But, when you are running a real experiment, you don’t get to know this for sure.

7.3.2.3 Looking at bar graphs

Let’s look at the exact same graph as above, but this time use bars to visually illustrate the means, instead of dots. We’ll re-do our simulation of 10 experiments, so the pattern will be a little bit different:

[Figure 7.4: group means shown as bars (with standard error bars) for 10 new simulated experiments under the null]

In Figure  7.4 the heights of the bars display the means for each pill group. The pattern across simulations is generally the same. Some of the fake experiments look like there might be differences, and some of them don’t.

7.3.2.4 What mean differences look like when \(F\) is less than 1

We are now giving you some visual experience looking at what means look like from a particular experiment. This is for your stats intuition. We’re trying to improve your data senses.

What we are going to do now is similar to what we did before. Except this time we are going to look at 10 simulated experiments, where all of the \(F\) -values were less than 1. All of these \(F\) -values would also be associated with fairly large \(p\) -values. When \(F\) is less than one, we would not reject the hypothesis of no differences. So, when we look at patterns of means when \(F\) is less than 1, we should see mostly the same means, and no big differences.

[Figure 7.5: group means for 10 simulated experiments that produced \(F\)-values less than 1]

In Figure  7.5 the numbers in the panels now tell us which simulations actually produced \(F\) s of less than 1.

We see here that all the bars aren’t perfectly flat, that’s OK. What’s more important is that for each panel, the error bars for each mean are totally overlapping with all the other error bars. We can see visually that our estimate of the mean for each sample is about the same for all of the bars. That’s good, we wouldn’t make any type I errors here.

7.3.2.5 What mean differences look like when \(F\) > 3.35

Earlier we found that the critical value for \(F\) in our situation was 3.35, this was the location on the \(F\) distribution where only 5% of \(F\) s were 3.35 or greater. We would reject the hypothesis of no differences whenever \(F\) was greater than 3.35. In this case, whenever we did that, we would be making a type I error. That is because we are simulating the distribution of no differences (remember all of our sample means are coming from the exact same distribution). So, now we can take a look at what type I errors look like. In other words, we can run some simulations and look at the pattern in the means, only when \(F\) happens to be 3.35 or greater (this only happens 5% of the time, so we might have to let the computer simulate for a while). Let’s see what that looks like:

[Figure 7.6: group means for simulated experiments that produced \(F\)-values greater than 3.35]

The numbers in the panels now tell us which simulations actually produced \(F\) s that were greater than 3.35.

What do you notice about the pattern of means inside each panel of Figure  7.6 ? Now, every panel shows at least one mean that is different from the others. Specifically, the error bars for one mean do not overlap with the error bars for at least one other mean. This is what mistakes look like. These are all type I errors. They are insidious. When they happen to you by chance, the data really does appear to show a strong pattern, your \(F\) -value is large, and your \(p\) -value is small! It is easy to be convinced by a type I error (it’s the siren song of chance).
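If you wanted to pull these type I error cases out of the earlier simulation sketch, something like this would do it.

crit  <- qf(0.95, df1 = 2, df2 = 27)   # about 3.35
type1 <- sim_F[sim_F > crit]           # simulated experiments that "reached significance"
length(type1) / length(sim_F)          # should be close to 0.05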

7.4 ANOVA on Real Data

We’ve covered many fundamentals about the ANOVA, how to calculate the necessary values to obtain an \(F\) -statistic, and how to interpret the \(F\) -statistic along with its associated \(p\) -value once we have one. In general, you will be conducting ANOVAs and playing with \(F\) s and \(p\) s using software that will automatically spit out the numbers for you. It’s important that you understand what the numbers mean, that’s why we’ve spent time on the concepts. We also recommend that you try to compute an ANOVA by hand at least once. It builds character, and lets you know that you know what you are doing with the numbers.

But, we’ve probably also lost the real thread of all this. The core thread is that when we run an experiment we use our inferential statistics, like ANOVA, to help us determine whether the differences we found are likely due to chance or not. In general, we like to find out that the differences that we find are not due to chance, but instead due to our manipulation.

So, we return to the application of the ANOVA to a real data set with a real question. This is the same one that you will be learning about in the lab. We give you a brief overview here so you know what to expect.

7.4.1 Tetris and bad memories

Yup, you read that right. The research you will learn about tests whether playing Tetris after watching a scary movie can help prevent you from having bad memories from the movie ( James et al. 2015 ) . Sometimes in life people have intrusive memories, and they think about things they’d rather not have to think about. This research looks at one method that could reduce the frequency of intrusive memories.

Here’s what they did. Subjects watched a scary movie, then at the end of the week they reported how many intrusive memories about the movie they had. The mean number of intrusive memories was the measurement (the dependent variable). This was a between-subjects experiment with four groups. Each group of subjects received a different treatment following the scary movie. The question was whether any of these treatments would reduce the number of intrusive memories. All of these treatments occurred after watching the scary movie:

  • No-task control: These participants completed a 10-minute music filler task after watching the scary movie.
  • Reactivation + Tetris: These participants were shown a series of images from the trauma film to reactivate the traumatic memories (i.e., reactivation task). Then, participants played the video game Tetris for 12 minutes.
  • Tetris Only: These participants played Tetris for 12 minutes, but did not complete the reactivation task.
  • Reactivation Only: These participants completed the reactivation task, but did not play Tetris.

For reasons we elaborate on in the lab, the researchers hypothesized that the Reactivation+Tetris group would have fewer intrusive memories over the week than the other groups.

Let’s look at the findings. Note you will learn how to do all of these steps in the lab. For now, we just show the findings and the ANOVA table. Then we walk through how to interpret it.

[Figure 7.7: mean intrusive memories for each of the four treatment groups, with individual subject scores (dots) and standard error bars]

OOooh, look at that. We did something fancy. Figure  7.7 shows the data from the four groups. The height of each bar shows the mean intrusive memories for the week. The dots show the individual scores for each subject in each group (useful to see the spread of the data). The error bars show the standard errors of the mean.

What can we see here? Right away it looks like there is some support for the research hypothesis. The green bar, for the Reactivation + Tetris group had the lowest mean number of intrusive memories. Also, the error bar is not overlapping with any of the other error bars. This implies that the mean for the Reactivation + Tetris group is different from the means for the other groups. And, this difference is not very likely to be due to chance.

We can now conduct the ANOVA on the data to ask the omnibus question. If we get an \(F\)-value with an associated \(p\)-value of less than .05 (the alpha criterion set by the authors), then we can reject the hypothesis of no differences. Let’s see what happens:

Df Sum Sq Mean Sq F value Pr(>F)
Condition 3 114.8194 38.27315 3.794762 0.0140858
Residuals 68 685.8333 10.08578 NA NA

We see the ANOVA table, it’s up there. We could report the results from the ANOVA table like this:

There was a significant main effect of treatment condition, F(3, 68) = 3.79, MSE = 10.08, p=0.014.

We called this a significant effect because the \(p\) -value was less than 0.05. In other words, an \(F\) -value of 3.79 or larger only happens 1.4% of the time when the null is true. Or, differences in the means as large as the ones observed only occur by random chance (sampling error) 1.4% of the time. Because chance rarely produces this kind of result, the researchers made the inference that chance DID NOT produce their differences; instead, they were inclined to conclude that the Reactivation + Tetris treatment really did cause a reduction in intrusive memories. That’s pretty neat.
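For reference, here is roughly what running that ANOVA looks like in R. This is just a sketch: the data frame james_data and its columns intrusions and Condition are hypothetical names used for illustration; you will work with the real data in the lab.

# hypothetical data frame: one row per subject, with the number of intrusive
# memories (intrusions) and the treatment group (Condition)
fit <- aov(intrusions ~ Condition, data = james_data)
summary(fit)   # should reproduce the table above: F(3, 68) = 3.79, p = .014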

7.4.2 Comparing means after the ANOVA

Remember that the ANOVA is an omnibus test, it just tells us whether we can reject the idea that all of the means are the same. The F-test (synonym for ANOVA) that we just conducted suggested we could reject the hypothesis of no differences. As we discussed before, that must mean that there are some differences in the pattern of means.

Generally after conducting an ANOVA, researchers will conduct follow-up tests to compare differences between specific means. We will talk more about this practice throughout the textbook. There are many recommended practices for follow-up tests, and there is a lot of debate about what you should do. We are not going to wade into this debate right now. Instead we are going to point out that you need to do something to compare the means of interest after you conduct the ANOVA, because the ANOVA is just the beginning…It usually doesn’t tell you what you want to know. You might wonder why bother conducting the ANOVA in the first place…Not a terrible question at all. A good question. You will see as we talk about more complicated designs, why ANOVAs are so useful. In the present example, they are just a common first step. There are required next steps, such as what we do next.

How can you compare the difference between two means, from a between-subjects design, to determine whether or not the difference you observed is likely or unlikely to be produced by chance? We covered this one already, it’s the independent \(t\) -test. We’ll do a couple \(t\) -tests, showing the process.

7.4.2.1 Control vs. Reactivation+Tetris

What we really want to know is if Reactivation+Tetris caused fewer intrusive memories…but compared to what? Well, if it did something, the Reactivation+Tetris group should have a smaller mean than the Control group. So, let’s do that comparison:

We found that there was a significant difference between the control group (M=5.11) and Reactivation + Tetris group (M=1.89), t(34) = 2.99, p=0.005.

Above you just saw an example of reporting another \(t\) -test. This sentence does an OK job of telling the reader everything they want to know. It has the means for each group, and the important bits from the \(t\) -test.

More importantly, as we suspected, the difference between the control and Reactivation + Tetris groups was likely not due to chance.
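As a sketch (using the same hypothetical james_data object and column names as before), the comparison could look like this in R, assuming an equal-variance independent-samples \(t\)-test:

control   <- subset(james_data, Condition == "Control")$intrusions              # hypothetical level name
react_tet <- subset(james_data, Condition == "Reactivation+Tetris")$intrusions  # hypothetical level name
t.test(control, react_tet, var.equal = TRUE)   # should give t(34) = 2.99, p = .005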

7.4.2.2 Control vs. Tetris_only

Now we can really start wondering what caused the difference. Was it just playing Tetris? Does just playing Tetris reduce the number of intrusive memories during the week? Let’s compare that to control:

Here we did not find a significant difference between the control group (M=5.11) and the Tetris Only group (M=3.89), p=0.318.

So, it seems that not all of the differences between our means are large enough to be called statistically significant. In particular, a difference this large or larger happens by chance 31.8% of the time.

You could go on doing more comparisons, between all of the different pairs of means. Each time conducting a \(t\) -test, and each time saying something more specific about the patterns across the means than you get to say with the omnibus test provided by the ANOVA.

Usually, it is the pattern of differences across the means that you as a researcher are primarily interested in understanding. Your theories will make predictions about how the pattern turns out (e.g., which specific means should be higher or lower and by how much). So, the practice of doing comparisons after an ANOVA is really important for establishing the patterns in the means.

7.5 ANOVA Summary

We have just finished a rather long introduction to the ANOVA, and the \(F\) -test. The next couple of chapters continue to explore properties of the ANOVA for different kinds of experimental designs. In general, the process to follow for all of the more complicated designs is very similar to what we did here, which boils down to two steps:

  • conduct the ANOVA on the data
  • conduct follow-up tests, looking at differences between particular means

So what’s next…the ANOVA for repeated measures designs. See you in the next chapter.

The ANOVA test (Analysis of Variance) is used to compare the means of different groups. ANOVA is an abbreviation for analysis of variance. The ANOVA analysis is a statistical significance tool designed to evaluate whether or not the null hypothesis can be rejected while testing hypotheses. It is used to determine whether or not the means of three or more groups are equal.

The ANOVA test is used whenever there are more than two independent groups. It examines the variability within groups as well as the variability between groups. The F test provides the ANOVA test statistic.

Table of Contents

  • ANOVA Formula
  • Examples of the Use of the ANOVA Formula
  • ANOVA Table
  • Types of ANOVA Formula
  • ANOVA Formula Examples
  • ANOVA Formula – FAQs

The ANOVA formula is made up of several parts. The best way to tackle an ANOVA test problem is to organize the formulas inside an ANOVA table. Below are the ANOVA formulas.

Source of Variation | Sum of Squares | Degrees of Freedom | Mean Squares | F Value
Between groups | SSB = Σ nj(X̄j – X̄)² | df = k – 1 | MSB = SSB / (k – 1) | F = MSB / MSE (or F = MST / MSE)
Within groups (error) | SSE = Σj Σi (Xij – X̄j)² | df = N – k | MSE = SSE / (N – k) |
Total | SST = SSB + SSE | df = N – 1 | |
  • F = ANOVA Coefficient
  • MSB = Mean of the total of squares between groupings
  • MSW = Mean total of squares within groupings
  • MSE = Mean sum of squares due to error
  • SST = total Sum of squares
  • p = Total number of populations
  • n = The total number of samples in a population
  • SSW = Sum of squares within the groups
  • SSB = Sum of squares between the groups
  • SSE = Sum of squares due to error
  • s = Standard deviation of the samples
  • N = Total number of observations
  • Assume it is necessary to assess whether consuming a specific type of tea will result in a mean weight decrease. Suppose three groups each use a different variety of tea: green tea, Earl Grey tea, and jasmine tea. A one-way ANOVA test would then be used to examine whether any group displayed a mean weight decrease.
  • Assume a poll was undertaken to see if there is a relationship between salary and gender and stress levels during job interviews. A two-way ANOVA will be utilized to carry out such a test.
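To make the formulas in the table above concrete, here is a small sketch in R that computes SSB, SSE, MSB, MSE, and F for an invented three-group data set, and checks the result against R’s built-in aov() function. The scores are made up purely for illustration.

y <- c(5, 7, 6, 8,   9, 11, 10, 12,   4, 6, 5, 7)   # invented scores
g <- factor(rep(c("G1", "G2", "G3"), each = 4))     # k = 3 groups of 4

k <- nlevels(g); N <- length(y)
grand_mean  <- mean(y)
group_means <- tapply(y, g, mean)
n_j         <- tapply(y, g, length)

SSB <- sum(n_j * (group_means - grand_mean)^2)   # between-group sum of squares
SSE <- sum((y - ave(y, g))^2)                    # within-group (error) sum of squares
MSB <- SSB / (k - 1)
MSE <- SSE / (N - k)
F_value <- MSB / MSE

c(SSB = SSB, SSE = SSE, F = F_value)
summary(aov(y ~ g))   # the sums of squares and F value should match the hand computation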

An ANOVA (Analysis of Variance) test table is used to summarize the results of an ANOVA test, which is used to determine if there are any statistically significant differences between the means of three or more independent groups. Here’s a general structure of an ANOVA table:

[Image: a generic ANOVA summary table with columns for Source of Variation, Sum of Squares, Degrees of Freedom, Mean Squares, and F Value]

  • One-Way ANOVA: This test is used to see if there is a variation in the mean values of three or more groups. Such a test is used where the data set has only one independent variable. If the test statistic exceeds the critical value, the null hypothesis is rejected, and the averages of at least two different groups are significant statistically.
  • Two-Way ANOVA: Two independent variables are used in the two-way ANOVA. As a result, it can be viewed as an extension of a one-way ANOVA in which only one variable influences the dependent variable. A two-way ANOVA test is used to determine the main effect of each independent variable and whether there is an interaction effect. Each factor is examined independently to determine the main effect, as in a one-way ANOVA. Furthermore, all components are analyzed at the same time to test the interaction impact.


Example 1: Three different kinds of food are tested on three groups of rats for 5 weeks. The objective is to check the difference in mean weight (in grams) of the rats per week. Apply one-way ANOVA using a 0.05 significance level to the following data:

Food I | Food II | Food III
8 | 4 | 11
12 | 5 | 8
19 | 4 | 7
8 | 6 | 13
6 | 9 | 7
11 | 7 | 9
H0: μ1 = μ2 = μ3
H1: the means are not all equal
Group means: X̄1 = 5, X̄2 = 9, X̄3 = 10; grand mean X̄ = 8
SSB = 6(5 – 8)² + 6(9 – 8)² + 6(10 – 8)² = 84
SSE = 68
MSB = SSB / df1 = 84 / 2 = 42
MSE = SSE / df2 = 68 / 15 ≈ 4.53
f = MSB / MSE = 42 / 4.53 ≈ 9.27
Since f exceeds the critical value of F(2, 15) at the 0.05 level (≈ 3.68), the null hypothesis is rejected.

Example 2: Calculate the ANOVA coefficient for the following data:

Plant | Number | Average span | s
Hibiscus | 5 | 12 | 2
Marigold | 5 | 16 | 1
Rose | 5 | 20 | 4
Plant | n | x̄ | s | s²
Hibiscus | 5 | 12 | 2 | 4
Marigold | 5 | 16 | 1 | 1
Rose | 5 | 20 | 4 | 16

p = 3, n = 5, N = 15, grand mean x̄ = 16
SST (between groups) = Σ n(x̄j − x̄)² = 5(12 − 16)² + 5(16 − 16)² + 5(20 − 16)² = 160
MST = SST / (p − 1) = 160 / 2 = 80
SSE = Σ (n − 1)s² = 4(4) + 4(1) + 4(16) = 84
MSE = SSE / (N − p) = 84 / 12 = 7
F = MST / MSE = 80 / 7 ≈ 11.43
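Because Example 2 gives only summary statistics (group sizes, means, and standard deviations), the same result can be checked in R directly from those summaries:

n <- 5; means <- c(12, 16, 20); sds <- c(2, 1, 4)
k <- length(means); N <- n * k

grand_mean <- mean(means)                 # groups are equal-sized, so this is the grand mean
SSB <- sum(n * (means - grand_mean)^2)    # 160
SSE <- sum((n - 1) * sds^2)               # 84
(SSB / (k - 1)) / (SSE / (N - k))         # F, about 11.43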

Example 3: The following data show the number of worms quarantined from the GI areas of four groups of muskrats in a carbon tetrachloride anthelmintic study. Conduct a two-way ANOVA test.

I | II | III | IV
338 | 412 | 124 | 389
324 | 387 | 353 | 432
268 | 400 | 469 | 255
147 | 233 | 222 | 133
309 | 212 | 111 | 265
Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square
Between the groups | 62111.6 | 8 | 9078.067
Within the groups | 98787.8 | 16 | 4567.89
Total | 167771.4 | 24 |

F = MST / MSE = 9.4062 / 3.66 = 2.57

Example 4: Enlist the results in APA format after performing ANOVA on the following data set:

n | mean | sd
30 | 50.26 | 10.45
30 | 45.32 | 12.76
30 | 53.67 | 11.47

Variance of first set = (10.45)² = 109.20
Variance of second set = (12.76)² = 162.82
Variance of third set = (11.47)² = 131.56
MS error = (109.20 + 162.82 + 131.56) / 3 = 134.53
MS between = (17.62)(30) = 528.75, where 17.62 is the variance of the three group means
F = MS between / MS error = 528.75 / 134.53 = 3.93
APA write-up: F(2, 87) = 3.93, p < .05, η² = 0.08.
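The same kind of summary-statistic check works for Example 4, which has three groups of n = 30 each:

MS_between <- 30 * var(c(50.26, 45.32, 53.67))       # about 528.8
MS_error   <- mean(c(10.45, 12.76, 11.47)^2)         # about 134.5
F_value    <- MS_between / MS_error                  # about 3.93
pf(F_value, df1 = 2, df2 = 87, lower.tail = FALSE)   # p is about .02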

How does one set the hypothesis for ANOVA?

The equality of the means of distinct groups must be tested in an ANOVA test. As a result, the hypotheses are as follows: Null hypothesis H0: μ1 = μ2 = μ3 = … = μk (all group means are equal). Alternative hypothesis H1: the means are not all equal.

How do you calculate the ANOVA?

ANOVA compares differences between group means. Calculate the grand mean, the between-group variability (SSB), and the within-group variability (SSW), convert them to mean squares using their degrees of freedom, and compare them with the F ratio to determine significance.

What is meant by ANOVA statistic?

The ANOVA test statistic is the F statistic: the ratio of the between-group mean square to the within-group (error) mean square. It is denoted by the letter F (or f).

What is the p value in ANOVA?

In ANOVA, a shared P-value is initially obtained. A significant P-value in the ANOVA test suggests statistical significance in at least one pair’s mean difference. Multiple comparisons are then employed to identify these significant pair(s).

What do you mean by one-way ANOVA?

One-way ANOVA is a form of ANOVA test that is used when just one independent variable is present. It compares the means of the various test groups. A test of this type can only provide information on the statistical significance of the means; it cannot establish which groups have different means.

What is the accuracy of the ANOVA test?

Since it is more versatile and requires fewer observations, ANOVA analysis is sometimes thought to be more accurate than t-testing. It is also better suited to more sophisticated studies than those that can be evaluated with t-tests.



Repeated Measures ANOVA: Definition, Formula, and Example

A  repeated measures ANOVA  is used to determine whether or not there is a statistically significant difference between the means of three or more groups in which the same subjects show up in each group.

A repeated measures ANOVA is typically used in two specific situations:

1. Measuring the mean scores of subjects during three or more time points. For example, you might want to measure the resting heart rate of subjects one month before they start a training program, during the middle of the training program, and one month after the training program to see if there is a significant difference in mean resting heart rate across these three time points.

[Image: example of measuring subjects' resting heart rate at three time points]

2. Measuring the mean scores of subjects under three different conditions. For example, you might have subjects watch three different movies and rate each one based on how much they enjoyed it. 

[Image: example dataset in which the same subjects rate three different movies]

One-Way ANOVA vs. Repeated Measures ANOVA

In a typical one-way ANOVA , different subjects are used in each group. For example, we might ask subjects to rate three movies, just like in the example above, but we use different subjects to rate each movie:

[Image: one-way ANOVA example in which different subjects rate each movie]

In this case, we would conduct a typical one-way ANOVA to test for the difference between the mean ratings of the three movies. 

In real life there are two benefits of using the same subjects across multiple treatment conditions:

1. It’s cheaper and faster for researchers to recruit and pay a smaller number of people to carry out an experiment since they can just obtain data from the same people multiple times. 

2.  We are able to attribute some of the variance in the data to the subjects themselves, which makes it easier to obtain a smaller p-value.

One potential drawback of this type of design is that subjects might get bored or tired if an experiment lasts too long, which could skew the results. For example, subjects might give lower movie ratings to the third movie they watch because they’re tired and ready to go home.

Repeated Measures ANOVA: Example

Suppose we recruit five subjects to participate in a training program. We measure their resting heart rate before participating in a training program, after participating for 4 months, and after participating for 8 months. 

The following table shows the results:

[Table image: resting heart rate for the five subjects before the program, after 4 months, and after 8 months]

We want to know whether there is a difference in mean resting heart rate at these three time points so we conduct a repeated measures ANOVA at the .05 significance level using the following steps:

Step 1. State the hypotheses. 

The null hypothesis (H0): μ1 = μ2 = μ3 (the population means are all equal)

The alternative hypothesis (HA): at least one population mean is different from the rest

Step 2. Perform the repeated measures ANOVA.

We will use the Repeated Measures ANOVA Calculator using the following input:

[Image: input to the repeated measures ANOVA calculator]

Once we click “Calculate” then the following output will automatically appear:

[Image: repeated measures ANOVA output table]

Step 3. Interpret the results. 

From the output table we see that the F test statistic is  9.598  and the corresponding p-value is  0.00749 .

Since this p-value is less than 0.05, we reject the null hypothesis. This means we have sufficient evidence to say that there is a statistically significant difference between the mean resting heart rate at the three different points in time.
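If you prefer to run the analysis in R rather than in the calculator, one common approach uses aov() with an Error() term for subjects. This is only a sketch: the data frame hr_data and its columns subject, time, and heart_rate are hypothetical names, and it assumes the data have been arranged in long format (one row per subject per time point).

hr_data$subject <- factor(hr_data$subject)   # subject identifier
hr_data$time    <- factor(hr_data$time)      # before, 4 months, 8 months

summary(aov(heart_rate ~ time + Error(subject/time), data = hr_data))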

Additional Resources

The following articles explain how to perform a repeated measures ANOVA using different statistical software packages:

  • Repeated Measures ANOVA in Excel
  • Repeated Measures ANOVA in R
  • Repeated Measures ANOVA in Stata
  • Repeated Measures ANOVA in Python
  • Repeated Measures ANOVA in SPSS
  • Repeated Measures ANOVA in Google Sheets
  • Repeated Measures ANOVA By Hand
  • Repeated Measures ANOVA Calculator

