By John M Quick

The R Tutorial Series provides a collection of user-friendly tutorials to people who want to learn how to use R for statistical analysis.




R Tutorial Series: One-Way Omnibus ANOVA

Testing the omnibus hypothesis via one-way ANOVA is a simple process in R. This tutorial explores how R can be used to perform a one-way ANOVA to test the difference between two (or more) group means.

Tutorial Files

Before we begin, you may want to download the sample data (.csv) used in this tutorial. Be sure to right-click and save the file to your R working directory. This dataset contains a hypothetical sample of 60 participants, who are divided into two groups (control and treatment) of 30 each. The values represent a scale that ranges from 1 to 5. For instance, this dataset could be conceptualized as a comparison between two professional training programs, where the control group participated in the company's longstanding program and the treatment group participated in an experimental program. The values could represent the attitudes of employees towards the training programs on a scale from 1 (poor) to 5 (excellent).

Beginning Steps

To begin, we need to read our dataset into R and store its contents in a variable.
  > #read the one-way ANOVA dataset into an R variable using the read.csv(file) function
  > dataOneWay <- read.csv("dataset_ANOVA_OneWay.csv")
  > #display the data
  > dataOneWay

The first ten rows of our one-way ANOVA dataset.
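
Before running the test, it can help to confirm that the data were read in as expected. The following is a minimal sketch rather than part of the original tutorial: because the downloadable file may not be at hand, it builds a stand-in data frame with the same column names used here (Values and Group). The simulated scores, the seed, the group labels "Control" and "Treatment", and the name dataDemo are all assumptions made for illustration.

```r
# Stand-in for dataset_ANOVA_OneWay.csv (simulated values, NOT the tutorial's data)
set.seed(1)
dataDemo <- data.frame(
  Group  = rep(c("Control", "Treatment"), each = 30),
  Values = c(sample(1:5, 30, replace = TRUE, prob = c(.3, .3, .2, .1, .1)),
             sample(1:5, 30, replace = TRUE, prob = c(.1, .1, .2, .3, .3)))
)

# Since R 4.0.0, read.csv() keeps text columns as character by default,
# so convert the grouping column to a factor explicitly
dataDemo$Group <- factor(dataDemo$Group)

# Show the first ten rows and the structure of the data frame
head(dataDemo, 10)
str(dataDemo)

# Count the participants in each group
table(dataDemo$Group)
```

With the real file, the same checks would be applied to dataOneWay after the read.csv() call.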

One-Way ANOVA

Now that our data are ready, we can conduct a one-way omnibus ANOVA test using the anova(object) function.
  > #use anova(object) to test the omnibus hypothesis in one-way ANOVA
  > #is the difference between the group means statistically significant?
  > anova(lm(Values ~ Group, dataOneWay))

Our one-way ANOVA table.

The output of our ANOVA test indicates that the difference between our group means is statistically significant (p < .001). The omnibus test itself shows only that the means differ; together with the group means, it suggests that employee attitudes towards the experimental training program were significantly higher than their attitudes towards the preexisting program.
Note that the object argument in our anova(object) function contained a linear model generated by the lm(formula, data) function. This is the same type of model that is used when conducting linear regression in R. A more detailed explanation of the lm(formula, data) function and examples of its use are available in my Simple Linear Regression article.
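
As a hedged illustration of how the pieces of the ANOVA table can be used, the sketch below pulls the F statistic and p-value out of the table, computes the group means that show the direction of the difference, and checks the known identity that, with exactly two groups, F equals the square of the equal-variance t statistic. The data are simulated (continuous scores rather than the tutorial's 1 to 5 scale), and the names dataDemo, model, and anovaTable are assumptions.

```r
# Simulated two-group data (NOT the tutorial's dataset)
set.seed(42)
dataDemo <- data.frame(
  Group  = factor(rep(c("Control", "Treatment"), each = 30)),
  Values = c(rnorm(30, mean = 2.5, sd = 0.8), rnorm(30, mean = 3.5, sd = 0.8))
)

# Fit the linear model and build the one-way ANOVA table
model <- lm(Values ~ Group, data = dataDemo)
anovaTable <- anova(model)

# Pull the F statistic and p-value for the Group effect out of the table
fValue <- anovaTable["Group", "F value"]
pValue <- anovaTable["Group", "Pr(>F)"]

# The omnibus test does not say which mean is larger; inspect the group means
groupMeans <- tapply(dataDemo$Values, dataDemo$Group, mean)
groupMeans

# With exactly two groups, F equals the squared equal-variance t statistic
tResult <- t.test(Values ~ Group, data = dataDemo, var.equal = TRUE)
all.equal(fValue, unname(tResult$statistic)^2)
```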

One-Way Multiple Group ANOVA

Conducting a one-way omnibus ANOVA with more than two groups works the same way as the two-group test demonstrated above. The only difference is that the values in your dataset would be associated with more than two groups. Consequently, the omnibus hypothesis would test for mean differences across all of the groups. The anova(object) function and the lm(formula, data) function it contains would remain the same.
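
A minimal sketch of the multiple-group case, assuming a hypothetical three-group dataset: the group labels "A", "B", and "C", the simulated scores, and the name dataMulti are illustrative only and do not come from the tutorial's files.

```r
# Simulated three-group data (hypothetical, for illustration only)
set.seed(7)
dataMulti <- data.frame(
  Group  = factor(rep(c("A", "B", "C"), each = 20)),
  Values = c(rnorm(20, mean = 2.5, sd = 0.8),
             rnorm(20, mean = 3.0, sd = 0.8),
             rnorm(20, mean = 3.5, sd = 0.8))
)

# The call is unchanged from the two-group case
multiTable <- anova(lm(Values ~ Group, data = dataMulti))
multiTable

# The Group effect now has (number of groups - 1) degrees of freedom
multiTable["Group", "Df"]
```

A significant result here would indicate only that at least one group mean differs from the others, not which ones.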

Complete One-Way Omnibus ANOVA Example

To see a complete example of how a one-way omnibus ANOVA can be conducted in R, please download the one-way ANOVA example (.txt) file.

4 comments:

  1. John, I think our package granova deserves mention here, as well as some trials. The one-way function, granova.1w presents data in a novel graphic that makes clear to students how extant data play out with respect to the question that drives this method. AFAIK, this is the ONLY graphic that does this. Numerical results, consistent w/ what you present are also provided. The .2w version produces a dynamic graphic, and students, as well as many faculty, find it especially useful to 'see' an anova (for the first time, so they say). Best, Bob rmpruzek@yahoo.com

  2. Nice visualization in 'granova'.
    For sake of completeness, one should maybe also mention function TukeyHSD:

    library(MASS)
    wt.gain <- anorexia[,3] - anorexia[,2]
    AOV <- aov(wt.gain ~ anorexia[, 1])
    plot(TukeyHSD(AOV))

    Greets,
    -ans

  3. Thanks for your comments. One thing to note is that my tutorials are intended to help people learn how to accomplish things in R. They are not intended to describe every possible way that something can be accomplished in R. That said, I do invite you to write either guest sections (e.g. demonstrate TukeyHSD on this dataset) or guest posts (e.g. visualizing ANOVA with granova) that can be added to this blog. Contact me using the button on my main blog (www.johnmquick.com) if you are interested in contributing.

  4. May I know the code for using Granova.1w with the sample data?
