Tutorial Files
Before we begin, you may want to download the sample data (.csv) used in this tutorial. Be sure to right-click and save the file to your R working directory. This dataset contains a hypothetical sample of 30 participants whose interest in voting was measured at three different ages (10, 15, and 20). The interest values are represented on a scale that ranges from 1 to 5 and indicate how interested each participant was in voting at each given age.
Data Setup
Notice that our data are arranged differently for a repeated measures ANOVA. In a typical one-way ANOVA, we would place all of the values of our dependent variable in a single column and identify the level of the independent variable for each value in a second column, as demonstrated in this sample one-way dataset. In a repeated measures ANOVA, we instead treat each level of our independent variable as if it were a variable, thus placing them side by side as columns. Hence, rather than having one vertical column for voting interest, with a second column for age, we have three separate columns for voting interest, one for each age level.
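To make the two layouts concrete, the following is a minimal sketch that builds a small side-by-side (wide) data frame and then stacks it into the single-column (long) layout of a typical one-way ANOVA. The column names Interest10, Interest15, and Interest20 follow the tutorial's dataset, but the Participant column and all of the values are invented for illustration.

```r
# Hypothetical wide-format data: one row per participant, one column per age.
# The values are invented for illustration; they are not the tutorial's dataset.
wide <- data.frame(
  Participant = 1:4,
  Interest10 = c(1, 2, 1, 3),
  Interest15 = c(2, 3, 3, 4),
  Interest20 = c(4, 5, 4, 5)
)

# Stack the three interest columns into one column and record the age level
# of each value in a second column (the typical one-way layout).
long <- data.frame(
  Participant = rep(wide$Participant, times = 3),
  Age = factor(rep(c(10, 15, 20), each = nrow(wide))),
  Interest = c(wide$Interest10, wide$Interest15, wide$Interest20)
)

dim(wide)  # one row per participant
dim(long)  # one row per participant per age
```

The repeated measures approach in this tutorial works from the wide layout, so no stacking is needed for the steps that follow.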
Beginning Steps
To begin, we need to read our dataset into R and store its contents in a variable.
- > #read the dataset into an R variable using the read.csv(file) function
- > dataOneWayRepeatedMeasures <- read.csv("dataset_ANOVA_OneWayRepeatedMeasures.csv")
- > #display the data
- > #notice the atypical column arrangement for repeated measures data
- > dataOneWayRepeatedMeasures
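Before moving on, it can help to confirm that the file was read as expected. The sketch below is one possible check, not part of the original tutorial. So that it runs on its own, it writes a small stand-in CSV with invented values to a temporary file; the names csvPath, checkData, allNumeric, and inRange are hypothetical. With the real dataset, the same checks could be applied to dataOneWayRepeatedMeasures instead.

```r
# Hypothetical stand-in for the tutorial's CSV, written to a temporary file so
# that this sketch runs on its own. The values are invented for illustration.
csvPath <- tempfile(fileext = ".csv")
write.csv(
  data.frame(
    Interest10 = c(1, 2, 1, 3),
    Interest15 = c(2, 3, 3, 4),
    Interest20 = c(4, 5, 4, 5)
  ),
  csvPath,
  row.names = FALSE
)

checkData <- read.csv(csvPath)

# Inspect the column names and types.
str(checkData)

# Confirm that every column is numeric and every value lies on the 1 to 5 scale.
allNumeric <- all(sapply(checkData, is.numeric))
inRange <- all(as.matrix(checkData) >= 1 & as.matrix(checkData) <= 5)
allNumeric
inRange
```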
Preparing the Repeated Measures Factor
Prior to executing our analysis, we must follow a small series of steps in order to prepare our repeated measures factor.
Step 1: Define the Levels
- > #use c() to create a vector that numbers the levels within the repeated measures factor
- > #create a vector numbering the levels for our three voting interest age groups
- > ageLevels <- c(1, 2, 3)
Step 2: Define the Factor
- > #use as.factor() to create a factor using the level vector from step 1
- > #convert the age levels into a factor
- > ageFactor <- as.factor(ageLevels)
Step 3: Define the Frame
- > #use data.frame() to create a data frame using the factor from step 2
- > #convert the age factor into a data frame
- > ageFrame <- data.frame(ageFactor)
Step 4: Bind the Columns
- > #use cbind() to bind the levels of the factor from the original dataset
- > #bind the age columns
- > ageBind <- cbind(dataOneWayRepeatedMeasures$Interest10, dataOneWayRepeatedMeasures$Interest15, dataOneWayRepeatedMeasures$Interest20)
Step 5: Define the Model
- > #use lm() to generate a linear model using the bound factor levels from step 4
- > #generate a linear model using the bound age levels
- > ageModel <- lm(ageBind ~ 1)
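It may not be obvious what a linear model with a bound set of columns on the left-hand side actually fits. The following sketch, using invented scores rather than the tutorial's data, suggests that an intercept-only model with a matrix response estimates one intercept per column, and that each intercept is simply that column's mean. The names interest10, interest15, interest20, bound, and model are hypothetical.

```r
# Hypothetical scores for four participants (invented for illustration).
interest10 <- c(1, 2, 1, 3)
interest15 <- c(2, 3, 3, 4)
interest20 <- c(4, 5, 4, 5)

# Bind the three measurement occasions into a 4 x 3 response matrix.
bound <- cbind(interest10, interest15, interest20)

# An intercept-only model with a matrix response fits one intercept per column.
model <- lm(bound ~ 1)

class(model)     # includes "mlm", the multivariate linear model class
coef(model)      # a 1 x 3 matrix of intercepts
colMeans(bound)  # the same three values: each intercept is a column mean
```

This is also why the data frame from Step 3 has one row per level: its rows describe, in order, the columns of the bound response matrix.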
Employing the Anova(mod, idata, idesign) Function
Typically, researchers will choose one of several techniques for analyzing repeated measures data, such as an epsilon-correction method, like Huynh-Feldt or Greenhouse-Geisser, or a multivariate method, like Wilks' Lambda or Hotelling's Trace. Conveniently, having already prepared our data, we can employ a single Anova(mod, idata, idesign) function from the car package to yield all of the relevant repeated measures results. Only this one function is required, regardless of the technique that we wish to employ. Thus, reading our outcomes becomes as simple as locating the desired method in the printed results.
Our Anova(mod, idata, idesign) function will be composed of three arguments. First, mod will contain our linear model from Step 5 in the preceding section. Second, idata will contain our data frame from Step 3. Third, idesign will contain our factor from Step 2, preceded by a tilde (~). Thus, our final function takes on the following form.
- > #load the car package (install first, if necessary)
- > library(car)
- > #compose the Anova(mod, idata, idesign) function
- > analysis <- Anova(ageModel, idata = ageFrame, idesign = ~ageFactor)
Visualizing the Results
Finally, we can use the summary(object) function to display the results of our repeated measures ANOVA.
- > #use summary(object) to display the results of the repeated measures ANOVA
- > summary(analysis)
Supposing that we are interested in the Wilks' Lambda method, we can see that the differences in the means for voting interest at ages 10, 15, and 20 are statistically significant (p < .001).
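As a cross-check on the univariate portion of the output, the same kind of design can also be fit with base R's aov() function and an Error() term. This is a different technique from the car-based approach used in this tutorial, it requires the stacked (long) layout, and it reports only the univariate F test that assumes sphericity, without epsilon corrections or multivariate statistics. The sketch below uses invented values, and the names wide, long, Subject, Age, and fit are hypothetical.

```r
# Hypothetical wide-format scores for six participants (invented for illustration).
wide <- data.frame(
  Interest10 = c(1, 2, 1, 3, 2, 1),
  Interest15 = c(2, 3, 3, 4, 2, 3),
  Interest20 = c(4, 5, 4, 5, 3, 4)
)
n <- nrow(wide)

# Stack into long format: one row per participant per age.
long <- data.frame(
  Subject = factor(rep(seq_len(n), times = 3)),
  Age = factor(rep(c(10, 15, 20), each = n)),
  Interest = c(wide$Interest10, wide$Interest15, wide$Interest20)
)

# Age is the within-subjects factor; Error(Subject/Age) tells aov() that the
# Age effect should be tested against the Subject-by-Age variation.
fit <- aov(Interest ~ Age + Error(Subject/Age), data = long)
summary(fit)
```

If the two approaches are applied to the same data, the F value for Age here should correspond to the uncorrected univariate test in the summary(analysis) output.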