By John M Quick

The R Tutorial Series provides a collection of user-friendly tutorials to people who want to learn how to use R for statistical analysis.


My Statistical Analysis with R book is available from Packt Publishing and Amazon.


R Tutorial Series: One-Way Repeated Measures ANOVA

Repeated measures data require a different analysis procedure than our typical one-way ANOVA and, consequently, a different process in R. This tutorial demonstrates how to conduct a one-way repeated measures ANOVA in R using the Anova(mod, idata, idesign) function from the car package.

Tutorial Files

Before we begin, you may want to download the sample data (.csv) used in this tutorial. Be sure to right-click and save the file to your R working directory. This dataset contains a hypothetical sample of 30 participants whose interest in voting was measured at three different ages (10, 15, and 20). The interest values are represented on a scale that ranges from 1 to 5 and indicate how interested each participant was in voting at each given age.

Data Setup

Notice that our data are arranged differently for a repeated measures ANOVA. In a typical one-way ANOVA, we would place all of the values of our dependent variable in a single column and identify the level of the independent variable for each value in a second column, as demonstrated in this sample one-way dataset. In a repeated measures ANOVA, we instead treat each level of our independent variable as if it were a separate variable, placing the levels side by side as columns. Hence, rather than having one vertical column for voting interest and a second column for age, we have three separate columns for voting interest, one for each age level.
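To make the two layouts concrete, here is a minimal sketch in base R. The four rows of values are made up for illustration and are not the tutorial dataset; the names wideData and longData are likewise hypothetical. Only the Interest10, Interest15, and Interest20 column names follow the tutorial.

```r
# Hypothetical wide-format data: one row per participant, one column per age level.
# These values are invented for illustration only.
wideData <- data.frame(
  Subject    = 1:4,
  Interest10 = c(1, 2, 1, 3),
  Interest15 = c(2, 3, 3, 4),
  Interest20 = c(4, 5, 4, 5)
)

# Reshape to the long format used by a typical one-way ANOVA:
# one column for voting interest and a second column identifying the age level.
longData <- reshape(
  wideData,
  direction = "long",
  varying   = c("Interest10", "Interest15", "Interest20"),
  v.names   = "Interest",
  timevar   = "Age",
  times     = c(10, 15, 20),
  idvar     = "Subject"
)
longData$Age <- factor(longData$Age)

# The same 12 measurements, arranged two ways
dim(wideData)
dim(longData)
```

The wide layout is the one that the remainder of this tutorial expects, while the long layout is the one used in the other ANOVA tutorials.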

Beginning Steps

To begin, we need to read our dataset into R and store its contents in a variable.
  > #read the dataset into an R variable using the read.csv(file) function
  > dataOneWayRepeatedMeasures <- read.csv("dataset_ANOVA_OneWayRepeatedMeasures.csv")
  > #display the data
  > #notice the atypical column arrangement for repeated measures data
  > dataOneWayRepeatedMeasures

The first ten rows of our dataset

Preparing the Repeated Measures Factor

Prior to executing our analysis, we must follow a small series of steps in order to prepare our repeated measures factor.

Step 1: Define the Levels

  > #use c() to create a vector containing the number of levels within the repeated measures factor
  > #create a vector numbering the levels for our three voting interest age groups
  > ageLevels <- c(1, 2, 3)

Step 2: Define the Factor

  > #use as.factor() to create a factor using the level vector from step 1
  > #convert the age levels into a factor
  > ageFactor <- as.factor(ageLevels)

Step 3: Define the Frame

  > #use data.frame() to create a data frame using the factor from step 2
  > #convert the age factor into a data frame
  > ageFrame <- data.frame(ageFactor)

Step 4: Bind the Columns

  > #use cbind() to bind the levels of the factor from the original dataset
  > #bind the age columns
  > ageBind <- cbind(dataOneWayRepeatedMeasures$Interest10, dataOneWayRepeatedMeasures$Interest15, dataOneWayRepeatedMeasures$Interest20)

Step 5: Define the Model

  > #use lm() to generate a linear model using the bound factor levels from step 4
  > #generate a linear model using the bound age levels
  > ageModel <- lm(ageBind ~ 1)
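It may help to see what the five steps produce. The sketch below repeats them on a simulated stand-in for the dataset (dataSketch is a hypothetical name and its values are randomly generated, not the tutorial data) and then inspects the resulting model. Because the response is a matrix and the formula contains only an intercept, the fitted coefficients are simply the mean of each age column.

```r
# Simulated stand-in for the tutorial dataset (hypothetical values on a 1-5 scale)
set.seed(1)
dataSketch <- data.frame(
  Interest10 = sample(1:3, 30, replace = TRUE),
  Interest15 = sample(2:4, 30, replace = TRUE),
  Interest20 = sample(3:5, 30, replace = TRUE)
)

# Steps 1-3: the levels, the factor, and the one-column data frame
ageLevels <- c(1, 2, 3)
ageFactor <- as.factor(ageLevels)
ageFrame  <- data.frame(ageFactor)

# Step 4: a 30 x 3 response matrix, one column per age level
ageBind <- cbind(dataSketch$Interest10, dataSketch$Interest15, dataSketch$Interest20)

# Step 5: an intercept-only multivariate linear model
ageModel <- lm(ageBind ~ 1)

class(ageModel)     # includes "mlm" because the response is a matrix
coef(ageModel)      # one intercept per column
colMeans(ageBind)   # the same values, computed directly
```

Note that the data frame from Step 3 has one row per column of the bound matrix, which is how the age factor gets matched to the three interest columns later on.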

Employing the Anova(mod, idata, idesign) Function

Typically, researchers will choose one of several techniques for analyzing repeated measures data, such as an epsilon-correction method, like Huynh-Feldt or Greenhouse-Geisser, or a multivariate method, like Wilks' Lambda or Hotelling's Trace. Conveniently, having already prepared our data, we can employ a single Anova(mod, idata, idesign) function from the car package to yield all of the relevant repeated measures results. Only this one function is required, regardless of the technique that we wish to employ, so finding our outcome is as simple as locating the desired method in the printed results.

Our Anova(mod, idata, idesign) function will be composed of three arguments. First, mod will contain our linear model from Step 5 in the preceding section. Second, idata will contain our data frame from Step 3. Third, idesign will contain our factor from Step 2, preceded by a tilde (~). Thus, our final function takes on the following form.
  > #load the car package (install first, if necessary)
  > library(car)
  > #compose the Anova(mod, idata, idesign) function
  > analysis <- Anova(ageModel, idata = ageFrame, idesign = ~ageFactor)

Visualizing the Results

Finally, we can use the summary(object) function to display the results of our repeated measures ANOVA.
  > #use summary(object) to display the results of the repeated measures ANOVA
  > summary(analysis)

Relevant segment of repeated measures ANOVA results

Supposing that we are interested in the Wilks' Lambda method, we can see that the differences in the means for voting interest at ages 10, 15, and 20 are statistically significant (p < .001).
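If only one multivariate statistic is of interest, a shorter table can be requested instead of the full summary. The following is a minimal sketch, assuming the car package is installed and using randomly generated stand-in values rather than the tutorial data. It passes test.statistic = "Wilks" to Anova() and prints the returned object directly; the name analysisWilks is hypothetical.

```r
# Simulated stand-in for the tutorial dataset (hypothetical values, not the real data)
set.seed(1)
ageBind <- cbind(
  sample(1:3, 30, replace = TRUE),   # stands in for Interest10
  sample(2:4, 30, replace = TRUE),   # stands in for Interest15
  sample(3:5, 30, replace = TRUE)    # stands in for Interest20
)
ageFactor <- as.factor(c(1, 2, 3))
ageFrame  <- data.frame(ageFactor)
ageModel  <- lm(ageBind ~ 1)

# Run the analysis only if the car package is available
if (requireNamespace("car", quietly = TRUE)) {
  # Request Wilks' Lambda as the multivariate statistic for the compact table
  analysisWilks <- car::Anova(ageModel, idata = ageFrame, idesign = ~ageFactor,
                              test.statistic = "Wilks")
  # Printing the object (rather than its summary) shows one row per term
  print(analysisWilks)
}
```

The full summary(analysis) output remains the place to look for the univariate tests and the sphericity corrections.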

Pairwise Comparisons

Note that we could conduct follow-up comparisons on our age factor to determine which age level means are significantly different from one another. For this purpose, it is recommended that the data be rearranged into the standard ANOVA format that we have used throughout our other tutorials. Subsequently, we could conduct pairwise comparisons in the same manner as demonstrated in the One-Way ANOVA with Comparisons tutorial.
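As a rough sketch of that rearrangement and follow-up, the code below stacks the three interest columns into the standard format and runs pairwise t-tests in base R. The data are simulated stand-ins (wideData and longData are hypothetical names), and two choices here are assumptions rather than part of the tutorial: paired = TRUE is used because the same participants are measured at every age, and a Bonferroni adjustment is used as one of several possible corrections.

```r
# Simulated stand-in for the tutorial dataset (hypothetical values)
set.seed(1)
wideData <- data.frame(
  Subject    = 1:30,
  Interest10 = sample(1:3, 30, replace = TRUE),
  Interest15 = sample(2:4, 30, replace = TRUE),
  Interest20 = sample(3:5, 30, replace = TRUE)
)

# Stack into the standard ANOVA format; rows stay in subject order within each age
longData <- data.frame(
  Subject  = rep(wideData$Subject, times = 3),
  Age      = factor(rep(c(10, 15, 20), each = nrow(wideData))),
  Interest = c(wideData$Interest10, wideData$Interest15, wideData$Interest20)
)

# Paired comparisons between every pair of age levels
comparisons <- pairwise.t.test(longData$Interest, longData$Age,
                               p.adjust.method = "bonferroni", paired = TRUE)

# Matrix of adjusted p-values, one cell per pair of ages
comparisons$p.value
```

The pairing relies on the rows being in the same subject order within each age level, so the data should be sorted by subject before running the comparisons.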

Complete One-Way Repeated Measures ANOVA Example

To see a complete example of how one-way repeated measures ANOVA can be conducted in R, please download the one-way repeated measures ANOVA example (.txt) file.

24 comments:

  1. Thank you for doing this tutorial series. I appreciate the time and effort you have spent on this project.

    I know that you have covered 2 way ANOVA. However, would you consider doing a tutorial on factorial ANOVA (possibly as a bridge to Design of Experiments)? That in itself is a big topic, but it would also be helpful to see how to handle center points (and possibly lack of fit).


    I encourage you to add to the series, whatever the topics. The practical and concise manner in which you provide the information is very helpful.

  2. Thanks for your comments and I am glad to help. It sounds like you might be interested in the Two-Way ANOVA with Interactions tutorial (available from the Topics menu under the ANOVA heading), which demonstrates the investigation of simple main effects subsequent to identifying an interaction between the main effects.

    I appreciate the topic suggestions. I tend to refrain from covering statistical methods/issues themselves, since I am not formally trained as a statistician. Instead, I focus on applications using R. I'm also limited to writing about the techniques that I have learned and practiced, though I do welcome guest posts.

  3. Your tutorial is excellent. I was able to follow it easily and quickly analyze a data set I've been working with for a long time. I tried applying the same steps to another data set but when I tried to use the Anova(mod, idata, idesign) function I got the following error message:

    Error in linearHypothesis.mlm(mod, hyp.matrix, SSPE = SSPE, idata = idata, :
    The error SSP matrix is apparently of deficient rank = 3 < 4

    Do you have any idea what this means or how to deal with it? Thanks a lot!

  4. Thanks for the comments. I am familiar with this error. In short, it has to do with a combination of a lack of degrees of freedom to execute the multivariate tests (i.e. small sample size compared to variables) and the inability of the Anova() function to ignore/forgo calculating the multivariate tests. See this R listserv discussion for details: http://r.789695.n4.nabble.com/Anova-in-car-SSPE-apparently-deficient-rank-tp997619p997619.html.

    An alternative, which will get you the Greenhouse-Geisser and Huynh-Feldt epsilon corrections, but no multivariate tests, is to use the anova() function. Setting X = ~1 tests the contrasts among the age columns.

    anova(ageModel, idata = ageFrame, X = ~1, test = "Spherical")

    One caveat, I believe, is that this will use Type I SS, whereas my Anova() example uses Type III SS. I'm not sure how to get Type III SS with the anova() function.

  5. I'm currently using R for my statistics project, I received an error message when I tried to insert the Anova(mod, idata, idesign) function. Is there any similar command that I can use in R? My project requires me to compute one-way repeated measures ANOVA and randomization/permutation test.

  6. Hi, I don't have enough details about your problem to help specifically, but the preceding comments on this post cover the extent of errors and potential solutions that we have come across to date.

  7. Hi, first of all, I'm not really sure if I should use One-Way Repeated Measures or Two-Way Repeated Measures. Basically, I'm interested in studying the effect of caffeine on sleep patterns/hours of sleep among students. My sample consists of 10 students. I would randomly divide them into 2 groups:
    Group1: One has their sleep times recorded after taking caffeinated drink
    Group2: One has their sleep times recorded after taking non-caffeinated drink.
    Then they are asked to keep a daily, 1-week diary of their sleep times. It means each student will have their sleep times recorded on a daily basis (Day1, Day2, etc.) for one week. This assignment requires me to use R.

  8. Hi. Your sample size is pretty small compared to the number of measurements you are taking, which may be what is causing the error. If you see the post immediately preceding your original question, there is an alternative function provided for conducting the analysis. For help with general statistics or your homework, I recommend consulting your instructor and fellow students.

  9. Your tutorial series are very helpful! I've tried using One-Way Repeated Measures ANOVA on a data set and it worked. I've also referred to the complete example right at the end of this topic. Do you know where I can get the full explanation for the R-code?

  10. Thanks. I do comment every line of code in the text files to give an idea of what is happening, but if you want the technical documentation for each function, you can use the materials listed under Documentation at http://www.r-project.org/. Usually, if you search Google for a particular R function, you will find its formal documentation early in the results.

  11. "Subsequently, we could conduct pairwise comparisons in the same manner as demonstrated in the One-Way ANOVA with Comparisons tutorial."

    This statement was a bit misleading for me. The pairwise test presented in the Comparisons tutorial does not perform a _paired_ t-test, which would be the relevant test for repeated measures data. The change required for the pairwise example to work with paired data is to add paired = TRUE as an additional parameter:
    pairwise.t.test(dataOneWayComparisons$StressReduction, dataOneWayComparisons$Treatment, p.adj = "none", paired = TRUE)

  12. Hi! Your tutorial is really helpful! But I still have some questions! I conducted an experiment in which 25 men and 25 women listened to an attractive conversation and picked a photo (between a woman with a red shirt and a woman with a green shirt). Next, they heard a neutral dialogue and did exactly the same, picking a photo between a woman in red and a woman in green. My hypothesis is that men are much more attracted to women in red than women are. I was thinking of using repeated measures ANOVA, as both men and women were 'examined' in the same experimental conditions. So, I reckon that my columns are: gender with 2 levels (0 for males and 1 for females), attraction, and non_attraction, and in their rows I'll put 0 for not red and 1 for red! But I really don't know how to apply what I'm thinking to ANOVA, or whether that's the right way to show that each participant did this twice (two dialogues and 4 photos in total for each of them). Can you help me?

  13. From what you described, it does sound like you have the right data setup. You could use a subject, attractive, and not attractive column. See the first image in this tutorial. Your setup would look very similar, but with different headings and values.

    From your hypothesis, it sounds like you also want to compare the results by gender. In that case, you could run two separate one-way repeated measures ANOVAs, one for males and one for females.

  14. Thank you for this tutorial. Could you perhaps explain the output a little bit more?

  15. Hi, thanks for the tutorial on one-way repeated measures ANOVA. I followed the steps exactly as you said, but R is showing me a warning message like this: "> analysis <- aov(ageModel, idata = ageFrame, idesign = ~ageFactor)

    Warning message:
    In lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
    extra arguments idataidesign are just disregarded." How can I solve this issue?

  16. Dear John M. Quick,
    Could you give me some advice on how to do the repeated measures ANOVA when the repeated factor is a quantitative factor?
    Thank you a lot.

    Replies
    1. Hi Gabriel,

      I'm not familiar with that variety of repeated measures ANOVA and recommend that you consult professional sources regarding the implementation of statistical methods.

      John

  17. What is the difference between your method and the method below, which is also shown for repeated measures in many R books? (In the case below, you have to reorganize your data to resemble the typical ANOVA format.)
    > test<-aov(Interest~Age+Error(Subject/Age),data=data)
    > summary(test)

    Error: Subject
    Df Sum Sq Mean Sq F value Pr(>F)
    Residuals 1 5.411 5.411

    Error: Subject:Age
    Df Sum Sq Mean Sq
    Age 2 68.23 34.12

    Error: Within
    Df Sum Sq Mean Sq F value Pr(>F)
    Age 2 24.65 12.323 18.59 2.06e-07 ***
    Residuals 84 55.67 0.663
    ---

    Replies
    1. I have seen this method used in other tutorials. The big difference that I notice is that the error term has to be manually specified, which introduces room for error and likely requires more expertise on the user's part. Also, I'm not certain that as rich an output can be easily achieved with this method (e.g. the multivariate, univariate, and sphericity tests are all included in the tutorial output). Otherwise, I do not see any reason why the results would differ for equivalent analyses.

  18. Hello John Quick,
    I hope you are still here. By following your steps I invariably get this error message after step 6 ('analysis'):
    "model has only an intercept; equivalent type-III tests substituted". The analysis works when I use for 'ageModel' the function lm(ageBind ~ Subject). ('Subject' from the attached dataset). But the summary is then different. Please indicate my error!
    Thank you
    Luis

    Replies
    1. Hi Luis. This is a note, not an error, and it also shows up every time I use the multivariate tests. I have not seen any reason that it cannot be ignored.

  19. Hi John,

    Thanks for the post. This is pretty informative. I have a question regarding a one-way repeated measures ANOVA where each replicate was measured 2 times. I ran your version of the R code on my data and came away with a very different (shorter) output. I've read that repeated measures should only be used for replicates measured 3 or more times, but I have also read that RM ANOVA can be used for replicates that have been measured 2 times (which is my case). Any idea how to model the latter?

    Thanks so much,
    C

  20. Thanks for the post, it was really helpful!
    One question: do you need more subjects than factors to do an RM ANOVA? I keep getting this error when I try to run the ANOVA with fewer subjects than factors.

    Error in eigen(qr.coef(SSPE.qr, x$SSPH), symmetric = FALSE) :
    infinite or missing values in 'x'

    I don't think that this is the case in SPSS

    Cheers,
    Claire

  21. Hi,

    Wonderful post here! Thanks so much for all of the tips and careful explanations.

    How can I tell if this model handles missing data points? If a subject is missing a measurement for one of the factors, is the subject's data excluded from the repeated measures analysis?

    Thanks,
    James
