Tutorial Files
Before we begin, you may want to download the sample data (.csv) used in this tutorial. Be sure to right-click and save the file to your R working directory. This dataset contains a hypothetical sample of 30 participants whose interest in voting was measured at three different ages (10, 15, and 20). The interest values are represented on a scale that ranges from 1 to 5 and indicate how interested each participant was in voting at each given age.
Data Setup
Notice that our data are arranged differently for a repeated measures ANOVA. In a typical one-way ANOVA, we would place all of the values of our independent variable in a single column and identify their respective levels with a second column, as demonstrated in this sample one-way dataset. In a repeated measures ANOVA, we instead treat each level of our independent variable as if it were a variable, thus placing them side by side as columns. Hence, rather than having one vertical column for voting interest, with a second column for age, we have three separate columns for voting interest, one for each age level.
Beginning Steps
To begin, we need to read our dataset into R and store its contents in a variable.
- > #read the dataset into an R variable using the read.csv(file) function
- > dataOneWayRepeatedMeasures <- read.csv("dataset_ANOVA_OneWayRepeatedMeasures.csv")
- > #display the data
- > #notice the atypical column arrangement for repeated measures data
- > dataOneWayRepeatedMeasures
The first ten rows of our dataset
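If the sample file is not at hand, the wide layout described above can be sketched with a small stand-in data frame. The column names Interest10, Interest15, and Interest20 match those used later in this tutorial; the Subject column and all of the values below are made up for illustration and are not the tutorial's data. The second half of the sketch shows the same values in the long, one-way-style layout for comparison.

```r
# Hypothetical stand-in for the tutorial dataset: 5 made-up participants,
# one voting-interest column per age level (wide, repeated measures layout)
wideData <- data.frame(
  Subject    = 1:5,
  Interest10 = c(1, 2, 1, 3, 2),
  Interest15 = c(2, 3, 2, 3, 4),
  Interest20 = c(4, 5, 3, 4, 5)
)

# The same values in the long layout of a typical one-way ANOVA:
# a single Interest column plus an Age column identifying the level
longData <- data.frame(
  Subject  = rep(wideData$Subject, times = 3),
  Age      = factor(rep(c(10, 15, 20), each = nrow(wideData))),
  Interest = c(wideData$Interest10, wideData$Interest15, wideData$Interest20)
)

# Wide: one row per participant; long: one row per participant per age
dim(wideData)
dim(longData)
```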
Preparing the Repeated Measures Factor
Prior to executing our analysis, we must follow a small series of steps in order to prepare our repeated measures factor.
Step 1: Define the Levels
- > #use c() to create a vector containing the number of levels within the repeated measures factor
- > #create a vector numbering the levels for our three voting interest age groups
- > ageLevels <- c(1, 2, 3)
Step 2: Define the Factor
- > #use as.factor() to create a factor using the level vector from step 1
- > #convert the age levels into a factor
- > ageFactor <- as.factor(ageLevels)
Step 3: Define the Frame
- > #use data.frame() to create a data frame using the factor from step 2
- > #convert the age factor into a data frame
- > ageFrame <- data.frame(ageFactor)
Step 4: Bind the Columns
- > #use cbind() to bind the levels of the factor from the original dataset
- > #bind the age columns
- > ageBind <- cbind(dataOneWayRepeatedMeasures$Interest10, dataOneWayRepeatedMeasures$Interest15, dataOneWayRepeatedMeasures$Interest20)
Step 5: Define the Model
- > #use lm() to generate a linear model using the bound factor levels from step 4
- > #generate a linear model using the bound age levels
- > ageModel <- lm(ageBind ~ 1)
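To see what Step 5 produces: because the formula contains only an intercept, lm() fits one mean per bound column, so the model's coefficients are simply the column means. The following is a minimal sketch with made-up values; exampleBind and exampleModel are illustrative names and are not part of the tutorial.

```r
# Made-up interest scores for 5 participants at three ages (illustration only)
exampleBind <- cbind(
  c(1, 2, 1, 3, 2),  # age 10
  c(2, 3, 2, 3, 4),  # age 15
  c(4, 5, 3, 4, 5)   # age 20
)

# Intercept-only multivariate linear model, as in Step 5
exampleModel <- lm(exampleBind ~ 1)

# The model has class "mlm" because the response is a matrix
class(exampleModel)

# The single row of coefficients matches the column means
coef(exampleModel)
colMeans(exampleBind)
```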
Employing the Anova(mod, idata, idesign) Function
Typically, researchers will choose one of several techniques for analyzing repeated measures data, such as an epsilon-correction method, like Huynh-Feldt or Greenhouse-Geisser, or a multivariate method, like Wilks' Lambda or Hotelling's Trace. Conveniently, having already prepared our data, we can employ a single Anova(mod, idata, idesign) function from the car package to yield all of the relevant repeated measures results. This allows us simplicity in that only a single function is required, regardless of the technique that we wish to employ. Thus, witnessing our outcomes becomes as simple as locating the desired method in the cleanly printed results.
Our Anova(mod, idata, idesign) function will be composed of three arguments. First, mod will contain our linear model from Step 5 in the preceding section. Second, idata will contain our data frame from Step 3. Third, idesign will contain our factor from Step 2, preceded by a tilde (~). Thus, our final function takes on the following form.
- > #load the car package (install first, if necessary)
- > library(car)
- > #compose the Anova(mod, idata, idesign) function
- > analysis <- Anova(ageModel, idata = ageFrame, idesign = ~ageFactor)
Visualizing the Results
Finally, we can use the summary(object) function to visualize the results of our repeated measures ANOVA.
- > #use summary(object) to visualize the results of the repeated measures ANOVA
- > summary(analysis)
Relevant segment of repeated measures ANOVA results
Supposing that we are interested in the Wilks' Lambda method, we can see that the differences in the means for voting interest at ages 10, 15, and 20 are statistically significant (p < .001).
Thank you for doing this tutorial series. I appreciate the time and effort you have spent on this project.
I know that you have covered 2 way ANOVA. However, would you consider doing a tutorial on factorial ANOVA (possibly as a bridge to Design of Experiments)? That in itself is a big topic, but it would also be helpful to see how to handle center points (and possibly lack of fit).
I encourage you to add to the series, whatever the topics. The practical and concise manner in which you provide the information is very helpful.
Thanks for your comments and I am glad to help. It sounds like you might be interested in the Two-Way ANOVA with Interactions tutorial (available from the Topics menu under the ANOVA heading), which demonstrates the investigation of simple main effects subsequent to identifying an interaction between the main effects.
I appreciate the topic suggestions. I tend to refrain from covering statistical methods/issues themselves, since I am not formally trained as a statistician. Instead, I focus on applications using R. I'm also limited to writing about the techniques that I have learned and practiced, though I do welcome guest posts.
Your tutorial is excellent. I was able to follow it easily and quickly analyze a data set I've been working with for a long time. I tried applying the same steps to another data set but when I tried to use the Anova(mod, idata, idesign) function I got the following error message:
Error in linearHypothesis.mlm(mod, hyp.matrix, SSPE = SSPE, idata = idata, :
The error SSP matrix is apparently of deficient rank = 3 < 4
Do you have any idea what this means or how to deal with it? Thanks a lot!
Thanks for the comments. I am familiar with this error. In short, it has to do with a combination of a lack of degrees of freedom to execute the multivariate tests (i.e. small sample size compared to variables) and the inability of the Anova() function to ignore/forgo calculating the multivariate tests. See this R listserv discussion for details: http://r.789695.n4.nabble.com/Anova-in-car-SSPE-apparently-deficient-rank-tp997619p997619.html.
An alternative, which will get you the Greenhouse-Geisser and Huynh-Feldt epsilon corrections, but no multivariate tests, is to use the anova() function.
anova(ageModel, idata = ageFrame, X = ~ageFactor, test = "Spherical")
One caveat, I believe, is that this will use Type I SS, whereas my Anova() example uses Type III SS. I'm not sure how to get Type III SS with the anova() function.
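For reference, the pattern shown in the stats documentation for anova.mlm tests the within-subject contrasts by comparing the intercept-only model against an empty model, with X = ~1 removing the overall mean. The sketch below applies that documented pattern to made-up data. The object names and values are assumptions for illustration, and the sketch has not been checked against the tutorial's dataset.

```r
# Made-up scores: 6 participants measured at three ages (illustration only)
scores <- cbind(
  c(1, 2, 1, 3, 2, 2),
  c(2, 3, 2, 3, 4, 3),
  c(4, 5, 3, 4, 5, 3)
)

fitMean  <- lm(scores ~ 1)        # intercept-only multivariate model
fitEmpty <- update(fitMean, ~ 0)  # same response, no intercept

# Mauchly's test of sphericity for the within-subject contrasts
mauchly.test(fitMean, X = ~1)

# Univariate test assuming sphericity; the printed table also reports
# Greenhouse-Geisser and Huynh-Feldt adjusted p-values
sphericalTest <- anova(fitMean, fitEmpty, X = ~1, test = "Spherical")
sphericalTest
```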
I'm currently using R for my statistics project, and I received an error message when I tried to use the Anova(mod, idata, idesign) function. Is there any similar command that I can use in R? My project requires me to compute a one-way repeated measures ANOVA and a randomization/permutation test.
Hi, I don't have enough details about your problem to help specifically, but the preceding comments on this post cover the extent of errors and potential solutions that we have come across to date.
Hi, first of all, I'm not really sure if I should use One-Way Repeated Measures or Two-Way Repeated Measures. Basically, I'm interested in studying the effect of caffeine on sleep patterns/hours of sleep among students. My sample consists of 10 students. I would randomly divide them into 2 groups:
Group1: One has their sleep times recorded after taking caffeinated drink
Group2: One has their sleep times recorded after taking non-caffeinated drink.
Then they are asked to keep a daily, 1-week diary of their sleep times. This means each student will have their sleep times recorded on a daily basis (Day1, Day2, etc.) for one week. This assignment requires me to use R.
Hi. Your sample size is pretty small compared to the number of measurements you are taking, which may be what is causing the error. If you see the post immediately preceding your original question, there is an alternative function provided for conducting the analysis. For help with general statistics or your homework, I recommend consulting your instructor and fellow students.
Your tutorial series are very helpful! I've tried using One-Way Repeated Measures ANOVA on a data set and it worked. I've also referred to the complete example right at the end of this topic. Do you know where I can get the full explanation for the R-code?
Thanks. I do comment every line of code in the text files to give an idea of what is happening, but if you want the technical documentation for each function, you can use the materials listed under Documentation at http://www.r-project.org/. Usually, if you search Google for a particular R function, you will find its formal documentation early in the results.
"Subsequently, we could conduct pairwise comparisons in the same manner as demonstrated in the One-Way ANOVA with Comparisons tutorial."
This statement was a bit misleading for me. The pairwise test presented in the Comparisons tutorial does not perform a _paired_ t-test, which is what would be relevant for repeated measures data. The change required for the pairwise example to work with paired data is to add paired = TRUE as an additional parameter:
pairwise.t.test(dataOneWayComparisons$StressReduction, dataOneWayComparisons$Treatment, p.adj = "none", paired = TRUE)
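Applied to wide repeated measures data like this tutorial's, the paired comparison can be sketched as follows. The data values are made up, and the Bonferroni adjustment is an illustrative choice rather than something the comment specifies. With paired = TRUE, observations are matched by position within each group, so every group must list the participants in the same order.

```r
# Made-up interest scores for 6 participants at three ages (illustration only)
interest10 <- c(1, 2, 1, 3, 2, 2)
interest15 <- c(2, 3, 2, 3, 4, 3)
interest20 <- c(4, 5, 3, 4, 5, 3)

# Stack the scores and label each with its age level; participants appear
# in the same order within every age group, which pairing relies on
interest <- c(interest10, interest15, interest20)
age <- factor(rep(c("10", "15", "20"), each = length(interest10)))

# Paired t-tests for every pair of ages, Bonferroni-adjusted
pairedTests <- pairwise.t.test(interest, age,
                               p.adjust.method = "bonferroni",
                               paired = TRUE)
pairedTests
```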
Hi! Your tutorial is really helpful! But I still have some questions! I conducted an experiment in which 25 men and 25 women listened to an attractive conversation and picked a photo (between a woman in a red shirt and a woman in a green shirt). Next, they heard a neutral dialogue and did exactly the same: picked a photo between a woman in red and a woman in green. My hypothesis is that men are much more attracted to women in red than women are. I was thinking of using repeated measures ANOVA, as both men and women were 'examined' in the same experimental conditions. So, I reckon that my columns are: gender with 2 levels (0 for males and 1 for females), attraction, and non_attraction, and I'll put 0 for not red and 1 for red in their rows! But I really don't know how to apply what I'm thinking to ANOVA, or whether that's the right way to show that each participant did this twice (two dialogues and 4 photos in total for each of them). Can you help me?
From what you described, it does sound like you have the right data setup. You could use a subject, attractive, and not attractive column. See the first image in this tutorial. Your setup would look very similar, but with different headings and values.
From your hypothesis, it sounds like you also want to compare the results by gender. In that case, you could run two separate one-way repeated measures ANOVAs, one for males and one for females.
Thank you for this tutorial. Could you perhaps just explain the output a little bit more?
Hi, thanks for the tutorial on one-way repeated measures ANOVA. I followed the steps exactly as you said, but R is showing me a warning message like this:
> analysis <- aov(ageModel, idata = ageFrame, idesign = ~ageFactor)
Warning message:
In lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
extra arguments idataidesign are just disregarded.
How can I solve this issue?
Dear John M. Quick,
Could you give me a piece of advice on how to do the repeated measures ANOVA when the repeated factor is a quantitative factor?
Thank you a lot.
Hi Gabriel,
I'm not familiar with that variety of repeated measures ANOVA and recommend that you consult professional sources regarding the implementation of statistical methods.
John
What is the difference between your method and the method below, which is also shown for repeated measures in many R books? (In the case below, you have to reorganize your data in the same way as for a typical ANOVA.)
> test<-aov(Interest~Age+Error(Subject/Age),data=data)
> summary(test)
Error: Subject
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 1 5.411 5.411
Error: Subject:Age
Df Sum Sq Mean Sq
Age 2 68.23 34.12
Error: Within
Df Sum Sq Mean Sq F value Pr(>F)
Age 2 24.65 12.323 18.59 2.06e-07 ***
Residuals 84 55.67 0.663
---
I have seen this method used in other tutorials. The big difference that I notice is that the error term has to be manually specified, which introduces room for error and likely requires more expertise on the user's part. Also, I'm not certain that as rich of an output can be easily achieved with this method (e.g. the multivariate, univariate, and sphericity tests are all included in the tutorial output). Otherwise, I do not see any reasons why the results would differ for equivalent analyses.
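A minimal sketch of the long-format approach on made-up data, for comparison. One assumption worth stating: aov() needs Subject and Age to be factors for the error strata to be built as intended. With n participants, the Subject stratum should show n - 1 residual degrees of freedom, so the single degree of freedom in the output above may indicate that Subject was read as a number rather than a factor.

```r
# Made-up long-format data: 6 participants, each measured at three ages
longData <- data.frame(
  Subject  = factor(rep(1:6, times = 3)),
  Age      = factor(rep(c(10, 15, 20), each = 6)),
  Interest = c(1, 2, 1, 3, 2, 2,
               2, 3, 2, 3, 4, 3,
               4, 5, 3, 4, 5, 3)
)

# Repeated measures ANOVA with the error term specified by hand:
# Age is tested against the Subject:Age variation
aovModel <- aov(Interest ~ Age + Error(Subject/Age), data = longData)
summary(aovModel)
```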
Hello John Quick,
I hope you are still here. By following your steps I invariably get this error message after step 6 ('analysis'):
"model has only an intercept; equivalent type-III tests substituted". The analysis works when I use for 'ageModel' the function lm(ageBind ~ Subject). ('Subject' from the attached dataset). But the summary is then different. Please indicate my error!
Thank you
Luis
Hi Luis. This is a note, not an error, and it also shows up every time I use the multivariate tests. I have not seen any reason that it cannot be ignored.
Hi John,
Thanks for the post. This is pretty informative. I have a question regarding a one-way repeated measures ANOVA (where the replicate was measured 2 times). I ran your version of the R code on my data and came away with a very different (shorter) output. I've read that repeated measures should only be used for replicates measured 3 or more times, but I have also read that RMANOVA can be used for replicates that have been measured 2 times (which is my case). Any idea on how to model the latter?
Thanks so much,
C
Thanks for the post, it was really helpful!
One question: do you need more subjects than factors to do a RM ANOVA? I keep getting this error when I try to run the ANOVA with fewer subjects than factors.
Error in eigen(qr.coef(SSPE.qr, x$SSPH), symmetric = FALSE) :
infinite or missing values in 'x'
I don't think that this is the case in SPSS
Cheers,
Claire
Hi,
Wonderful post here! Thanks so much for all of the tips and careful explanations.
How can I tell if this model handles missing data points? If a subject is missing a measurement for one of the factors is the subject's data excluded from the repeated-measures analysis?
Thanks,
James
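On the missing data question, one thing that can be checked directly is how the lm() step treats incomplete rows. Under R's default na.action setting (na.omit), a participant with a missing value in any bound column is dropped from the fitted model entirely, so an analysis built on that model would use complete cases only. The following is a sketch with made-up values, not the tutorial's data.

```r
# Made-up scores for 5 participants; participant 2 is missing the age 15 value
scoresWithNA <- cbind(
  c(1, 2, 1, 3, 2),
  c(2, NA, 2, 3, 4),
  c(4, 5, 3, 4, 5)
)

# Intercept-only multivariate model, as in Step 5 of the tutorial
modelWithNA <- lm(scoresWithNA ~ 1)

# Rows actually used by the model versus rows supplied
nrow(residuals(modelWithNA))
nrow(scoresWithNA)

# The number of complete cases equals the number of rows used
sum(complete.cases(scoresWithNA))
```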