Tutorial Files
Before we begin, you may want to download the sample data (.csv) used in this tutorial. Be sure to right-click and save the file to your R working directory. This dataset contains the following variables related to ice cream consumption:
- DATE: Time period (1-30)
- CONSUME: Ice cream consumption in pints per capita
- PRICE: Per pint price of ice cream in dollars
- INC: Weekly family income in dollars
- TEMP: Mean temperature in degrees F
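The commands later in this tutorial refer to the columns by bare name (PRICE, INC, and so on) and to a data frame called datavar, so the file has to be read in first. The following is a minimal sketch of one way to do that. The rows are made-up placeholder values written to a temporary file so that the example runs on its own; in practice you would pass the name of the file you saved to read.csv(). Using attach() is an assumption about how the bare column names are made available.

```r
# Made-up placeholder rows, written to a temporary file so this sketch is
# self-contained. Replace csv_path with the path to the file you downloaded.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "DATE,CONSUME,PRICE,INC,TEMP",
  "1,0.40,0.27,80,40",
  "2,0.38,0.28,82,55",
  "3,0.41,0.29,84,65"
), csv_path)

# read the dataset into a data frame
datavar <- read.csv(csv_path)

# inspect the column names and types
str(datavar)

# attach() lets you refer to columns by bare name (PRICE, INC, ...)
attach(datavar)
mean(PRICE)
detach(datavar)
```

If you prefer not to attach the data frame, the same columns can be reached as datavar$PRICE, datavar$INC, and so on.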
Planning The Model
Suppose that our research question is "how much of the variance in ice cream consumption can be predicted by per pint price, weekly family income, mean temperature, and the interaction between per pint price and weekly family income?" The interaction term is the new addition to our typical multiple regression modeling procedure. This variable is relatively simple to incorporate, but it does require a few preparations.
Creating The Interaction Variable
A two-step process can be followed to create an interaction variable in R. First, the input variables must be centered to mitigate multicollinearity. Second, these variables must be multiplied to create the interaction variable.
Step 1: Centering
To center a variable, simply subtract its mean from each data point and save the result into a new R variable, as demonstrated below.
- > #center the input variables
- > PRICEc <- PRICE - mean(PRICE)
- > INCc <- INC - mean(INC)
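As a quick sanity check on the centering step, the sketch below uses a handful of illustrative values (not the tutorial's dataset) to confirm that a centered variable has a mean of zero and the same spread as the original. It also shows scale() with scale = FALSE as an alternative way to center; the name PRICEc_alt is introduced here only for the comparison.

```r
# Illustrative values only, not the tutorial's data
PRICE <- c(0.26, 0.27, 0.28, 0.29, 0.30)

# center by subtracting the mean, as in the tutorial
PRICEc <- PRICE - mean(PRICE)

# alternative: scale() with scale = FALSE only subtracts the mean;
# as.numeric() drops the matrix attributes that scale() returns
PRICEc_alt <- as.numeric(scale(PRICE, center = TRUE, scale = FALSE))

# the centered mean is zero up to floating-point rounding
mean(PRICEc)

# both approaches give the same values
all.equal(PRICEc, PRICEc_alt)

# centering shifts the variable but does not change its spread
all.equal(sd(PRICEc), sd(PRICE))
```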
Step 2: Multiplication
Once the input variables have been centered, the interaction term can be created. Since an interaction is formed by the product of two or more predictors, we can simply multiply our centered terms from step one and save the result into a new R variable, as demonstrated below.
- > #create the interaction variable
- > PRICEINCi <- PRICEc * INCc
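To see why centering comes before multiplication, it can help to compare the correlation between a predictor and the product term with and without centering. The sketch below does this on a small, balanced grid of made-up price and income values. The grid is chosen so that the centered case comes out essentially uncorrelated; with real data the reduction is usually partial rather than complete.

```r
# Made-up, balanced grid of price and income values (illustration only)
grid <- expand.grid(PRICE = c(0.26, 0.28, 0.30), INC = c(80, 85, 90))

# product of the raw (uncentered) predictors
rawProduct <- grid$PRICE * grid$INC

# product of the centered predictors, as in the tutorial
PRICEc <- grid$PRICE - mean(grid$PRICE)
INCc <- grid$INC - mean(grid$INC)
PRICEINCi <- PRICEc * INCc

# correlation between price and each version of the product term
corRaw <- cor(grid$PRICE, rawProduct)
corCentered <- cor(PRICEc, PRICEINCi)

corRaw       # substantial: the raw product largely tracks PRICE
corCentered  # essentially zero for this balanced grid
```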
Creating The Model
Now we have all of the pieces necessary to assemble our complete interaction model. The commands below create the model and display its summary.
- > #create the interaction model using lm(FORMULA, DATAVAR)
- > #predict ice cream consumption by its per pint price, weekly family income, mean temperature, and the interaction between per pint price and weekly family income
- > interactionModel <- lm(CONSUME ~ PRICE + INC + TEMP + PRICEINCi, datavar)
- > #display summary information about the model
- > summary(interactionModel)
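R's formula syntax can also build the product term for you: in a formula, PRICEc * INCc expands to both main effects plus their interaction (PRICEc:INCc). The sketch below, run on simulated data rather than the tutorial's dataset, suggests that this shorthand and the hand-built PRICEINCi variable yield the same interaction coefficient and the same R-squared. The simulated values and the names manualModel and formulaModel are assumptions made for illustration.

```r
# Simulated stand-in data (NOT the tutorial's dataset)
set.seed(1)
n <- 30
datavar <- data.frame(
  PRICE = runif(n, 0.26, 0.30),
  INC = runif(n, 75, 95),
  TEMP = runif(n, 25, 75)
)
datavar$CONSUME <- 0.2 + 0.003 * datavar$TEMP + rnorm(n, sd = 0.03)

# center the inputs and build the interaction variable by hand
datavar$PRICEc <- datavar$PRICE - mean(datavar$PRICE)
datavar$INCc <- datavar$INC - mean(datavar$INC)
datavar$PRICEINCi <- datavar$PRICEc * datavar$INCc

# the tutorial's form: hand-built interaction variable
manualModel <- lm(CONSUME ~ PRICE + INC + TEMP + PRICEINCi, data = datavar)

# formula shorthand: main effects plus interaction of the centered inputs
formulaModel <- lm(CONSUME ~ PRICEc * INCc + TEMP, data = datavar)

# compare the interaction coefficients and the overall fit
all.equal(unname(coef(manualModel)["PRICEINCi"]),
          unname(coef(formulaModel)["PRICEc:INCc"]))
all.equal(summary(manualModel)$r.squared,
          summary(formulaModel)$r.squared)
```

Only the intercept differs between the two fits, because the main effects in the second model are centered.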
Can you show an example of creating an interaction term with categorical variables, please?
Hi,
Did you find an example of an interaction variable that involves categorical variables? I wanted to know that as well.
Use the interaction function (base). Let a and b be categorical variables; then it would be:
interaction(a, b, sep = ":")
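To make the suggestion above concrete, here is a small sketch with two made-up factors. It assumes a and b are factors of equal length; the level names and the variable ab are purely illustrative.

```r
# Two made-up categorical variables (illustration only)
a <- factor(c("low", "high", "low", "high"))
b <- factor(c("x", "x", "y", "y"))

# combine them into a single factor whose levels are the a:b combinations
ab <- interaction(a, b, sep = ":")

# one combined label per observation
as.character(ab)

# every combination of the two factors' levels
levels(ab)
```

The combined factor can then be used as a predictor like any other factor. Alternatively, writing a * b in an lm() formula should let R build the main effects and the interaction for you.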