Before we begin, you may want to download the sample data (.csv) used in this tutorial. Be sure to right-click and save the file to your R working directory. This dataset contains the following variables related to ice cream consumption.
- DATE: Time period (1-30)
- CONSUME: Ice cream consumption in pints per capita
- PRICE: Per pint price of ice cream in dollars
- INC: Weekly family income in dollars
- TEMP: Mean temperature in degrees F
Planning The Model
Suppose that our research question is "how much of the variance in ice cream consumption can be predicted by per pint price, weekly family income, mean temperature, and the interaction between per pint price and weekly family income?" The interaction term, the last predictor in that list, is the new addition to our typical multiple regression modeling procedure. This variable is relatively simple to incorporate, but it does require a few preparations.
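Written as an equation, the model described in the research question might look like the sketch below. The coefficient symbols $\beta_0, \dots, \beta_4$ and the error term $\varepsilon$ are notation introduced here for illustration, and the bars denote sample means (the centering they imply is covered in the next section).

$$
\text{CONSUME} = \beta_0 + \beta_1\,\text{PRICE} + \beta_2\,\text{INC} + \beta_3\,\text{TEMP} + \beta_4\,(\text{PRICE} - \overline{\text{PRICE}})(\text{INC} - \overline{\text{INC}}) + \varepsilon
$$

Under this form, the change in predicted consumption per one-dollar change in PRICE is $\beta_1 + \beta_4\,(\text{INC} - \overline{\text{INC}})$, so it depends on income. That dependence is what the interaction term captures.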
Creating The Interaction Variable
A two-step process can be followed to create an interaction variable in R. First, the input variables are centered to mitigate multicollinearity between the interaction term and its component predictors. Second, the centered variables are multiplied to create the interaction variable.
Step 1: Centering
To center a variable, simply subtract its mean from each data point and save the result into a new R variable, as demonstrated below.
- > #center the input variables (assumes the columns of datavar are accessible by name, e.g. via attach(datavar))
- > PRICEc <- PRICE - mean(PRICE)
- > INCc <- INC - mean(INC)
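As a quick sanity check on the centering step, the sketch below centers a small vector two ways and confirms that the result has a mean of (numerically) zero. The values in `price` are made up for illustration and are not taken from the tutorial dataset; `scale()` is base R and is shown only as one possible alternative to manual subtraction.

```r
# Hypothetical toy values standing in for the PRICE column (not the tutorial data)
price <- c(0.27, 0.28, 0.26, 0.29, 0.30)

# Manual centering, as in the tutorial
price_c <- price - mean(price)

# Alternative: scale() with scale = FALSE centers without standardizing.
# It returns a one-column matrix, so as.numeric() flattens it to a plain vector.
price_c2 <- as.numeric(scale(price, center = TRUE, scale = FALSE))

# The centered variable should have a mean of (numerically) zero
print(mean(price_c))

# Both approaches should agree
print(all.equal(price_c, price_c2))
```

Manual subtraction keeps the result as an ordinary numeric vector, which is why the tutorial's approach is convenient when the centered variable is reused in later arithmetic.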
Step 2: Multiplication
Once the input variables have been centered, the interaction term can be created. Since an interaction is formed by the product of two or more predictors, we can simply multiply our centered terms from step one and save the result into a new R variable, as demonstrated below.
- > #create the interaction variable
- > PRICEINCi <- PRICEc * INCc
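To see why centering comes before multiplication, the sketch below compares how strongly a predictor correlates with the product term when the product is built from raw values versus centered values. The numbers in `price` and `inc` are hypothetical toy values chosen for illustration, not the tutorial data, so the exact correlations will differ on the real dataset.

```r
# Hypothetical toy values (not the tutorial data)
price <- c(10, 11, 12, 13, 14)
inc   <- c(52, 50, 54, 51, 53)

# Product of the raw (uncentered) predictors
raw_product <- price * inc

# Product of the centered predictors, following the two steps above
price_c <- price - mean(price)
inc_c   <- inc - mean(inc)
centered_product <- price_c * inc_c

# Correlation of one predictor with each version of the product term
cor_raw      <- cor(price, raw_product)
cor_centered <- cor(price, centered_product)
print(c(raw = cor_raw, centered = cor_centered))
```

For these toy values, the centered product is far less correlated with `price` than the raw product is, which is the multicollinearity reduction that the centering step is meant to provide.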
Creating The Model
Now we have all of the pieces necessary to assemble our complete interaction model. The commands below fit the model and display its summary.
- > #create the interaction model using lm(FORMULA, DATAVAR)
- > #predict ice cream consumption by its per pint price, weekly family income, mean temperature, and the interaction between per pint price and weekly family income
- > interactionModel <- lm(CONSUME ~ PRICE + INC + TEMP + PRICEINCi, datavar)
- > #display summary information about the model
- > summary(interactionModel)
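As an aside, R's formula syntax can also build the product term itself. The sketch below fits the tutorial-style model with a hand-built interaction column and then an alternative that uses the `*` formula operator on the centered variables. The data frame `toy` and the names `manualModel` and `formulaModel` are hypothetical; the values are randomly generated stand-ins, not the ice cream dataset, so the coefficients carry no substantive meaning.

```r
set.seed(1)  # make the made-up data reproducible
n <- 30

# Hypothetical stand-in for datavar; values are random, not the tutorial data
toy <- data.frame(
  PRICE = runif(n, 0.25, 0.30),
  INC   = runif(n, 75, 95),
  TEMP  = runif(n, 25, 70)
)
toy$CONSUME <- 0.2 + 0.003 * toy$TEMP + rnorm(n, sd = 0.03)

# Center and multiply, as in the tutorial
toy$PRICEc    <- toy$PRICE - mean(toy$PRICE)
toy$INCc      <- toy$INC - mean(toy$INC)
toy$PRICEINCi <- toy$PRICEc * toy$INCc

# Tutorial-style model: hand-built interaction column
manualModel <- lm(CONSUME ~ PRICE + INC + TEMP + PRICEINCi, data = toy)

# Formula-style model: PRICEc * INCc expands to PRICEc + INCc + PRICEc:INCc
formulaModel <- lm(CONSUME ~ PRICEc * INCc + TEMP, data = toy)

# Interaction coefficient from each fit
print(coef(manualModel)["PRICEINCi"])
print(coef(formulaModel)["PRICEc:INCc"])
```

In this sketch the two fits should give the same fitted values and the same interaction coefficient. Only the intercept differs, because the second model uses the centered main effects rather than the raw ones.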