I am trying to make an interaction plot for this set of data, but I want to make it with ggplot2. I am trying to predict VP using the predictors G and P (columns in the dataset that I have found to interact with each other and to have a significant impact on VP).

The syntax for ggplot2 has been difficult for me to understand, so I am wondering if anyone here finds it easy and is able to create one with my data and show me the code. The last time I tried to teach myself how to make plots with user-defined functions I spent 6 hours debugging. I am hoping someone can simply walk me through the steps so I don't have to go through that again. The model would be `VP~G+P+G*P` in R.

I am hoping to make something like the interaction boxplot found in the answer of this post.

Notes:

When creating my regression model I had several variables, but I found that G and P had the only significant interaction. So I am trying to create an interaction plot to dissect the data further. Is this a sound and logical approach? I would also welcome opinions on the extremely poor fit of the interaction plot.

What should I do in this case of a poor fit? Is it safer to say that there is no pattern?

For reference, I am trying to predict the vote percentage a candidate gets, VP, based on varying the inflation rate, P, and varying the rate of growth, G. Here is my interaction plot:

Note2:

I made my interaction plot with the user-defined function found here. The plot they made fit their data well, but my plot doesn't fit my data well. In addition, my plot of the residuals looks almost identical to the interaction plot. On the MIT page, the residual plot and the interaction plot look very different, and a pattern is easy to see in them.


#### Best Answer

The original suggestion for displaying an interaction via box-plot does not quite make sense in this instance, since both of the variables that define the interaction are continuous. You could dichotomize either `G` or `P`, but you do not have much data to work with. Because of this, I would suggest coplots. A description of what they are can be found in:

- Cleveland, William. 1994. Coplots, nonparametric regression, and conditionally parametric fits. IMS Lecture Notes Monograph Series 24: 21-36. PDF available in link from Project Euclid.

Below is a coplot of the `election2012` data generated by the code `coplot(VP ~ P | G, data = election2012)`. It assesses the effect of `P` on `VP` conditional on varying values of `G`.
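As a rough numerical companion to the coplot, the sketch below fits a separate `VP ~ P` slope within overlapping intervals of `G`, built with `co.intervals()`, the same helper `coplot()` uses by default for its conditioning panels. The `toy` data frame is simulated and only stands in for `election2012`; its coefficients and the choice of four intervals are assumptions made for illustration.

```r
# Simulated stand-in for election2012 (NOT the real data).
set.seed(1)
n <- 40
toy <- data.frame(G = rnorm(n, mean = 0, sd = 5),
                  P = runif(n, min = 0, max = 8))
toy$VP <- 50 + 0.6 * toy$G - 0.5 * toy$P - 0.15 * toy$G * toy$P +
  rnorm(n, sd = 3)

# Overlapping conditioning intervals on G, as coplot() builds them.
ints <- co.intervals(toy$G, number = 4, overlap = 0.5)

# Slope of VP on P within each interval of G.
slopes <- apply(ints, 1, function(r) {
  sub <- toy[toy$G >= r[1] & toy$G <= r[2], ]
  coef(lm(VP ~ P, data = sub))[["P"]]
})

# One row per panel: lower bound, upper bound, fitted slope.
print(cbind(G_low = ints[, 1], G_high = ints[, 2], slope_P = slopes))
```

If the slopes change sign or size systematically as the `G` interval moves, that is the pattern an interaction term would be trying to capture. With few observations per interval, each slope is noisy.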

Although your description makes it sound like this is a fishing expedition, we may entertain the possibility that an interaction between these two variables exists. The coplot seems to show that for lower values of `G` the effect of `P` is positive, and for higher values of `G` the effect of `P` is negative. After assessing marginal histograms and bivariate scatterplots of `VP`, `P`, `G` and the interaction between `P` and `G`, it seemed to me that 1932 was likely a high leverage value for the interaction effect.
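One way to back up a visual impression of high leverage is to look at hat values and Cook's distances from the fitted interaction model. The sketch below uses simulated data with one deliberately extreme observation planted in row 14; the data, the planted values, and the `2 * p / n` cutoff (a common rule of thumb, not a formal test) are assumptions for illustration, not results from `election2012`.

```r
# Simulated stand-in data with one planted extreme row (NOT the real data).
set.seed(2)
n <- 24
toy <- data.frame(G = rnorm(n, mean = 0, sd = 3),
                  P = runif(n, min = 0, max = 6))
toy$G[14] <- -15   # hypothetical extreme growth value
toy$P[14] <- 10    # hypothetical extreme inflation value
toy$VP <- 50 + 0.7 * toy$G - 0.6 * toy$P + rnorm(n, sd = 3)

# Mean-center the predictors, as in the answer's code.
toy$g_cent <- toy$G - mean(toy$G)
toy$p_cent <- toy$P - mean(toy$P)

# g_cent * p_cent expands to both main effects plus their product.
fit <- lm(VP ~ g_cent * p_cent, data = toy)

lev  <- hatvalues(fit)        # leverage of each observation
cook <- cooks.distance(fit)   # influence on the fitted coefficients

p <- length(coef(fit))
flagged <- which(lev > 2 * p / n)   # rule-of-thumb cutoff

print(round(cbind(leverage = lev, cooks_d = cook)[flagged, , drop = FALSE], 3))
```

An observation with high leverage is not necessarily wrong, but it is worth refitting without it to see how much the interaction estimate depends on that single point.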

Below are four scatterplots, showing the marginal relationships between `VP` and the mean-centered `P`, `G` and the interaction of `P` and `G` (what I named `int_gpcent`). I have highlighted 1932 as a red dot. The last plot on the lower right is the residuals of the linear model `lm(VP ~ g_cent + p_cent, data = election2012)` against `int_gpcent`.
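A note on the mean-centering used here: centering the predictors changes the intercept and the main-effect coefficients, but it leaves the interaction coefficient and the fitted values unchanged. The sketch below checks this on simulated data; the `toy` data frame and its coefficients are assumptions for illustration only.

```r
# Simulated stand-in data (NOT the real election data).
set.seed(4)
n <- 30
toy <- data.frame(G = rnorm(n, mean = 2, sd = 4),
                  P = runif(n, min = 0, max = 8))
toy$VP <- 50 + 0.6 * toy$G - 0.4 * toy$P - 0.1 * toy$G * toy$P +
  rnorm(n, sd = 2)

toy$g_cent <- toy$G - mean(toy$G)
toy$p_cent <- toy$P - mean(toy$P)

raw <- lm(VP ~ G * P, data = toy)            # uncentered predictors
cen <- lm(VP ~ g_cent * p_cent, data = toy)  # mean-centered predictors

# The interaction coefficient is the same in both fits.
print(c(raw = coef(raw)[["G:P"]], centered = coef(cen)[["g_cent:p_cent"]]))

# The main effects differ: in the centered fit each one is the slope
# at the mean of the other predictor.
print(rbind(raw = coef(raw)[2:3], centered = coef(cen)[2:3]))
```

Centering is therefore mostly a matter of interpretation and of reducing the correlation between the product term and the main effects.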

Below I provide code showing that when 1932 is removed from the linear model `lm(VP ~ g_cent + p_cent + int_gpcent, data = election2012)`, the interaction of `G` and `P` fails to reach statistical significance. Of course this is all just exploratory (one would also want to assess whether any temporal correlation occurs in the series), but hopefully this is a good start. Save ggplot for when you have a better idea of what exactly you want to plot!

```r
# data and directory stuff
mydir <- "C:\\Documents and Settings\\andrew.wheeler\\Desktop\\R_interaction"
setwd(mydir)
election2012 <- read.table("election2012.txt", header = TRUE, quote = "\"")

# making interaction variable (product of the mean-centered predictors)
election2012$g_cent <- election2012$G - mean(election2012$G)
election2012$p_cent <- election2012$P - mean(election2012$P)
election2012$int_gpcent <- election2012$g_cent * election2012$p_cent

summary(election2012)
View(election2012)

# marginal histograms
par(mfrow = c(2, 2))
hist(election2012$VP)
hist(election2012$G)
hist(election2012$P)
hist(election2012$int_gpcent)

# scatterplot & correlation matrix
cor(election2012[c("VP", "g_cent", "p_cent", "int_gpcent")])
pairs(election2012[c("VP", "g_cent", "p_cent", "int_gpcent")])

# let's just check out a coplot for interactions
# coplot(VP ~ G | P, data = election2012)
coplot(VP ~ P | G, data = election2012)
# example of coplot - http://stackoverflow.com/questions/5857726/how-to-delete-the-given-in-a-coplot-using-r

# on to models
model1 <- lm(VP ~ g_cent + p_cent, data = election2012)
summary(model1)
election2012$resid_m1 <- residuals(model1)

# highlight 1932 (row 14) in red
election2012$color <- "black"
election2012$color[14] <- "red"

attach(election2012)
par(mfrow = c(2, 2))
plot(x = g_cent, y = VP, col = color, pch = 16)
plot(x = p_cent, y = VP, col = color, pch = 16)
plot(x = int_gpcent, y = VP, col = color, pch = 16)
plot(x = int_gpcent, y = resid_m1, col = color, pch = 16)

# what does the same model look like with 1932 removed?
model1_int <- lm(VP ~ g_cent + p_cent + int_gpcent, data = election2012)
summary(model1_int)
model2_int <- lm(VP ~ g_cent + p_cent + int_gpcent, data = election2012[-14, ])
summary(model2_int)
```
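For the temporal correlation mentioned above, a first look could be the lag-1 autocorrelation of the residuals and a Durbin-Watson statistic computed by hand. This is only a sketch on simulated data: the `toy` data frame, its time index `t`, and its coefficients are assumptions, and the rows are assumed to be in chronological order.

```r
# Simulated stand-in series, rows ordered in time (NOT the real data).
set.seed(3)
n <- 24
toy <- data.frame(t = seq_len(n),
                  G = rnorm(n, mean = 0, sd = 3),
                  P = runif(n, min = 0, max = 6))
toy$VP <- 50 + 0.7 * toy$G - 0.6 * toy$P + rnorm(n, sd = 3)

fit <- lm(VP ~ G * P, data = toy)
res <- residuals(fit)   # in time order because the rows are

# Sample autocorrelation of the residuals; element 1 is lag 0.
res_acf <- acf(res, plot = FALSE)
lag1 <- res_acf$acf[2]

# Durbin-Watson statistic: near 2 suggests little lag-1 autocorrelation,
# values toward 0 positive and toward 4 negative autocorrelation.
dw <- sum(diff(res)^2) / sum(res^2)

print(c(lag1_autocorrelation = lag1, durbin_watson = dw))
```

With a series this short, these numbers are only a rough screen; they will not detect more complicated dependence over time.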
