I have a simple data set to which I applied a simple linear regression model. Now I would like to use fixed effects to improve the model's predictions. I know that I could also create dummy variables, but my real data spans several years and has more variables, so I would like to avoid creating dummies by hand.
My data and code are similar to this:
data <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "CompanyNumber ResponseVariable Year ExplanatoryVariable1 ExplanatoryVariable2
1 2.5 2000 1 2
1 4 2001 3 1
1 3 2002 5 7
2 1 2000 3 2
2 2.4 2001 0 4
2 6 2002 2 9
3 10 2000 8 3")

library(lfe)
library(caret)  # provides postResample()

# fixed effects by Year
fe <- getfe(felm(data = data, ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2 | Year))
fe

# plain linear model without fixed effects
lm.1 <- lm(ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2, data = data)
prediction <- predict(lm.1, data)
prediction

check_model <- postResample(pred = prediction, obs = data$ResponseVariable)
check_model
For my real dataset I will make predictions on a test set, but for simplicity I just use the training set here as well.
I would like to make a prediction with the help of the fixed effects that I found. However, the following does not seem to match each fixed effect to the right observation. Does anyone know how to do this correctly?
prediction_fe<- predict(lm.1, data) + fe$effect
You have to add the estimated fixed effects to the prediction data frame.
library(lfe)

## data
d <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "CompanyNumber ResponseVariable Year ExplanatoryVariable1 ExplanatoryVariable2
1 2.5 2000 1 2
1 4 2001 3 1
1 3 2002 5 7
2 1 2000 3 2
2 2.4 2001 0 4
2 6 2002 2 9
3 10 2000 8 3")

## regression
e <- felm(data = d, ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2 | Year)

## fixed effects data
d.fe <- getfe(e)

## prediction sample
p <- d  # could be a different sample, but with the same covariates

# add columns on fixed effects
p <- merge(p, d.fe[d.fe$fe == "Year", ], by.x = "Year", by.y = "idx", all.x = TRUE)
names(p)[grep("^effect$", names(p))] <- "effect.Year"
# if you have more than one fixed effect,
# you should continue here, adapting the two lines above,
# e.g. fixed effects on CompanyNumber

# reorder (merge() sorts by Year, so restore the original row order)
p <- p[order(p$CompanyNumber, p$Year), ]

## predict
predicted.values <-
  as.matrix(p[, rownames(e$coefficients)]) %*% e$coefficients +  # covariates * coefficients
  p$effect.Year                                                  # fixed effects from years

## test
# only works if the order of all observations coincides
round(predicted.values + e$residuals - p$ResponseVariable, 6)
Note that the data object is now named d instead of data, to avoid confusion.
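As a cross-check that needs only base R, the same kind of prediction can be sketched by passing Year to lm() as a factor. This is the dummy-variable form of a year fixed-effects model, but R builds the dummies internally, so none have to be created by hand. The slope estimates should agree with those from felm(), although this is a sketch under the assumption of a single fixed effect with few levels, not a replacement for lfe on large panels. The names fit, newdata and pred are illustrative only.

```r
# Same toy data as in the answer above.
d <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "CompanyNumber ResponseVariable Year ExplanatoryVariable1 ExplanatoryVariable2
1 2.5 2000 1 2
1 4 2001 3 1
1 3 2002 5 7
2 1 2000 3 2
2 2.4 2001 0 4
2 6 2002 2 9
3 10 2000 8 3")

# Treat Year as categorical; lm() expands it into dummy columns internally.
d$Year <- factor(d$Year)
fit <- lm(ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2 + Year,
          data = d)

# newdata could be a separate test set; here the training rows are reused.
newdata <- d
pred <- predict(fit, newdata = newdata)

# In-sample check: prediction plus residual should reproduce the response,
# so this maximum absolute difference should be numerically zero.
max(abs(pred + residuals(fit) - d$ResponseVariable))
```

Because predict() matches each row to its year by factor level, the row order of newdata does not matter. Years that were not present when the model was fitted have no estimated effect, and predict() will stop with an error about new factor levels for such rows.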