Solved – Predicted values from gbm.fit and gbm differ

My intuition is that the fitted values and predicted values of a gbm object should be identical. But in this example with just one tree, the values are different:

    library(MASS)  # mvrnorm
    library(gbm)

    b <- c(0,0,.8,0,0)
    x <- mvrnorm(100,mu=rep(0,5),diag(5))
    colnames(x) <- paste0("x",1:5)
    y <- x %*% b + rnorm(100)

    # Fit through the matrix interface
    gbm.fit.out <- gbm.fit(y=y,x=x,shrinkage=.1,
        n.trees=1,distribution="gaussian",verbose=F)

    # Fit through the formula interface
    d <- data.frame(y=y,x=x)
    gbm.out <- gbm(y~.,data=d,shrinkage=.1,n.trees=1,distribution="gaussian",train.fraction=1)

    # Compare the predictions of the two fits
    p1 <- predict(gbm.fit.out,n.trees=1)
    p2 <- predict(gbm.out,n.trees=1)
    p1-p2

Why are they different? Does it even matter?

This seems to be peculiar to gbm.fit. Using gbm, and being sure to turn off both bagging (bag.fraction=1) and the splitting of the sample into training and test sets (train.fraction=1), produces fitted and predicted values that agree.

    require(MASS); require(gbm)

    b <- c(0,0,.8,0,0)
    x <- mvrnorm(100,mu=rep(0,5),diag(5))
    colnames(x) <- paste0("x",1:5)
    y <- x %*% b + rnorm(100)

    # bag.fraction=1 turns off bagging; train.fraction=1 uses every row for training
    out <- gbm(y~x1+x2+x3+x4+x5,data=data.frame(y,x),
      shrinkage=1,n.trees=1,
      distribution="gaussian",
      verbose=F,bag.fraction=1,train.fraction=1)

    f <- out$fit                  # fitted values stored on the model object
    p <- predict(out,n.trees=1)   # predictions for the training data
    all(f-p == 0)
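The check above tests for exact equality of doubles, which can be brittle. Here is a minimal sketch of a tolerance-based comparison instead, assuming the same simulated data and settings. The seed, the tolerance of 1e-8, and the names `gap` and `agree` are illustrative choices of mine, not part of the original answer.

```r
library(MASS)  # mvrnorm
library(gbm)

set.seed(1)  # assumption: arbitrary seed, only to make the run repeatable
b <- c(0, 0, .8, 0, 0)
x <- mvrnorm(100, mu = rep(0, 5), Sigma = diag(5))
colnames(x) <- paste0("x", 1:5)
d <- data.frame(y = as.numeric(x %*% b + rnorm(100)), x)

# Same settings as the answer: one tree, no bagging, no train/test split
out <- gbm(y ~ ., data = d, distribution = "gaussian",
           n.trees = 1, shrinkage = 1, verbose = FALSE,
           bag.fraction = 1, train.fraction = 1)

f <- out$fit                                  # fitted values stored on the model
p <- predict(out, newdata = d, n.trees = 1)   # predictions for the same rows

gap <- max(abs(f - p))                              # largest absolute difference
agree <- isTRUE(all.equal(f, p, tolerance = 1e-8))  # TRUE if equal within tolerance
print(gap)
print(agree)
```

Printing `gap` alongside `agree` shows how large any discrepancy is, which helps separate floating-point noise from a real disagreement between the two sets of values.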
