I would like to construct a confidence interval around a prediction from a neural network, without resorting to bootstrapping given the computational cost. Can I use the Hessian returned in this way to produce a 95% CI?
1) Can you take the negative inverse of the Hessian as the variance/covariance matrix? I have read that this depends on what is being maximized or minimized. Is this true, and how can you know for certain?
2) Is this an accepted routine for producing a confidence interval around a prediction, and if so, how would you do it?
Best Answer
Try the nnetpredint package. https://cran.r-project.org/web/packages/nnetpredint/
I ran into the same problem and also wanted to construct prediction intervals for neural networks, so I developed the nnetpredint R package. It uses the method from the papers referenced below, which estimate the model error from the Jacobian matrix (the first-order derivatives of the network output with respect to the weights, evaluated over the training data) rather than from the Hessian matrix. The manual is available on the CRAN page above, and the method provides function interfaces to models trained by the nnet, neuralnet and RSNNS packages.
An example using the nnet package is shown below. The nnetPredInt function takes the model weights, the number of nodes in each layer, the training data, and so on as input, and computes prediction intervals for the new data.
install.packages("nnetpredint") # Example: Using the nnet object trained by nnet package library(nnet) xTrain <- rbind(cbind(runif(150,min = 0, max = 0.5),runif(150,min = 0, max = 0.5)) , cbind(runif(150,min = 0.5, max = 1),runif(150,min = 0.5, max = 1)) ) nObs <- dim(xTrain)[1] yTrain <- 0.5 + 0.4 * sin(2* pi * xTrain %*% c(0.4,0.6)) +rnorm(nObs,mean = 0, sd = 0.05) plot(xTrain %*% c(0.4,0.6),yTrain) # Training nnet models net <- nnet(yTrain ~ xTrain,size = 3, rang = 0.1,decay = 5e-4, maxit = 500) yFit <- c(net$fitted.values) nodeNum <- c(2,3,1) wts <- net$wts # New data for prediction intervals library(nnetpredint) newData <- cbind(seq(0,1,0.05),seq(0,1,0.05)) yTest <- 0.5 + 0.4 * sin(2* pi * newData %*% c(0.4,0.6))+rnorm(dim(newData)[1],mean = 0, sd = 0.05) # S3 generic method: Object of nnet yPredInt <- nnetPredInt(net, xTrain, yTrain, newData, alpha = 0.05) # 95% confidence interval print(yPredInt[1:20,]) # S3 default method for user defined input yPredInt2 <- nnetPredInt(object = NULL, xTrain, yTrain, yFit, node = nodeNum, wts = wts, newData, alpha = 0.05, funName = 'sigmoid') plot(newData %*% c(0.4,0.6),yTest,type = 'b') lines(newData %*% c(0.4,0.6),yPredInt$yPredValue,type = 'b',col='blue') lines(newData %*% c(0.4,0.6),yPredInt$lowerBound,type = 'b',col='red') # lower bound lines(newData %*% c(0.4,0.6),yPredInt$upperBound,type = 'b',col='red') # upper bound
The key to the estimation method:
Use a first-order Taylor expansion of the network output f(x, w) around the fitted weight parameters, and compute the gradient vector / Jacobian matrix over the training data (a small worked sketch of this idea follows below).
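For reference, the interval behind this approach is essentially the standard nonlinear-regression (delta-method) prediction interval. With $s^2$ the residual variance, $J$ the $n \times p$ Jacobian of the fitted values with respect to the $p$ weights over the training set, and $g_0$ the gradient of the output at a new input $x_0$, a 95% interval is roughly
$$\hat{y}_0 \;\pm\; t_{n-p,\,0.975}\; s \sqrt{1 + g_0^\top (J^\top J)^{-1} g_0}$$
(the papers below refine this, e.g. for training with weight decay). Here is a minimal sketch of that calculation in plain R, reusing the net, xTrain, yTrain, yFit and newData objects from the example above. This is not the nnetpredint implementation: it assumes the 2-3-1 sigmoid architecture from the example and nnet's usual weight ordering (bias first, then the incoming weights, unit by unit), and it uses finite differences for the gradients.

sigmoid <- function(z) 1 / (1 + exp(-z))

# Forward pass for the 2-3-1 nnet above (logistic output, no skip-layer links).
# Assumed weight layout: c(b->h1, i1->h1, i2->h1, b->h2, ..., b->o, h1->o, h2->o, h3->o)
nnForward <- function(w, x) {
  h <- sapply(1:3, function(j) {
    idx <- (j - 1) * 3 + 1
    sigmoid(w[idx] + sum(w[idx + 1:2] * x))
  })
  sigmoid(w[10] + sum(w[11:13] * h))
}

# Numerical gradient of the network output with respect to the weights
numGrad <- function(w, x, eps = 1e-6) {
  sapply(seq_along(w), function(k) {
    wp <- w; wm <- w
    wp[k] <- wp[k] + eps
    wm[k] <- wm[k] - eps
    (nnForward(wp, x) - nnForward(wm, x)) / (2 * eps)
  })
}

w  <- net$wts
p  <- length(w)                                        # 13 weights for the 2-3-1 net
n  <- nrow(xTrain)
J  <- t(apply(xTrain, 1, function(x) numGrad(w, x)))   # n x p Jacobian over the training set
s2 <- sum((yTrain - yFit)^2) / (n - p)                 # residual variance estimate
A  <- solve(t(J) %*% J + diag(1e-8, p))                # tiny ridge for numerical stability

predInt <- t(apply(newData, 1, function(x0) {
  g0   <- numGrad(w, x0)
  yHat <- nnForward(w, x0)
  se   <- sqrt(s2 * (1 + drop(t(g0) %*% A %*% g0)))
  c(fit = yHat,
    lwr = yHat - qt(0.975, n - p) * se,
    upr = yHat + qt(0.975, n - p) * se)
}))
head(predInt)   # compare with the nnetPredInt output above

The small ridge added to J'J is only there to keep the inversion numerically stable, and the t quantile uses n - p residual degrees of freedom, as in ordinary nonlinear least squares.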
References:
De Veaux R. D., Schumi J., Schweinsberg J., Ungar L. H., 1998, "Prediction intervals for neural networks via nonlinear regression", Technometrics 40(4): 273-282.
Chryssolouris G., Lee M., Ramsey A., 1996, "Confidence interval prediction for neural network models", IEEE Transactions on Neural Networks 7(1): 229-232.
Also see this paper for the detailed maths: "Confidence Intervals for Neural Networks and Applications to Modeling Engineering Materials", http://cdn.intechopen.com/pdfs-wm/14915.pdf