I'm having problems with powerTransform. For example, it has just failed to transform a perfectly ordinary-looking (to me) variable, with the following error:
Error in optim(start, llik, hessian = TRUE, method = method, ...) : L-BFGS-B needs finite values of 'fn'
Here is the data: http://www.tropic.org.uk/~crispin/boxcoxerror
    > require(car)
    > data = read.table("boxcoxerror")
    > mean(data)
          V1
    39401.55
    > sd(data)
         V1
    5381.04
    > powerTransform(data$V1)
    Error in optim(start, llik, hessian = TRUE, method = method, ...) :
      L-BFGS-B needs finite values of 'fn'
I'm not really sure what's so funny about this data: it doesn't look odd in any particular way (the coefficient of variation is reasonable, and there's nothing glaringly weird in the histogram). However, here are some possible routes forward.
    require(car)
    bcdata = unlist(read.table(url("https://www.tropic.org.uk/~crispin/boxcoxerror")))
    mean(bcdata)
    sd(bcdata)
    hist(bcdata, freq=FALSE)
    lines(density(bcdata))
I can reproduce the error (R 2.14.1, 32-bit Linux) by calling powerTransform(bcdata).
L-BFGS-B, the optimizer used internally, is notoriously sensitive to scaling issues. Rescaling the data before calling powerTransform (dividing by 1000, as in the boxcox comparison below) appears to work.
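A minimal sketch of the rescaling workaround, assuming the call that worked was powerTransform() on the data divided by 1000. The helper `safe_lambda` is hypothetical, and the simulated vector is only a stand-in with roughly the posted mean and standard deviation, so it will not necessarily reproduce the error; substitute `bcdata` to try it on the real data.

```r
# Simulated stand-in (assumption): NOT the posted data, just a similar mean/sd.
set.seed(1)
bcdata_sim <- rnorm(200, mean = 39400, sd = 5380)

# Hypothetical helper: return the estimated lambda from car::powerTransform(),
# or NA if car is not installed or the optimizer stops with an error.
safe_lambda <- function(y) {
  if (!requireNamespace("car", quietly = TRUE)) return(NA_real_)
  tryCatch(
    unname(coef(car::powerTransform(y))),
    error = function(e) NA_real_
  )
}

lambda_raw    <- safe_lambda(bcdata_sim)         # original units
lambda_scaled <- safe_lambda(bcdata_sim / 1000)  # rescaled to thousands

print(c(raw = lambda_raw, scaled = lambda_scaled))
```

An NA in the `raw` slot alongside a number in the `scaled` slot would match the behaviour described above.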
boxcox gives quite different answers based on scaling too:
    m <- MASS::boxcox(lm(bcdata~1), lambda=seq(-4,2,by=0.05))
    m2 <- MASS::boxcox(lm(bcdata/1000~1), lambda=seq(-8,2,by=0.05))
    m2$x[which.max(m2$y)]  ## agrees pretty well with powerTransform()
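One possible reason the answers depend on scale, sketched with a hand-written likelihood rather than the code inside car or MASS. The helpers `bc` and `bc_loglik` and the simulated data are assumptions for illustration. For an intercept-only model the Box-Cox profile log-likelihood should change only by an additive constant when the data are rescaled, so the maximizing lambda should not move. In floating point, though, y^lambda for y near 40000 and lambda near -5 is about 1e-23, and subtracting 1 erases all variation between observations.

```r
# Box-Cox transform (hypothetical helper, not the car or MASS implementation).
bc <- function(y, lambda) {
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}

# Profile log-likelihood for an intercept-only model, up to an additive constant.
bc_loglik <- function(y, lambda) {
  z <- bc(y, lambda)
  n <- length(y)
  -n / 2 * log(sum((z - mean(z))^2) / n) + (lambda - 1) * sum(log(y))
}

# Simulated stand-in (assumption): NOT the posted data, just a similar mean/sd.
set.seed(1)
y <- rnorm(200, mean = 39400, sd = 5380)

# At a moderate lambda the two scales differ only by the constant n * log(1000).
d <- bc_loglik(y / 1000, 0.5) - bc_loglik(y, 0.5)
print(c(difference = d, expected = length(y) * log(1000)))

# At lambda = -5 every raw-scale value transforms to the same double,
# while the rescaled values remain distinguishable.
n_raw    <- length(unique(bc(y, -5)))
n_scaled <- length(unique(bc(y / 1000, -5)))
print(c(distinct_raw = n_raw, distinct_scaled = n_scaled))
```

If this is what is happening, the residual sum of squares on the raw scale collapses toward zero at strongly negative lambda and its log is no longer usable, which would be consistent with the "finite values of 'fn'" message and with rescaling helping. Whether car or MASS hit exactly this path is not confirmed here.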
The other thing that worries me is that the suggested power transformation is so extreme: do we really need a power of nearly -5 to normalize these data?
Perhaps someone who's thinking more carefully about the actual analytical details of the Box-Cox/power transformations can explain what's happening here.