Solved – R powerTransform fails on seemingly okay data

I'm having problems with powerTransform. For example, it has just failed to transform a variable that looks perfectly ordinary (to me), with the following error:

Error in optim(start, llik, hessian = TRUE, method = method, ...) :
  L-BFGS-B needs finite values of 'fn'

Here is the data: http://www.tropic.org.uk/~crispin/boxcoxerror

> require(car)
> data = read.table("boxcoxerror")
> mean(data)
      V1
39401.55
> sd(data)
     V1
5381.04
> powerTransform(data$V1)
Error in optim(start, llik, hessian = TRUE, method = method, ...) :
  L-BFGS-B needs finite values of 'fn'

Any hints?

I'm not really sure what's odd about these data: they don't look unusual in any particular way (the coefficient of variation is reasonable, and there's nothing glaringly weird in the histogram). However, here are some possible routes forward.

require(car)
bcdata = unlist(read.table(url("https://www.tropic.org.uk/~crispin/boxcoxerror")))
mean(bcdata)
sd(bcdata)
hist(bcdata, freq=FALSE)
lines(density(bcdata))

Reproduce the error (R 2.14.1, 32-bit Linux):

powerTransform(bcdata) 

L-BFGS-B, the optimizer used internally, is notoriously sensitive to scaling issues. This appears to work:

powerTransform(bcdata/1000) 
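One plausible reason rescaling helps is floating-point cancellation rather than anything about the data. This is an assumption on my part: I have not checked how powerTransform computes the transform internally. If it uses the textbook form (y^lambda - 1)/lambda in double precision, then for y around 40000 and lambda near -5 the term y^lambda is about 1e-23, which disappears when 1 is subtracted from it. Every observation then maps to the same value, the residual variance is zero, and its log is not finite. The sketch below uses three made-up values near the reported mean, not the poster's data, and a hypothetical helper `bc`.

```r
# Illustrative values near the reported mean; these are NOT the poster's data.
y <- c(35000, 40000, 45000)
lambda <- -5

# Textbook Box-Cox form (hypothetical helper; assumes lambda != 0).
bc <- function(y, lambda) (y^lambda - 1) / lambda

z_raw    <- bc(y, lambda)         # unscaled data
z_scaled <- bc(y / 1000, lambda)  # same data divided by 1000

# y^lambda is far below machine epsilon, so y^lambda - 1 rounds to exactly -1
y^lambda < .Machine$double.eps

# All transformed values collapse to one number: zero spread, zero variance
all(z_raw == z_raw[1])

# After rescaling, the transformed values are still distinguishable
diff(range(z_scaled)) > 0
```

With zero spread in `z_raw`, any objective that takes the log of the residual variance returns -Inf, which would be consistent with the "needs finite values of 'fn'" message.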

Oddly enough, boxcox gives quite different answers based on scaling too:

m <- MASS::boxcox(lm(bcdata~1),lambda=seq(-4,2,by=0.05))
m2 <- MASS::boxcox(lm(bcdata/1000~1),lambda=seq(-8,2,by=0.05))
m2$x[which.max(m2$y)]  ## agrees pretty well with powerTransform()
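For what it's worth, in exact arithmetic the scale of the data should not matter for the estimate of lambda. The short derivation below is for the intercept-only model used here, with the standard Box-Cox profile log-likelihood. The scale factor c > 0 and the primed symbols are my own notation. If the algebra is right, any dependence on scaling seen in practice has to be numerical.

```latex
% Box-Cox transform and profile log-likelihood (intercept-only model)
z_i(\lambda) = \frac{y_i^{\lambda} - 1}{\lambda}, \qquad
\hat\sigma^2(\lambda) = \frac{1}{n}\sum_{i=1}^{n}\bigl(z_i(\lambda) - \bar z(\lambda)\bigr)^2, \qquad
\ell(\lambda) = -\frac{n}{2}\log\hat\sigma^2(\lambda) + (\lambda - 1)\sum_{i=1}^{n}\log y_i .

% Rescale the data: y_i' = c\,y_i with c > 0
z_i'(\lambda) = c^{\lambda} z_i(\lambda) + \frac{c^{\lambda} - 1}{\lambda}
\quad\Longrightarrow\quad
\hat\sigma'^2(\lambda) = c^{2\lambda}\,\hat\sigma^2(\lambda).

% Substitute into the profile log-likelihood
\ell'(\lambda)
  = -\frac{n}{2}\bigl(2\lambda\log c + \log\hat\sigma^2(\lambda)\bigr)
    + (\lambda - 1)\Bigl(\sum_{i=1}^{n}\log y_i + n\log c\Bigr)
  = \ell(\lambda) - n\log c .
```

The two curves differ only by the constant n log c, so they have the same maximizer.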

The other thing that worries me is that the suggested power transformation is so extreme: do we really need a power of nearly -5 to normalize these data?

Perhaps someone who's thinking more carefully about the actual analytical details of the Box-Cox/power transformations can explain what's happening here.
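One way to probe whether an extreme power is really supported is to compute the profile log-likelihood by hand on a grid and see how wide the range of plausible lambda values is. The sketch below is base R only and makes several assumptions: the data are synthetic lognormal draws (not the poster's data), the model is intercept-only, `boxcox_profile` is a hypothetical helper, and the interval uses the usual chi-square cutoff on the profile likelihood. Dividing by the geometric mean first keeps y^lambda near 1 and, in exact arithmetic, does not change the maximizer.

```r
# Profile log-likelihood of the Box-Cox parameter for an intercept-only model.
# y is divided by its geometric mean so that y^lambda stays close to 1.
boxcox_profile <- function(y, lambdas) {
  stopifnot(all(y > 0))
  n <- length(y)
  g <- y / exp(mean(log(y)))   # geometric mean of g is 1
  sapply(lambdas, function(la) {
    z <- if (abs(la) < 1e-8) log(g) else (g^la - 1) / la
    # The Jacobian term (la - 1) * sum(log(g)) is zero up to rounding, so it is omitted
    -n / 2 * log(mean((z - mean(z))^2))
  })
}

set.seed(1)
y <- exp(rnorm(200, mean = 10.5, sd = 0.14))  # synthetic stand-in data

lambdas <- seq(-8, 4, by = 0.05)
ll <- boxcox_profile(y, lambdas)

lambda_hat <- lambdas[which.max(ll)]
# Approximate 95% interval: grid values within qchisq(0.95, 1) / 2 of the maximum
ci <- range(lambdas[ll > max(ll) - qchisq(0.95, 1) / 2])

lambda_hat
ci
```

If the interval is wide enough to include values like 1 (no transformation) or 0 (log), an estimate near -5 may say more about a flat likelihood than about the data needing such a transformation.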
