I am familiar with the power transform family and I know how to estimate the MLE for $lambda$ for given samples of a random variable. I have been using the 'boxcox' function in R for a sample of a random variable that depends linearly on a latent variable (i.e. a linear model) and it works very well. The problem is that I don't quite know how it works on such linear models.
The official reference won't give clear clues on how to answer my question. Can you provide some information on the answer to my question?
Best Answer
This is not a complete answer, but the real official reference is arguably "the code". In this case it's a little harder to find (it's a method, and it's hidden behind a namespace), but this command shows the code for the default method.
library('MASS') getAnywhere('boxcox.default')
The result is the following (see also getAnywhere('boxcox.lm')
):
A single object matching ‘boxcox.default’ was found It was found in the following places registered S3 method for boxcox from namespace MASS namespace:MASS with value function (object, lambda = seq(-2, 2, 1/10), plotit = TRUE, interp = (plotit && (m < 100)), eps = 1/50, xlab = expression(lambda), ylab = "log-Likelihood", ...) { if (is.null(y <- object$y) || is.null(xqr <- object$qr)) stop(gettextf("%s does not have both 'qr' and 'y' components", sQuote(deparse(substitute(object)))), domain = NA) if (any(y <= 0)) stop("response variable must be positive") n <- length(y) y <- y/exp(mean(log(y))) logy <- log(y) xl <- loglik <- as.vector(lambda) m <- length(xl) for (i in 1L:m) { if (abs(la <- xl[i]) > eps) yt <- (y^la - 1)/la else yt <- logy * (1 + (la * logy)/2 * (1 + (la * logy)/3 * (1 + (la * logy)/4))) loglik[i] <- -n/2 * log(sum(qr.resid(xqr, yt)^2)) } if (interp) { sp <- spline(xl, loglik, n = 100) xl <- sp$x loglik <- sp$y m <- length(xl) } if (plotit) { mx <- (1L:m)[loglik == max(loglik)][1L] Lmax <- loglik[mx] lim <- Lmax - qchisq(19/20, 1)/2 dev.hold() on.exit(dev.flush()) plot(xl, loglik, xlab = xlab, ylab = ylab, type = "l", ylim = range(loglik, lim)) plims <- par("usr") abline(h = lim, lty = 3) y0 <- plims[3L] scal <- (1/10 * (plims[4L] - y0))/par("pin")[2L] scx <- (1/10 * (plims[2L] - plims[1L]))/par("pin")[1L] text(xl[1L] + scx, lim + scal, " 95%", xpd = TRUE) la <- xl[mx] if (mx > 1 && mx < m) segments(la, y0, la, Lmax, lty = 3) ind <- range((1L:m)[loglik > lim]) if (loglik[1L] < lim) { i <- ind[1L] x <- xl[i - 1] + ((lim - loglik[i - 1]) * (xl[i] - xl[i - 1]))/(loglik[i] - loglik[i - 1]) segments(x, y0, x, lim, lty = 3) } if (loglik[m] < lim) { i <- ind[2L] + 1 x <- xl[i - 1] + ((lim - loglik[i - 1]) * (xl[i] - xl[i - 1]))/(loglik[i] - loglik[i - 1]) segments(x, y0, x, lim, lty = 3) } } list(x = xl, y = loglik) } <bytecode: 0x7ff6e0973120> <environment: namespace:MASS>
Similar Posts:
- Solved – Error in boxcox.default(y ~ x) : response variable must be positive
- Solved – Power transformation using Box-Cox transformation
- Solved – How to calculate normal score in R?
- Solved – Optimize likelihood function to get lambda for Box-Cox transform of two variables
- Solved – Time series: ets() Box Cox transformation and AICc comparation