# Solved – What exactly does the ‘boxcox’ function in R do

I am familiar with the power transform family and I know how to estimate the MLE for \$lambda\$ for given samples of a random variable. I have been using the 'boxcox' function in R for a sample of a random variable that depends linearly on a latent variable (i.e. a linear model) and it works very well. The problem is that I don't quite know how it works on such linear models.

The official reference won't give clear clues on how to answer my question. Can you provide some information on the answer to my question?

Contents

``library('MASS') getAnywhere('boxcox.default') ``
The result is the following (see also `getAnywhere('boxcox.lm')`):
``A single object matching ‘boxcox.default’ was found It was found in the following places   registered S3 method for boxcox from namespace MASS   namespace:MASS with value  function (object, lambda = seq(-2, 2, 1/10), plotit = TRUE, interp = (plotit &&      (m < 100)), eps = 1/50, xlab = expression(lambda), ylab = "log-Likelihood",      ...)  {     if (is.null(y <- object\$y) || is.null(xqr <- object\$qr))          stop(gettextf("%s does not have both 'qr' and 'y' components",              sQuote(deparse(substitute(object)))), domain = NA)     if (any(y <= 0))          stop("response variable must be positive")     n <- length(y)     y <- y/exp(mean(log(y)))     logy <- log(y)     xl <- loglik <- as.vector(lambda)     m <- length(xl)     for (i in 1L:m) {         if (abs(la <- xl[i]) > eps)              yt <- (y^la - 1)/la         else yt <- logy * (1 + (la * logy)/2 * (1 + (la * logy)/3 *              (1 + (la * logy)/4)))         loglik[i] <- -n/2 * log(sum(qr.resid(xqr, yt)^2))     }     if (interp) {         sp <- spline(xl, loglik, n = 100)         xl <- sp\$x             loglik <- sp\$y         m <- length(xl)     }     if (plotit) {         mx <- (1L:m)[loglik == max(loglik)][1L]         Lmax <- loglik[mx]         lim <- Lmax - qchisq(19/20, 1)/2         dev.hold()         on.exit(dev.flush())         plot(xl, loglik, xlab = xlab, ylab = ylab, type = "l",              ylim = range(loglik, lim))         plims <- par("usr")         abline(h = lim, lty = 3)         y0 <- plims[3L]         scal <- (1/10 * (plims[4L] - y0))/par("pin")[2L]         scx <- (1/10 * (plims[2L] - plims[1L]))/par("pin")[1L]         text(xl[1L] + scx, lim + scal, " 95%", xpd = TRUE)         la <- xl[mx]         if (mx > 1 && mx < m)              segments(la, y0, la, Lmax, lty = 3)         ind <- range((1L:m)[loglik > lim])         if (loglik[1L] < lim) {             i <- ind[1L]             x <- xl[i - 1] + ((lim - loglik[i - 1]) * (xl[i] -                  xl[i - 1]))/(loglik[i] - loglik[i - 1])             segments(x, y0, x, lim, lty = 3)         }         if (loglik[m] < lim) {             i <- ind[2L] + 1             x <- xl[i - 1] + ((lim - loglik[i - 1]) * (xl[i] -                  xl[i - 1]))/(loglik[i] - loglik[i - 1])             segments(x, y0, x, lim, lty = 3)         }     }     list(x = xl, y = loglik) } <bytecode: 0x7ff6e0973120> <environment: namespace:MASS> ``