## Solved – Student t-distribution parameters and MLE

So I always thought of the Student t-distribution as having only one parameter, $\nu$, the degrees of freedom (as described by Wikipedia). When I searched for how to find the MLE of $\nu$, however, I kept coming across questions mentioning $\mu$ and $\sigma$ as parameters as well. So 1) Is the Student t-distribution a special … Read more
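As a quick illustration of the question's premise (a sketch, not from the original thread): the three-parameter, location-scale form of the Student t adds $\mu$ and $\sigma$ to the one-parameter $t_\nu$, and SciPy's `t.fit` estimates all three jointly by maximum likelihood. The parameter values below are made up for the demonstration.

```python
import numpy as np
from scipy import stats

# The "standard" Student t has a single parameter, the degrees of
# freedom nu.  The location-scale family adds mu and sigma: if
# Z ~ t(nu), then X = mu + sigma * Z.  scipy.stats.t exposes exactly
# these three as (df, loc, scale).
rng = np.random.default_rng(0)
sample = stats.t.rvs(df=5, loc=2.0, scale=1.5, size=5000, random_state=rng)

# Joint MLE of (nu, mu, sigma) by numerical maximization of the
# log-likelihood (this is what .fit does internally).
df_hat, loc_hat, scale_hat = stats.t.fit(sample)
print(df_hat, loc_hat, scale_hat)
```

With only $\nu$ to estimate (i.e. $\mu$ and $\sigma$ known), `t.fit` can be told to hold them fixed via `floc` and `fscale`.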

## Solved – Maximum likelihood estimation under heteroskedasticity (and relation to OLS)

I have a question about MLE and how it relates to OLS. I know how to relate OLS and MLE when the noise is normal and homoskedastic, and I can apply the same reasoning to heteroskedastic noise. My question is that, clearly, the noise terms are no longer identically distributed (though still independent). So, we can apply … Read more
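To make the OLS/MLE connection under heteroskedasticity concrete, here is a hedged sketch (synthetic data, not from the thread): with independent Gaussian noise of known, unequal variances, maximizing the log-likelihood over $\beta$ is exactly weighted least squares with weights $1/\sigma_i^2$.

```python
import numpy as np

# With e_i ~ N(0, sigma_i^2) independent and the sigma_i known, the
# log-likelihood in beta is, up to constants,
#     -0.5 * sum_i (y_i - x_i' beta)^2 / sigma_i^2,
# so the MLE of beta is the weighted least squares estimator with
# weights w_i = 1 / sigma_i^2.
rng = np.random.default_rng(1)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
sigma = 0.5 + rng.uniform(size=n)          # heteroskedastic std devs
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(size=n) * sigma

# WLS closed form: beta = (X' W X)^{-1} X' W y with W = diag(1/sigma^2)
w = 1.0 / sigma**2
XtWX = X.T @ (w[:, None] * X)
XtWy = X.T @ (w * y)
beta_wls = np.linalg.solve(XtWX, XtWy)

# The score of the Gaussian log-likelihood vanishes at beta_wls,
# confirming it is the MLE.
score = X.T @ (w * (y - X @ beta_wls))
print(beta_wls, score)
```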

## Solved – Taylor series expansion of maximum likelihood estimator, Newton-Raphson, Fisher scoring and distribution of MLE by Delta method

Assume $\ell\left(\theta\right)$ is the log-likelihood of parameter vector $\theta$ and $\widehat{\theta}$ is the maximum likelihood estimator of $\theta$; then the Taylor series of $\ell\left(\theta\right)$ about $\widehat{\theta}$ is
\begin{align*}
\ell\left(\theta\right) & \approx \ell\left(\widehat{\theta}\right)+\frac{\partial\ell\left(\theta\right)}{\partial\theta}\Bigr|_{\theta=\widehat{\theta}}\left(\theta-\widehat{\theta}\right)+\frac{1}{2}\left(\theta-\widehat{\theta}\right)^{\prime}\frac{\partial^{2}\ell\left(\theta\right)}{\partial\theta\,\partial\theta^{\prime}}\Bigr|_{\theta=\widehat{\theta}}\left(\theta-\widehat{\theta}\right)\\
\frac{\partial\ell\left(\theta\right)}{\partial\theta} & \approx \mathbf{0}+\frac{\partial\ell\left(\theta\right)}{\partial\theta}\Bigr|_{\theta=\widehat{\theta}}+\frac{\partial^{2}\ell\left(\theta\right)}{\partial\theta\,\partial\theta^{\prime}}\Bigr|_{\theta=\widehat{\theta}}\left(\theta-\widehat{\theta}\right)\quad\overset{\textrm{set}}{=}\quad\mathbf{0}\\
\theta-\widehat{\theta} & = -\left[\frac{\partial^{2}\ell\left(\theta\right)}{\partial\theta\,\partial\theta^{\prime}}\Bigr|_{\theta=\widehat{\theta}}\right]^{-}\left[\frac{\partial\ell\left(\theta\right)}{\partial\theta}\Bigr|_{\theta=\widehat{\theta}}\right]\\
\widehat{\theta}-\theta & = \left[\frac{\partial^{2}\ell\left(\theta\right)}{\partial\theta\,\partial\theta^{\prime}}\Bigr|_{\theta=\widehat{\theta}}\right]^{-}\left[\frac{\partial\ell\left(\theta\right)}{\partial\theta}\Bigr|_{\theta=\widehat{\theta}}\right]\\
\widehat{\theta}-\theta & = \left[\mathbb{H}\left(\theta\right)\Bigr|_{\theta=\widehat{\theta}}\right]^{-}\left[\mathbb{S}\left(\theta\right)\Bigr|_{\theta=\widehat{\theta}}\right]
\end{align*}
As
$$\theta=\widehat{\theta}-\left[\mathbb{H}\left(\theta\right)\Bigr|_{\theta=\widehat{\theta}}\right]^{-}\left[\mathbb{S}\left(\theta\right)\Bigr|_{\theta=\widehat{\theta}}\right],$$
so
\begin{align*}
\theta^{\left(m+1\right)} & = \theta^{\left(m\right)}-\left[\mathbb{H}\left(\theta^{\left(m\right)}\right)\right]^{-}\mathbb{S}\left(\theta^{\left(m\right)}\right)\quad\quad\left(\textrm{Newton-Raphson}\right)
\end{align*}
… Read more
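The Newton-Raphson update above can be illustrated on a one-parameter example (a sketch with made-up data; the exponential model is chosen only because its MLE has a closed form to check against):

```python
import numpy as np

# Newton-Raphson for an MLE:
#     theta_{m+1} = theta_m - H(theta_m)^{-1} S(theta_m),
# with S the score and H the Hessian of the log-likelihood.
# For an exponential sample with rate theta:
#     l(theta) = n log(theta) - theta * sum(x)
#     S(theta) = n / theta - sum(x)
#     H(theta) = -n / theta^2
# The exact MLE is 1 / mean(x), so convergence is easy to verify.
rng = np.random.default_rng(2)
x = rng.exponential(scale=1 / 3.0, size=1000)   # true rate = 3
n, s = x.size, x.sum()

theta = 1.0                                      # starting value
for _ in range(25):
    score = n / theta - s
    hess = -n / theta**2
    theta = theta - score / hess                 # Newton-Raphson step

print(theta, 1 / x.mean())
```

Fisher scoring replaces $\mathbb{H}$ with its expectation $-\mathcal{I}(\theta)$; for this model $\mathbb{E}[\mathbb{H}(\theta)] = -n/\theta^2$ equals the observed Hessian, so the two algorithms coincide here.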

## Solved – Why does the glm residual deviance have a chi-squared asymptotic null distribution

For a generalized linear model, the residual deviance is often described as having an asymptotic chi-squared null distribution. I have read that this is the case, for example at http://thestatsgeek.com/2014/04/26/deviance-goodness-of-fit-test-for-poisson-regression/, but I can't figure out why. Can you help with an explanation? Best Answer: Your original question was rather cryptic, but I will assume that you are referring … Read more
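One way to see what the claim means in practice (a sketch, not the linked post's code): simulate Poisson counts, fit the intercept-only model, compute the residual deviance, and compare it to a $\chi^2_{n-p}$ reference. The chi-squared approximation for Poisson deviance requires the fitted means to be reasonably large, which the made-up rate below ensures.

```python
import numpy as np
from scipy import stats

# For a Poisson GLM the residual deviance is
#     D = 2 * sum_i [ y_i * log(y_i / mu_hat_i) - (y_i - mu_hat_i) ],
# with the convention y*log(y/mu) = 0 when y = 0.  Under a correct
# model D is approximately chi-squared with n - p degrees of freedom.
# Intercept-only model, so mu_hat_i = mean(y) and p = 1.
rng = np.random.default_rng(3)
y = rng.poisson(lam=20.0, size=100)
mu_hat = np.full_like(y, y.mean(), dtype=float)

with np.errstate(divide="ignore", invalid="ignore"):
    term = np.where(y > 0, y * np.log(y / mu_hat), 0.0)
deviance = 2.0 * np.sum(term - (y - mu_hat))

df = y.size - 1
p_value = stats.chi2.sf(deviance, df)   # goodness-of-fit p-value
print(deviance, df, p_value)
```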

## Solved – How to compute (or numerically estimate) the standard error of the MLE

I have a model for which I know the log-likelihood function, the gradient of the log-likelihood, and the Hessian of the log-likelihood. For given data I can compute the MLE using a generic optimizer (Nelder-Mead). How do I compute (or estimate) the standard error of the MLE? If there is existing software … Read more
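A common numerical recipe (sketched here on a made-up Gaussian model; the helper `numerical_hessian` is ad hoc, not library code): invert the observed information, i.e. the Hessian of the negative log-likelihood at the optimum, and take square roots of its diagonal. For $N(\mu,\sigma^2)$ data the known answer for $\mu$ is $\hat\sigma/\sqrt{n}$, which gives a check.

```python
import numpy as np
from scipy import optimize

rng = np.random.default_rng(4)
x = rng.normal(loc=5.0, scale=2.0, size=400)

def negloglik(params):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)          # unconstrained parameterization
    return 0.5 * np.sum(((x - mu) / sigma) ** 2) + x.size * log_sigma

# MLE via a generic derivative-free optimizer, as in the question.
res = optimize.minimize(negloglik, x0=[0.0, 0.0], method="Nelder-Mead")
mu_hat, log_sigma_hat = res.x

def numerical_hessian(f, p, eps=1e-4):
    """Central-difference Hessian of f at p (ad hoc helper)."""
    p = np.asarray(p, dtype=float)
    k = p.size
    H = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            pp = p.copy(); pp[i] += eps; pp[j] += eps
            pm = p.copy(); pm[i] += eps; pm[j] -= eps
            mp = p.copy(); mp[i] -= eps; mp[j] += eps
            mm = p.copy(); mm[i] -= eps; mm[j] -= eps
            H[i, j] = (f(pp) - f(pm) - f(mp) + f(mm)) / (4 * eps**2)
    return H

# Standard errors: sqrt of the diagonal of the inverse observed information.
H = numerical_hessian(negloglik, res.x)
se = np.sqrt(np.diag(np.linalg.inv(H)))
print(mu_hat, se[0], np.exp(log_sigma_hat) / np.sqrt(x.size))
```

When the analytic Hessian is available, as the questioner says it is, it should be used in place of the finite-difference helper.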

## Solved – Proof of invariance property of MLE

I am reading the proof of the invariance property of the MLE from Casella and Berger. In this proof we parametrize $\eta = \tau(\theta)$ and define the induced likelihood function: $L_{1}^{*}(\eta|x) = \sup_{\{\theta\,:\,\tau(\theta) = \eta\}} L(\theta|x) \tag{1}$ I have subscripted $L^{*}(\eta|x)$ with 1 to differentiate between the induced likelihood of $\eta$ and … Read more
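A numerical sanity check of the invariance property (illustrative only; $\tau$ here is the log-odds transform, which is one-to-one, so the supremum in the induced likelihood is attained at a single $\theta$):

```python
import numpy as np
from scipy import optimize

# Invariance of the MLE: if p_hat maximizes L(p|x), then tau(p_hat)
# maximizes the induced likelihood L*(eta|x) = sup_{p: tau(p)=eta} L(p|x).
# Check for Bernoulli data with tau(p) = log(p / (1 - p)): maximizing
# the likelihood directly over eta gives the same answer as
# transforming the closed-form MLE of p.
rng = np.random.default_rng(5)
x = rng.binomial(1, 0.3, size=500)
p_hat = x.mean()                        # closed-form MLE of p

def negloglik_eta(eta):
    p = 1.0 / (1.0 + np.exp(-eta))      # invert tau
    return -np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

eta_hat = optimize.minimize_scalar(negloglik_eta).x
print(eta_hat, np.log(p_hat / (1 - p_hat)))
```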

## Solved – Estimation with MLE and returning the score/gradient (QMLE)

I am estimating a simple AR(1) process by the ML approach. I also wish to compute the quasi-MLE standard errors, which are given by the sandwich form combining the Hessian and the score (see, for example, the last slide here). So, I start by just specifying the (conditional) log-likelihood for the (Gaussian) AR(1) … Read more
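A sketch of the sandwich computation for a conditional-Gaussian AR(1) (made-up data; the score and Hessian formulas are derived from the conditional log-likelihood the question describes, not taken from the linked slides):

```python
import numpy as np

# Sandwich (QMLE) covariance:  H^{-1} (sum_t s_t s_t') H^{-1},
# with s_t the per-observation scores and H the Hessian of the summed
# conditional log-likelihood, both at the estimate theta = (phi, sigma^2).
rng = np.random.default_rng(6)
T, phi_true = 2000, 0.6
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi_true * y[t - 1] + rng.normal()

y_lag, y_cur = y[:-1], y[1:]
phi_hat = (y_lag @ y_cur) / (y_lag @ y_lag)   # conditional MLE of phi
e = y_cur - phi_hat * y_lag
s2_hat = np.mean(e**2)                         # conditional MLE of sigma^2

# Per-observation scores at (phi_hat, s2_hat):
#   d l_t / d phi     =  e_t * y_{t-1} / sigma^2
#   d l_t / d sigma^2 = -1/(2 sigma^2) + e_t^2 / (2 sigma^4)
S = np.column_stack([e * y_lag / s2_hat,
                     -0.5 / s2_hat + e**2 / (2 * s2_hat**2)])
B = S.T @ S                                    # "meat"

# Hessian at the estimates ("bread"); the cross term is exactly zero
# here because sum(e * y_lag) = 0 at phi_hat.
n = y_cur.size
H = -np.array([[(y_lag @ y_lag) / s2_hat, 0.0],
               [0.0, n / (2 * s2_hat**2)]])

Hinv = np.linalg.inv(H)
sandwich = Hinv @ B @ Hinv
se = np.sqrt(np.diag(sandwich))
print(phi_hat, se)
```

Since the simulated noise really is Gaussian, these robust standard errors should be close to the conventional $\sqrt{(1-\phi^2)/T}$ for $\hat\phi$; the sandwich form matters when the innovation distribution is misspecified.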

## Solved – On the MLE of p in Bernoulli and Binomial distributions

Suppose we have a random sample $X = [x_1, x_2, \ldots, x_m]$ distributed as $\mathrm{Binomial}(n,p)$, with known $n$ and unknown $p$. Now, assume we want to estimate $p$. Usually, textbooks and articles online give the MLE of $p$ as $\frac{\sum_{i=1}^{m}x_i}{n}$. However, isn't that correct only when $m=1$, or in other words, when we … Read more
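The point of the question can be checked numerically (an illustrative sketch with made-up values): with $m$ binomial observations the MLE is $\sum_i x_i/(mn)$, which coincides with the textbook $\sum_i x_i/n$ only when $m=1$.

```python
import numpy as np
from scipy import optimize

# With m i.i.d. draws x_1..x_m from Binomial(n, p), the log-likelihood is
#   l(p) = sum_i [ x_i log p + (n - x_i) log(1 - p) ] + const,
# and setting l'(p) = 0 gives p_hat = sum(x_i) / (m * n).
rng = np.random.default_rng(7)
n, p_true = 10, 0.4
x = rng.binomial(n, p_true, size=50)    # m = 50 observations
m = x.size

p_closed = x.sum() / (m * n)            # closed-form MLE

def negloglik(p):
    return -np.sum(x * np.log(p) + (n - x) * np.log(1 - p))

# Direct numerical maximization agrees with the closed form.
p_numeric = optimize.minimize_scalar(negloglik, bounds=(1e-6, 1 - 1e-6),
                                     method="bounded").x
print(p_closed, p_numeric)
```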