**The short version:**

I can fit a model using Weighted Least Squares, given a diagonal matrix of weights $W$, by solving $(X^TWX)\hat{\beta}=X^TWy$ for $\hat{\beta}$.
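For concreteness, here is a minimal pure-Python sketch of that solve. The helper names `solve` and `wls` and the toy data are made up for illustration, not taken from any library. It also fits the same data with each row repeated $w_i$ times, which should give the same estimate when the weights are integers.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def wls(X, y, w):
    """Solve (X^T W X) beta = X^T W y, where W = diag(w)."""
    p = len(X[0])
    XtWX = [[sum(wi * xi[j] * xi[k] for xi, wi in zip(X, w))
             for k in range(p)] for j in range(p)]
    XtWy = [sum(wi * xi[j] * yi for xi, yi, wi in zip(X, y, w))
            for j in range(p)]
    return solve(XtWX, XtWy)


# Toy data (illustrative only): intercept column plus one covariate.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 2.0, 5.0]
w = [1.0, 2.0, 1.0, 3.0]

beta = wls(X, y, w)

# Same fit obtained by repeating row i exactly w_i times with unit weights.
X_rep = [xi for xi, wi in zip(X, w) for _ in range(int(wi))]
y_rep = [yi for yi, wi in zip(y, w) for _ in range(int(wi))]
beta_rep = wls(X_rep, y_rep, [1.0] * len(y_rep))
```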

~~Is there a GLM analogue? if so, what is it?~~

There seems to be a GLM analogue, e.g. with the `weights` argument in R's `glm` function. How is R using these weights?

**The long version:**

### the situation

As a follow-up to my IPTW question, I just want to double check that I understand how to fit a parametric model using inverse probability(-of-treatment) weights (IPTW). The idea with IPTW is to simulate a dataset in which the relationship between my independent variables $(a^1,a^2,a^3)$ and dependent variable $y$ is unconfounded and therefore causal. For argument's sake let's say I already estimated an IPT weight $\hat{w}_i$ for each observation. These weights are hypothetical probability weights from the simulated dataset.

### the question

I now want to fit a GLM. I'd just use WLS, but I'm working with a binary outcome and an outcome truncated at zero. So I have a linear model $\eta_i=a^T\beta$, a link $\mu_i=g(\eta_i)$, and a variance $V(y_i)$ derived from my likelihood for $y$. Then the likelihood equations are

$$
\sum_{i=1}^N \frac{y_i-\mu_i}{V(y_i)}\frac{\partial\mu_i}{\partial\beta_j}=\sum_{i=1}^N \frac{y_i-\mu_i}{V(y_i)}\left(\frac{\partial\mu_i}{\partial\eta_i}x_{ij}\right)=0,~\forall j
$$

as per *Categorical Data Analysis*, Agresti, 2013, section 4.4.5.

So all I have to do is multiply $\text{var}(\mu_i)$ by the weight $\hat{w}_i$, right? The same way I might if I wanted to incorporate an overdispersion parameter? If so, is this because the variance of, say, 5 independent observations is 5 times the variance of one independent observation?

**Follow-up idea:** since the likelihood is the product of the likelihood for each observation, is there some weighting procedure I can use to just weight the likelihoods?

#### Best Answer

Fit an MLE by maximizing $$ l(\mathbf{\theta};\mathbf{y})=\sum_{i=1}^N l\left(\theta;y_i\right) $$

where $l$ is the log-likelihood. Fitting an MLE with inverse-probability (i.e. frequency) weights entails modifying the log-likelihood to:

$$ l(\mathbf{\theta};\mathbf{y})=\sum_{i=1}^N w_i~l\left(\theta;y_i\right). $$

In the GLM case, this reduces to solving $$ \sum_{i=1}^N w_i\frac{y_i-\mu_i}{V(y_i)}\left(\frac{\partial\mu_i}{\partial\eta_i}x_{ij}\right)=0,~\forall j $$

Source: page 119 of http://www.ssicentral.com/lisrel/techdocs/sglim.pdf, linked at http://www.ssicentral.com/lisrel/resources.html#t. It's the "Generalized Linear Modeling" chapter (chapter 3) of the LISREL "technical documents."
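To make the weighted score equations concrete, here is a minimal sketch for one special case: a logistic model with an intercept and a single covariate, fitted by Newton-Raphson in pure Python. With the canonical logit link, $\partial\mu_i/\partial\eta_i = V(y_i) = \mu_i(1-\mu_i)$, so the equations above collapse to $\sum_i w_i(y_i-\mu_i)x_{ij}=0$. The function names, the data, and the weights are all made up for illustration; this is not how any particular package implements it.

```python
import math


def weighted_score(x, y, w, b0, b1):
    """Weighted score (gradient of the weighted log-likelihood) for
    logit(mu_i) = b0 + b1 * x_i."""
    s0 = s1 = 0.0
    for xi, yi, wi in zip(x, y, w):
        mu = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        s0 += wi * (yi - mu)
        s1 += wi * (yi - mu) * xi
    return s0, s1


def fit_weighted_logit(x, y, w, iters=25):
    """Newton-Raphson on the weighted score equations."""
    b0 = b1 = 0.0
    for _ in range(iters):
        s0, s1 = weighted_score(x, y, w, b0, b1)
        # Weighted information matrix X^T diag(w_i mu_i (1 - mu_i)) X.
        h00 = h01 = h11 = 0.0
        for xi, wi in zip(x, w):
            mu = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            v = wi * mu * (1.0 - mu)
            h00 += v
            h01 += v * xi
            h11 += v * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} s, with the 2x2 inverse written out.
        b0 += (h11 * s0 - h01 * s1) / det
        b1 += (h00 * s1 - h01 * s0) / det
    return b0, b1


# Toy data (illustrative only); the outcomes overlap, so the MLE is finite.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.0, 0.0, 1.0, 0.0, 1.0, 1.0]
w = [2.0, 1.0, 3.0, 1.0, 2.0, 1.0]

b_weighted = fit_weighted_logit(x, y, w)

# Integer weights behave like frequencies: repeating row i w_i times
# and fitting with unit weights solves the same equations.
x_rep = [xi for xi, wi in zip(x, w) for _ in range(int(wi))]
y_rep = [yi for yi, wi in zip(y, w) for _ in range(int(wi))]
b_replicated = fit_weighted_logit(x_rep, y_rep, [1.0] * len(x_rep))
```

The two fits agree on the point estimates only. The code says nothing about standard errors, which for estimated probability weights generally need separate treatment.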