I have a large aggregate market data set on wine sales in the U.S., and I would like to estimate the demand for certain high-quality wines. The market shares are derived from a random utility model of the form

$$U_{ijt} = X'_{jt}\beta - \alpha p_{jt} + \xi_{jt} + \epsilon_{ijt} \equiv \delta_{jt} + \epsilon_{ijt}$$

where $X$ includes observed product characteristics, $p$ denotes product prices, $\xi$ are unobserved product characteristics that influence demand and are correlated with price, and $\epsilon$ is the error term; $i$ indexes individuals, $j$ indexes products, and $t$ indexes markets (cities in this case).

I can’t use the usual conditional logit model because of the unobserved quality term $\xi$, and I don’t have a good instrument. However, Berry (1994) developed a strategy for linearizing the nonlinear system of market equations in a multinomial logit framework, but I cannot figure out how he does the inversion step.

At the true parameter values, he says that the estimated market share should be equal to the “true” market share, $\widehat{s}_{jt}(X, \beta, \alpha, \xi) = S_{jt}$, and he then suggests inverting the market shares, going from

$$S_{jt} = \widehat{s}_{jt}(\delta, \alpha, \beta)$$

to

$$\delta = \widehat{s}^{-1}(S, \alpha, \beta)$$

This allows one to solve for $\xi$ and eliminate it. If someone could shed light on how this inversion step works, or maybe even implement it in Stata, that would be great. Many thanks.


#### Best Answer

Consider a multinomial logit model in which you estimate market shares as

$$\widehat{s}_{jt} = \frac{\exp(\delta_{jt})}{1 + \sum^{J}_{g=1}\exp(\delta_{gt})}$$

where the mean utility of the outside good is normalized to zero. When you take the log of this expression, you get

$$\log(\widehat{s}_{jt}) = \delta_{jt} - \log\left(1 + \sum^{J}_{g=1}\exp(\delta_{gt})\right)$$

for the inside goods, and for the outside good:

$$\log(\widehat{s}_{0t}) = 0 - \log\left(1 + \sum^{J}_{g=1}\exp(\delta_{gt})\right)$$

Subtracting the second expression from the first, the common log term cancels, and your $\delta_{jt}$ is given by

$$\delta_{jt} = \log(\widehat{s}_{jt}) - \log(\widehat{s}_{0t}) = X'_{jt}\beta - \alpha p_{jt} + \xi_{jt}$$

where the estimated shares $\widehat{s}_{jt}$ are replaced by the observed shares $S_{jt}$, assuming, as you stated, that with a large enough sample the two are equal. This equation is linear in the parameters, so it can be estimated via OLS with $\xi_{jt}$ as the error term. Note that markets are assumed to be independent of each other.
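As a quick numerical check of this inversion, take a hypothetical market with two inside goods and observed shares $S_{1t} = 0.5$, $S_{2t} = 0.3$ and $S_{0t} = 0.2$ (numbers invented purely for illustration). The inversion gives

$$\delta_{1t} = \log(0.5) - \log(0.2) = \log(2.5) \approx 0.916, \qquad \delta_{2t} = \log(0.3) - \log(0.2) = \log(1.5) \approx 0.405$$

and plugging these back into the logit share formula reproduces the observed share of good 1:

$$\widehat{s}_{1t} = \frac{\exp(\delta_{1t})}{1 + \exp(\delta_{1t}) + \exp(\delta_{2t})} = \frac{2.5}{1 + 2.5 + 1.5} = 0.5$$

The same calculation returns 0.3 for good 2 and 0.2 for the outside good, so the map between shares and mean utilities is one-to-one in this case.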

To clarify the concept, let’s consider an example in Stata. I don’t have a suitable data set in mind for such an exercise, so let us just assume we have aggregate data on

- 5 products (`prod`)
- product prices (`p`)
- quantity sold (`q`)
- two product characteristics (`x1`, `x2`)

Suppose good 1 is the outside good with a market share of 10-20% (varying by market) and the remainder being split between the other goods. What you would do in Stata is the following:

```stata
* calculate the market share of each good in every market
egen mktsales = total(q), by(mkt)
gen share = q/mktsales

* generate logs
gen ln_share = ln(share)

* subtract the log share of the outside good (prod 1) from the log share of each good
gen diffshare = .
forval i = 1(1)100 {
    qui sum ln_share if prod==1 & mkt==`i'
    replace diffshare = ln_share - r(max) if mkt==`i'
}

* run the regression on the inside goods only
* (diffshare is zero by construction for the outside good)
reg diffshare p x1 x2 if prod != 1
```
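To try the commands above without real data, one could first generate an artificial data set with the same variable names. The following is only a sketch: the number of markets (100, matching the loop above), the coefficient values, the market size of 1,000 and the helper variables `xi`, `delta` and `denom` are all assumptions made for illustration, not part of the original example.

```stata
* Simulated data for illustration only: every number below is made up
clear
set seed 12345
set obs 100                          // 100 markets
gen mkt = _n
expand 5                             // 5 products per market
bysort mkt: gen prod = _n            // prod 1 will be the outside good
gen x1 = rnormal()
gen x2 = rnormal()
gen p  = 1 + runiform()
gen xi = 0.5*rnormal()               // unobserved quality, independent of p here

* mean utility; the outside good is normalized to zero
gen delta = 1.8 + 0.5*x1 + 0.3*x2 - p + xi
replace delta = 0 if prod == 1

* logit market shares turned into quantities (market size 1,000 is arbitrary)
bysort mkt: egen denom = total(exp(delta))
gen q = 1000*exp(delta)/denom
```

Because `xi` is drawn independently of `p` in this sketch, there is no endogeneity problem by construction, which is the case in which plain OLS on the inverted shares is appropriate.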

And this gives you the Berry inversion or Berry logit for demand estimation. One thing to be cautious about: if the unobserved product characteristics $\xi_{jt}$ include factors that are correlated with price (like quality of the product or advertising campaigns) then you need to use instrumental variables regression. You can do this because we have linearized the market demand system, hence standard 2SLS is an option.
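The effect of price endogeneity, and the 2SLS remedy, can be seen in a small simulation. This is a sketch under invented assumptions: the instrument `z` is a hypothetical cost shifter, prices are built to rise with unobserved quality `xi`, and the true price coefficient is set to -1. None of these values come from real data.

```stata
* Simulated example with endogenous prices; every number below is made up
clear
set seed 2024
set obs 200                          // 200 markets
gen mkt = _n
expand 5                             // 5 goods per market, prod 1 = outside good
bysort mkt: gen prod = _n
gen x1 = rnormal()
gen x2 = rnormal()
gen z  = rnormal()                   // hypothetical cost shifter (excluded instrument)
gen xi = rnormal()                   // unobserved quality
gen p  = 2 + 0.5*z + 0.5*xi + 0.5*rnormal()   // price rises with unobserved quality

* mean utility with a true price coefficient of -1
gen delta = 1 + 0.5*x1 + 0.3*x2 - p + xi
replace delta = 0 if prod == 1

* logit shares and the Berry-inverted dependent variable
bysort mkt: egen denom = total(exp(delta))
gen share = exp(delta)/denom
bysort mkt: egen ln_share0 = max(cond(prod == 1, ln(share), .))
gen diffshare = ln(share) - ln_share0

* OLS: the price coefficient is biased toward zero because cov(p, xi) > 0
reg diffshare p x1 x2 if prod != 1

* 2SLS with z as the excluded instrument for price
ivregress 2sls diffshare x1 x2 (p = z) if prod != 1, vce(robust)
estat firststage
```

With these made-up parameters the OLS price coefficient should settle near -0.33 rather than -1 (the bias term is cov(p, xi)/var(p) = 0.5/0.75), while the 2SLS estimate should be close to -1. `estat firststage` reports the first-stage statistics used to judge instrument strength.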

In this case you need something that exogenously shifts the price but does not affect demand. Common instruments in the empirical industrial organization literature are cost shifters (see Berry et al., 1995), for instance the price of fish is affected by rough weather at sea but consumer demand is not; product characteristics of rival firms, under the assumption that consumer valuation of good $j$ does not depend on other products’ characteristics (see Nevo, 2001); or, if the data have a spatial dimension, prices of the same brand in other cities: Hausman (1997) uses price changes of a brand in city A to instrument prices in city B. This works provided that products of a brand in both cities share common marginal costs but not the same demand.
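The last two types of instrument can be built directly from the data. The sketch below uses a tiny invented panel of inside goods only; the variable names `city`, `period`, `iv_blp` and `iv_hausman` are hypothetical, and it assumes each product is sold by a different firm, so that every other product in the market counts as a rival.

```stata
* Tiny made-up panel: 3 cities, 2 periods, 3 inside goods (all values invented)
clear
set seed 1
set obs 18
egen city   = seq(), from(1) to(3) block(6)
egen period = seq(), from(1) to(2) block(3)
egen prod   = seq(), from(1) to(3)
gen x1 = rnormal()
gen p  = 1 + runiform()

* rival-characteristics instrument: sum of x1 over the other products
* sold in the same city and period
bysort city period: egen x1_tot = total(x1)
gen iv_blp = x1_tot - x1

* Hausman-type instrument: mean price of the same product in the
* other cities during the same period
bysort prod period: egen p_tot = total(p)
bysort prod period: egen p_n   = count(p)
gen iv_hausman = (p_tot - p)/(p_n - 1)
```

Either variable could then be placed inside the parentheses of `ivregress 2sls`, for example `(p = iv_blp)`. Whether the exclusion restriction is credible is a separate question that the construction itself cannot answer.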

As an alternative, Berry et al. (1995) develop a random coefficients logit model which gives more accurate own and cross-price elasticities and more flexible substitution patterns between goods.
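The restriction that motivates this extension shows up in the price elasticities implied by the plain logit shares. As a short derivation under the utility specification used here (the symbol $\eta$ is introduced only for this illustration), differentiating $\widehat{s}_{jt}$ with respect to prices gives

$$\eta_{jjt} = \frac{\partial \widehat{s}_{jt}}{\partial p_{jt}}\frac{p_{jt}}{\widehat{s}_{jt}} = -\alpha p_{jt}\,(1 - \widehat{s}_{jt}), \qquad \eta_{jkt} = \frac{\partial \widehat{s}_{jt}}{\partial p_{kt}}\frac{p_{kt}}{\widehat{s}_{jt}} = \alpha p_{kt}\,\widehat{s}_{kt} \quad (k \neq j)$$

The cross-price elasticity with respect to $p_{kt}$ is the same for every good $j$, regardless of how similar $j$ and $k$ are, so substitution patterns are driven by market shares alone. Random coefficients relax exactly this feature.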

**References:**

- Berry, S., J. Levinsohn & A. Pakes (1995), “Automobile prices in market equilibrium”, Econometrica, 63, 4, 841-90
- Hausman, J. (1997), “Valuation of new goods under perfect and imperfect competition”, in Bresnahan and Gordon (eds.), The Economics of New Goods, NBER Studies in Income and Wealth 58, 209-237
- Nevo, A. (2001), “Measuring market power in the ready-to-eat cereal industry”, Econometrica, 69, 2, 307-42