# Solved – Formulating quantile regression as Linear Programming problem

How do I formulate quantile regression as a Linear Programming problem?
When looking at the median quantile problem I know it is

begin{align} text{minimize } & sum_{i=1}^n |beta_0 + X_i beta_1-Y_i|\ text{transforms into } & \ text{minimize } & sum_{i=1}^n e_i\ text{s.t.} & \ & e_igeq beta_0 + X_ibeta_{1}-Y_i\ & e_igeq -(beta_0 + X_ibeta_{1}-Y_i) end{align}
but how do I transform the minimization for other quantiles?

Contents

You use the quantile regression estimator

$$hat beta(tau) := arg min_{theta in mathbb R^K} sum_{i=1}^N rho_tau(y_i – mathbf x_i^top theta).$$

where $$tau in (0,1)$$ is constant chosen according to which quantile needs to be estimated and the function $$rho_tau(.)$$ is defined as

$$rho_tau(r) = r(tau – I(r<0)).$$

In order to see the purpose of the $$rho_tau(.)$$ consider first that it takes the residuals as arguments, when these are defined as $$epsilon_i =y_i – mathbf x_i^top theta$$. The sum in the minimization problem can therefore be rewritten as

$$sum_{i=1}^N rho_tau(epsilon_i) =sum_{i=1}^N tau lvert epsilon_i lvert I[epsilon_i geq 0] + (1-tau) lvert epsilon_i lvert I[epsilon_i < 0]$$

such that positive residuals associated with observation $$y_i$$ above the suggested quantile regression line $$mathbf x_i^top theta$$ are given the weight of $$tau$$ while negative residuals associated with observations $$y_i$$ below the suggested quantile regression line $$mathbf x_i^top theta$$ are weighted with $$(1-tau)$$.

Intuitively:

With $$tau=0.5$$ positive and negative residuals are "punished" with the same weight and an equal number of observation are above and below the "line" in optimum so the line $$mathbf x_i^top hat beta$$ is the median regression "line".

When $$tau=0.9$$ each positive residual is weighted 9 times that of a negative residual with weight $$1-tau= 0.1$$ and so in optimum for every observation above the "line" $$mathbf x_i^top hat beta$$ approximately 9 will be placed below the line. Hence the "line" represent the 0.9-quantile. (For an exact statement of this see THM. 2.2 and Corollary 2.1 in Koenker (2005) "Quantile Regression")

The two cases are illustrated in these plots. Left panel $$tau=0.5$$ and right panel $$tau=0.9$$. Linear programs are predominantly analyzed and solved using the standard form

$$(1) min_z c^top z mbox{subject to } A z = b , z geq 0$$

To arrive at a linear program on standard form the first problem is that in such a program (1) all variables over which minimization is performed $$z$$ should be positive. To achieve this residuals are decomposed into positive and negative part using slack variables:

$$epsilon_i = u_i – v_i$$

where positive part $$u_i = max(0,epsilon_i) = lvert epsilon_i lvert I[epsilon_i geq 0]$$ and $$v_i = max(0,-epsilon_i) =lvert epsilon_i lvert I[epsilon_i < 0]$$ is the negative part. The sum of residuals assigned weights by the check function is then seen to be

$$sum_{i=1}^N rho_tau(epsilon_i) = sum_{i=1}^N tau u_i + (1-tau) v_i = tau mathbf 1_N^top u + (1-tau)mathbf 1_N^top v,$$

where $$u = (u_1,…,u_N)^top$$ and $$v=(v_1,…,v_N)^top$$ and $$mathbf 1_N$$ is vector $$N times 1$$ all coordinates equal to $$1$$.

The residuals must satisfy the $$N$$ constraints that

$$y_i – mathbf x_i^toptheta = epsilon_i = u_i – v_i$$

This results in the formulation as a linear program

$$min_{theta in mathbb R^K,uin mathbb R_+^N,vin mathbb R_+^N}{ tau mathbf 1_N^top u + (1-tau)mathbf 1_N^top vlvert y_i= mathbf x_itheta + u_i – v_i, i=1,…,N},$$

as stated in Koenker (2005) "Quantile Regression" page 10 equation (1.20).

However it is noticeable that $$thetain mathbb R$$ is still not restricted to be positive as required in the linear program on standard form (1). Hence again decomposition into positive and negative part is used

$$theta = theta^+ – theta^-$$

where again $$theta^+=max(0,theta)$$ is positive part and $$theta^- = max(0,-theta)$$ is negative part. The $$N$$ constraints can then be written as

$$mathbf y:= begin{bmatrix} y_1 \ vdots \ y_Nend{bmatrix} = begin{bmatrix} mathbf x_1^top \ vdots \ mathbf x_N^top end{bmatrix}(theta^+ – theta^-) + mathbf I_Nu – mathbf I_Nv ,$$

where $$mathbf I_N = diag{mathbf 1_N}$$.

Next define $$b:=mathbf y$$ and the design matrix $$mathbf X$$ storing data on independent variables as

$$mathbf X := begin{bmatrix} mathbf x_1^top \ vdots \ mathbf x_N^top end{bmatrix}$$

To rewrite constraint:

$$b= mathbf X(theta^+ – theta^-) + mathbf I_N u- mathbf I_N v= [mathbf X , -mathbf X , mathbf I_N ,mathbf – mathbf I_N] begin{bmatrix} theta^+ \ theta^- \ u \ vend{bmatrix}$$

Define the $$(N times 2K + 2N )$$ matrix

$$A := [mathbf X , -mathbf X , mathbf I_N ,mathbf – mathbf I_N]$$ and introduce $$theta^+$$ and $$theta^-$$ as variables over which to minimize so they are part of $$z$$ to get

$$b = A begin{bmatrix} theta^+ \ theta^- \ u \ vend{bmatrix} = Az$$

Because $$theta^+$$ and $$theta^-$$ only affect the minimization problem through the constraint a $$mathbf 0$$ of dimension $$2Ktimes 1$$ must be introduced as part of the coeffient vector $$c$$ which can the appropriately be defined as

$$c = begin{bmatrix}mathbf 0 \ tau mathbf 1_N \ (1-tau) mathbf 1_N end{bmatrix},$$

thus ensuring that $$c^top z = underbrace{mathbf 0^top(theta^+ – theta^-)}_{=0}+tau mathbf 1_N^top u + (1-tau)mathbf 1_N^top v = sum_{i=1}^N rho_tau(epsilon_i).$$

Hence $$c,A$$ and $$b$$ are then defined and the program as given in $$(1)$$ completely specified.

This is probably best digested using an example. To solve this in R use the package quantreg by Roger Koenker. Here is also illustration of how to set up the linear program and solve with a solver for linear programs:

``base=read.table("http://freakonometrics.free.fr/rent98_00.txt",header=TRUE) attach(base) library(quantreg) library(lpSolve) tau <- 0.3   # Problem (1) only one covariate X <- cbind(1,base\$area) K <- ncol(X) N <- nrow(X)  A <- cbind(X,-X,diag(N),-diag(N)) c <- c(rep(0,2*ncol(X)),tau*rep(1,N),(1-tau)*rep(1,N)) b <- base\$rent_euro const_type <- rep("=",N)  linprog <- lp("min",c,A,const_type,b) beta <- linprog$$sol[1:K] - linprog$$sol[(1:K+K)] beta rq(rent_euro~area, tau=tau, data=base)   # Problem (2) with 2 covariates X <- cbind(1,base$$area,base$$yearc) K <- ncol(X) N <- nrow(X)  A <- cbind(X,-X,diag(N),-diag(N)) c <- c(rep(0,2*ncol(X)),tau*rep(1,N),(1-tau)*rep(1,N)) b <- base\$rent_euro const_type <- rep("=",N)  linprog <- lp("min",c,A,const_type,b) beta <- linprog$$sol[1:K] - linprog$$sol[(1:K+K)] beta rq(rent_euro~ area + yearc, tau=tau, data=base) ``

Rate this post