Please give me an idea of how to efficiently recode a categorical variable (factor) into a set of orthogonal polynomial contrast variables.

For many types of contrast variables (e.g. deviation, simple, Helmert, etc.) the path is:

- Compose the contrast coefficient matrix corresponding to the type.
- Invert it (or take its generalized inverse) to obtain the matrix of codes.

For example:

Suppose there is a 3-group factor and we want to recode it into a set of deviation contrast variables, with the last group treated as the reference. Then the contrast coefficient matrix `L` is

```
        Group1  Group2  Group3
var1       2/3    -1/3    -1/3
var2      -1/3     2/3    -1/3
```

and `ginv(L)` is then the sought-for coding matrix

```
        var1  var2
Group1     1     0
Group2     0     1
Group3    -1    -1
```

(We might also use `inv(L)` instead if we add a row for the constant, equal to 1/3, at the head of `L`.)
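As a concrete check of this recipe, here is a minimal NumPy sketch (Python used purely for illustration); the generalized inverse of the deviation-contrast matrix above reproduces the coding matrix:

```python
import numpy as np

# Contrast coefficient matrix L for a 3-group factor, deviation coding,
# last group as reference (the example above):
L = np.array([[ 2/3, -1/3, -1/3],
              [-1/3,  2/3, -1/3]])

G = np.linalg.pinv(L)   # generalized inverse: the coding matrix
```

`G` comes out as rows `(1, 0)`, `(0, 1)`, `(-1, -1)` for Groups 1–3, matching `ginv(L)` above.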

Is there the same or a similar way to get polynomial contrast variables? If yes, what would the matrix C look like, and how would one compose it? If no, what still *is* the way to compute the polynomial contrast variables efficiently (e.g. by matrix algebra)?
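For what it's worth, one matrix-algebra route to polynomial contrasts for equally spaced scores (and, as far as I can tell, close to what R's `contr.poly()` produces) is to QR-orthonormalize the Vandermonde matrix of the group scores and drop the constant column. A hedged NumPy sketch; the function name `poly_contrasts` is my own:

```python
import numpy as np

def poly_contrasts(k):
    # Hypothetical helper: orthogonal polynomial contrasts for k
    # equally spaced groups, via QR of the Vandermonde matrix.
    scores = np.arange(1, k + 1)
    X = np.vander(scores, k, increasing=True)  # columns: 1, x, x^2, ...
    Q, R = np.linalg.qr(X)
    Q = Q * np.sign(np.diag(R))  # fix signs so that diag(R) > 0
    return Q[:, 1:]              # drop the constant column
```

For `k = 3` this yields the familiar linear contrast `(-0.707, 0, 0.707)` and quadratic contrast `(0.408, -0.816, 0.408)`.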


#### Best Answer

As a segue to my prior post on this topic, I want to share some tentative (albeit incomplete) exploration of the linear algebra behind these computations and the related R functions. This is a work in progress.

Part of the opaqueness of the functions has to do with the "compact" form of the Householder $\mathbf{QR}$ decomposition. The idea behind the Householder decomposition is to reflect vectors across a hyperplane determined by a unit vector $\mathbf u$, as in the diagram below, picking this plane in a purposeful way so as to project every column vector of the original matrix $\bf A$ onto the $\bf e_1$ standard unit vector. The normalized ($\lVert \mathbf u \rVert_2 = 1$) vector $\mathbf u$ can be used to compute the different Householder transformations $\left(\mathbf I - 2\,\mathbf u\mathbf u^\top\right)\mathbf x$.

The Householder vector $\mathbf v$ can be expressed as

$\mathbf v = \operatorname{sign}(x_1)\times \lVert \mathbf x \rVert \begin{bmatrix}1\\0\\0\\\vdots\\0\end{bmatrix}+\begin{bmatrix}x_1\\x_2\\x_3\\\vdots\\x_m\end{bmatrix}$

The vector $\bf v$ represents the difference between the column vectors $\bf x$ of the matrix $\bf A$ that we want to decompose and the vectors $\bf y$ corresponding to their reflection across the subspace or "mirror" determined by $\bf u$.
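A quick numeric check of this construction (NumPy used purely for illustration; the vector `x` is an arbitrary example): building $\mathbf v$ as above and normalizing it to $\mathbf u$, the reflection annihilates every entry of $\mathbf x$ below the first.

```python
import numpy as np

# An arbitrary example vector x (any nonzero first entry will do):
x = np.array([3.0, 1.0, 2.0])
e1 = np.array([1.0, 0.0, 0.0])

# v = sign(x_1) * ||x|| * e1 + x, then normalize to the unit vector u
v = np.sign(x[0]) * np.linalg.norm(x) * e1 + x
u = v / np.linalg.norm(v)

# The Householder reflection I - 2 u u^T maps x onto a multiple of e1
H = np.eye(3) - 2.0 * np.outer(u, u)
Hx = H @ x
```

Here `Hx` comes out as $-\operatorname{sign}(x_1)\,\lVert\mathbf x\rVert\,\mathbf e_1$: the norm is preserved and all entries below the first vanish.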

The method used by LAPACK obviates the need to store the first entry of each Householder reflector by turning it into a $1$. Instead of normalizing the vector $\bf v$ to $\bf u$ with $\lVert \mathbf u\rVert= 1$, it is just the first entry that is converted to a $1$; yet these new vectors, call them $\bf w$, can still be used as direction vectors.

The beauty of the method is that, since $\bf R$ in a $\bf{QR}$ decomposition is upper triangular, we can take advantage of the $0$ elements of $\bf R$ below the diagonal and fill them in with these $\bf w$ reflectors. Conveniently, the leading entries of these vectors all equal $1$, which settles the contention over the diagonal of the matrix: since they are known to be $1$, they don't need to be stored, and the diagonal can be yielded to the entries of $\bf R$.

The "compact QR" matrix in the output of `qr()$qr` can be understood, roughly, as the sum of the $\mathbf R$ matrix and the strictly lower triangular "storage" matrix for the "modified" reflectors.

The Householder projection will still have the form $\left(\mathbf I - 2\,\mathbf u\mathbf u^\top\right)\mathbf x$, but we won't be working with $\bf u$ ($\lVert \mathbf u \rVert=1$); rather, we work with a vector $\bf w$ of which only the first entry is guaranteed to be $1$, and

$\left(\mathbf I - 2\,\mathbf u\mathbf u^\top\right)\mathbf x=\left(\mathbf I - 2\,\frac{\mathbf w}{\lVert \mathbf w \rVert}\, \frac{\mathbf w^\top}{\lVert \mathbf w \rVert}\right)\mathbf x=\left(\mathbf I - 2\,\frac{\mathbf w\,\mathbf w^\top}{\lVert \mathbf w \rVert^2}\right)\mathbf x\tag{1}$

One would assume that it would be just fine to store these $\bf w$ reflectors below the diagonal of $\bf R$, excluding their first entry of $1$, and call it a day. However, things are never so easy. Instead, what is stored below the diagonal in `qr()$qr` is a combination of $\bf w$ and the coefficients of the Householder transformation as expressed in (1). Defining $\tau$ as

$\Large \tau = \frac{\mathbf w^\top \mathbf w}{2}= \frac{\lVert \mathbf w \rVert^2}{2},$

the reflectors can be expressed as $\text{reflectors}= \large \mathbf w/\tau$. These "reflector" vectors are the ones stored right under $\bf R$ in the so-called "compact $\bf{QR}$".

Now we are one degree away from the $\bf w$ vectors, and the first entry is no longer $1$. Hence the output of `qr()` will need to include the key to restore them, since we insist on excluding the first entry of the "reflector" vectors to fit everything into `qr()$qr`. So are we seeing the $\tau$ values in the output? Well, no, that would be too predictable. Instead, in the output of `qr()$qraux` (where this key is stored) we find

$\rho=\frac{\sum \text{reflectors}^2}{2}= \frac{\mathbf w^\top\mathbf w}{\tau^2}\Big/\, 2=\frac{\tau}{\tau^2}=\frac{1}{\tau}.$
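These relations are easy to verify numerically (NumPy used purely for illustration; the vector `w` is made up):

```python
import numpy as np

# A made-up w vector whose first entry is 1, as the modified reflectors have:
w = np.array([1.0, -0.2, 0.7, 0.1])

tau = (w @ w) / 2.0                    # tau = w'w / 2 = ||w||^2 / 2
reflector = w / tau                    # stored below the diagonal of qr()$qr
rho = (reflector ** 2).sum() / 2.0     # stored in qr()$qraux
```

Since $\rho = 1/\tau$ and the excluded first entry of each reflector is $w_1/\tau = 1/\tau$, the `qraux` value is exactly the key needed to restore the full reflectors.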

So, framed in red below, we see the "reflectors" ($\mathbf w/\tau$), excluding their first entry.

All the code is here, but since this answer is about the intersection of coding and linear algebra, I will paste the output for ease:

```r
options(scipen=999)
set.seed(13)
(X = matrix(c(rnorm(16)), nrow=4, byrow=F))
           [,1]      [,2]       [,3]       [,4]
[1,]  0.5543269 1.1425261 -0.3653828 -1.3609845
[2,] -0.2802719 0.4155261  1.1051443 -1.8560272
[3,]  1.7751634 1.2295066 -1.0935940 -0.4398554
[4,]  0.1873201 0.2366797  0.4618709 -0.1939469
```

Now I wrote the function `House()` as follows:

```r
House = function(A){
  Q = diag(nrow(A))
  reflectors = matrix(0, nrow=nrow(A), ncol=ncol(A))
  for(r in 1:(nrow(A) - 1)){
    # We apply the Householder reflection to progressively shorter
    # column vectors of A, decreasing by 1 element at a time.
    x = A[r:nrow(A), r]
    # First entry of v: the norm-2 of x times 1, plus x[1].
    # The sign is chosen to avoid computational (cancellation) issues.
    first = (sign(x[1]) * sqrt(sum(x^2))) + x[1]
    # The rest of v is x unchanged, since e1 = [1, 0, 0, ..., 0].
    # The if statement handles the last column / row:
    v = if(length(x) > 1) c(first, x[2:length(x)]) else c(first)
    # Now we make the first entry unitary:
    w = v / first
    # Tau will be used in the Householder transform, so here it goes:
    t = as.numeric(t(w) %*% w) / 2
    # And the "reflectors" are stored as in the R qr()$qr function:
    reflectors[r:nrow(A), r] = w / t
    # The Householder transformation is:
    I = diag(length(r:nrow(A)))
    H.transf = I - 1/t * (w %*% t(w))
    H_i = diag(nrow(A))
    H_i[r:nrow(A), r:ncol(A)] = H.transf
    # And we apply the Householder reflection - we left-multiply A and Q:
    A = H_i %*% A
    Q = H_i %*% Q
  }
  DECOMPOSITION = list(
    "Q" = t(Q),
    "R" = round(A, 7),
    "compact Q as in qr()$qr" =
      (A * upper.tri(A, diag=TRUE)) + (reflectors * lower.tri(reflectors, diag=FALSE)),
    "reflectors" = reflectors,
    "rho" = c(apply(reflectors[, 1:(ncol(reflectors) - 1)], 2,
                    function(x) sum(x^2) / 2),
              A[nrow(A), ncol(A)]))
  return(DECOMPOSITION)
}
```

Let's compare the output to the R built-in functions. First the home-made function:

```r
(H = House(X))
$Q
            [,1]        [,2]       [,3]       [,4]
[1,] -0.29329367 -0.73996967  0.5382474  0.2769719
[2,]  0.14829152 -0.65124800 -0.5656093 -0.4837063
[3,] -0.93923665  0.13835611 -0.1947321 -0.2465187
[4,] -0.09911084 -0.09580458 -0.5936794  0.7928072

$R
          [,1]       [,2]       [,3]      [,4]
[1,] -1.890006 -1.4517318  1.2524151 0.5562856
[2,]  0.000000 -0.9686105 -0.6449056 2.1735456
[3,]  0.000000  0.0000000 -0.8829916 0.5180361
[4,]  0.000000  0.0000000  0.0000000 0.4754876

$`compact Q as in qr()$qr`
            [,1]        [,2]       [,3]      [,4]
[1,] -1.89000649 -1.45173183  1.2524151 0.5562856
[2,] -0.14829152 -0.96861050 -0.6449056 2.1735456
[3,]  0.93923665 -0.67574886 -0.8829916 0.5180361
[4,]  0.09911084  0.03909742  0.6235799 0.4754876

$reflectors
            [,1]        [,2]      [,3] [,4]
[1,]  1.29329367  0.00000000 0.0000000    0
[2,] -0.14829152  1.73609434 0.0000000    0
[3,]  0.93923665 -0.67574886 1.7817597    0
[4,]  0.09911084  0.03909742 0.6235799    0

$rho
[1] 1.2932937 1.7360943 1.7817597 0.4754876
```

to the R functions:

```r
qr.Q(qr(X))
            [,1]        [,2]       [,3]       [,4]
[1,] -0.29329367 -0.73996967  0.5382474  0.2769719
[2,]  0.14829152 -0.65124800 -0.5656093 -0.4837063
[3,] -0.93923665  0.13835611 -0.1947321 -0.2465187
[4,] -0.09911084 -0.09580458 -0.5936794  0.7928072

qr.R(qr(X))
          [,1]       [,2]       [,3]      [,4]
[1,] -1.890006 -1.4517318  1.2524151 0.5562856
[2,]  0.000000 -0.9686105 -0.6449056 2.1735456
[3,]  0.000000  0.0000000 -0.8829916 0.5180361
[4,]  0.000000  0.0000000  0.0000000 0.4754876

$qr
            [,1]        [,2]       [,3]      [,4]
[1,] -1.89000649 -1.45173183  1.2524151 0.5562856
[2,] -0.14829152 -0.96861050 -0.6449056 2.1735456
[3,]  0.93923665 -0.67574886 -0.8829916 0.5180361
[4,]  0.09911084  0.03909742  0.6235799 0.4754876

$qraux
[1] 1.2932937 1.7360943 1.7817597 0.4754876
```
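As a final cross-check in another language, here is a hedged NumPy translation of the same algorithm (the function name `house_compact` is mine; it assumes a square matrix with nonzero leading column entries); it reproduces the compact storage and the $\rho$ values shown above for $X$:

```python
import numpy as np

def house_compact(A):
    # Sketch of the same Householder scheme, mimicking qr()$qr / qr()$qraux.
    A = np.array(A, dtype=float)
    m = A.shape[0]
    Q = np.eye(m)
    refl = np.zeros_like(A)
    for r in range(m - 1):
        x = A[r:, r]
        # leading entry of v: sign(x1) * ||x|| + x1 (sign avoids cancellation)
        first = np.sign(x[0]) * np.linalg.norm(x) + x[0]
        w = np.concatenate(([first], x[1:])) / first   # leading entry = 1
        tau = (w @ w) / 2.0
        refl[r:, r] = w / tau                          # scaled reflectors
        H = np.eye(m)
        H[r:, r:] -= np.outer(w, w) / tau              # I - (1/tau) w w^T
        A = H @ A
        Q = H @ Q
    compact = np.triu(A) + np.tril(refl, -1)           # R above, reflectors below
    rho = (refl ** 2).sum(axis=0) / 2.0                # as in qr()$qraux ...
    rho[-1] = A[-1, -1]                                # ... last entry is R[n, n]
    return Q.T, A, compact, rho
```

Feeding it the matrix $X$ from above gives back $X = \mathbf{QR}$ and the same $\rho$ vector as `qr()$qraux`.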

### Similar Posts:

- Solved – Computation of polynomial contrast variables
- Solved – Derivation of Group Lasso
- Solved – Why is “weight clipping” needed for Wasserstein GANs
- Solved – Is the Gaussian Kernel still a valid Kernel when taking the negative of the inner function