I have a large word-frequency matrix (~6m unique words X ~4k documents) and I'm trying to use truncated singular value decomposition (SVD) to project it onto a matrix with fewer dimensions. I know how to get to $[U_{k}, S_{k}, V_{k}']$, but I don't know what to do then.
Each tutorial I find gives me a different answer. Here, for instance, I'm told to retain only the first $k$ rows of $S_{k}$ and $V_{k}'$ and then multiply $S_{k}V_{k}'$. In the scikit-learn implementation, however, they do it by multiplying $U_{k}S_{k}'$ (formula and code). (The book chapter they mention as source doesn't say anything about that.) In this previous answer they suggest $X'U_{k}$ instead (though they are talking about PCA, so maybe that doesn't apply here). Other answers haven't helped (here, for instance, they say we need to apply truncated SVD to the reduced matrix – which doesn't sound right, we use truncated SVD precisely to get to the reduced matrix, no?).
So, bottom line: how do I go from $[U_{k}, S_{k}, V_{k}']$ to a reduced data matrix? Which of the above formulas is the correct one (and why)?
Just to be clear: I'm not looking to reduce rank only (that I can do by multiplying $U_{k}S_{k}V_{k}'$ but it gives me the exact same dimensions I had before, which doesn't help me); I want to reduce the actual dimensions of the data matrix.
Best Answer
Let's say you have 4×2 X matrix, run SVD and get U,S,V:
- U is 4×4
- S is 4×2, where only the top 2×2 is not empty
- V is 2×2
If you want to truncate to a single largest singular value:
- remove all columns from U leaving only the first one, now U1 is 4×1
- remove all columns from S leaving only the first one, now S1 is 1×1
- remove all columns from V leaving only the first one, now V1 is 2×1
You can restore the X matrix by U1*S1*V1'
The dimensionality is greatly reduced because now you can use 4×1 U matrix instead of original 4×2 X matrix.
UPDATE: you also have neat equality: X*V=U*S, so you could use both X*V1 or U1*S1 as your reduced dimension matrices, or simply U1, which is similar to the score in PCA
Similar Posts:
- Solved – How to use SVD for dimensionality reduction
- Solved – Matrix dimensions of the svd() output in R
- Solved – Clear description of PCA using SVD of covariance matrix
- Solved – Clear description of PCA using SVD of covariance matrix
- Solved – PCA on train and test datasets: should I run one PCA on train+test or two separate on train and on test?