Self-organizing maps (SOMs) are often described as an approach for dimensionality reduction. However, I am somewhat confused by this claim.
Consider the following example: I have a data set with 200 data points, each represented by a feature vector with 1000 dimensions. Suppose I train a map with a $1 \times 2$ grid; in other words, I map these 200 points to two cells. Just for illustration, suppose the first 100 points are mapped onto the first cell and the latter 100 points onto the second cell. From the viewpoint of dimensionality reduction, we can say we use a one-dimensional output space to represent the original 1000-dimensional space. But this confuses me: it looks as though the first 100 points all share the same feature value, namely 1 in the one-dimensional space, and the latter 100 points all share the value 2 in that same space. Is my understanding correct?
In your extreme example, I'd say your view is correct. You specified that you wanted a reduction to one dimension with two possible values in that dimension and that's what you got. As Wikipedia says, SOM creates a discretized low-dimensional representation.
Perhaps the issue is how SOM does this. Let's say you specified a 3×3 SOM, which is a 2-D grid with 9 points. The SOM algorithm embeds this 2-D grid in your 1000-D space, as Neil G points out. It then adapts the 9 points to the data in such a way as to maintain the manifold's smoothness in 1000-D space, while moving the grid points toward denser (in terms of your data) regions. (It does this by propagating changes to the closest SOM points on the 2-D grid, not to neighbors in the 1000-D space.)
So, each of the 9 points in your SOM grid has a 1000-D location in 1000-D space, but after the algorithm is finished, you are considering the 9 points in the 3×3 grid itself, reducing 1000-D space to (discretized) 2-D space.
You could also look at this as a kind of clustering of your data around 9 points that are constrained in relationship to each other.
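To make the above concrete, here is a minimal SOM sketch in plain NumPy (an illustration written for this answer, not any particular library's implementation; the grid size, learning-rate and neighborhood schedules are arbitrary choices). Note how the best-matching unit is found by distance in the 1000-D input space, but the neighborhood function is computed from distances on the 2-D grid, and how the final "reduced" representation of each point is just its winning node's 2-D grid coordinate:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 points in a 1000-D input space.
data = rng.normal(size=(200, 1000))

# 3x3 SOM: each of the 9 nodes has a 2-D grid position and a 1000-D weight vector.
grid = np.array([(i, j) for i in range(3) for j in range(3)], dtype=float)
weights = rng.normal(size=(9, 1000))

n_iter = 500
for t in range(n_iter):
    lr = 0.5 * (1 - t / n_iter)            # decaying learning rate
    sigma = 1.5 * (1 - t / n_iter) + 0.1   # decaying neighborhood radius

    x = data[rng.integers(len(data))]
    # Best-matching unit: nearest node measured in the 1000-D input space.
    bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
    # Neighborhood strength is measured on the 2-D grid, not in 1000-D space.
    d_grid = np.linalg.norm(grid - grid[bmu], axis=1)
    h = np.exp(-d_grid**2 / (2 * sigma**2))
    # Pull each node toward x, weighted by its grid proximity to the BMU.
    weights += lr * h[:, None] * (x - weights)

# Dimensionality reduction: map each point to its BMU's 2-D grid coordinate.
dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
coords = grid[np.argmin(dists, axis=1)]
print(coords.shape)  # (200, 2): 1000-D points reduced to a discretized 2-D space
```

Each data point ends up assigned to one of only 9 possible coordinates, which is exactly the "discretized low-dimensional representation" described above, and also why the view of SOM as constrained clustering is natural.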