I have looked at several unsupervised feature learning algorithms. Most of them (e.g. restricted Boltzmann machines and sparse auto-encoders) have very long training times, even on small datasets like MNIST. Are there similar algorithms that can be trained in less time?
Two candidates that seem promising are sparse filtering and reconstruction ICA (RICA). Are there others?
Another problem is that most of these algorithms require batch training with optimizers like L-BFGS. In an online setting it would be hard to train them without caching a batch of instances first. Are there alternatives? Is there anything in the literature? I could not find anything.
K-means is pretty fast, as is PCA. If you use a sparse SVD library (like the irlba package for R) you can approximate PCA pretty quickly on large datasets.
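As a rough Python analogue of the irlba approach, scikit-learn's `TruncatedSVD` with the randomized solver computes only the top few components instead of a full decomposition; a minimal sketch (the data and component count here are just illustrative):

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

# Toy data: 1000 samples, 500 features (illustrative only)
rng = np.random.RandomState(0)
X = rng.rand(1000, 500)

# Center the data so that truncated SVD approximates PCA
X_centered = X - X.mean(axis=0)

# The randomized solver computes only the top k singular vectors,
# avoiding a full eigendecomposition of the covariance matrix
svd = TruncatedSVD(n_components=20, algorithm="randomized", random_state=0)
features = svd.fit_transform(X_centered)
print(features.shape)  # (1000, 20)
```

The speed-up comes from asking for only `n_components` directions, which is exactly the regime where randomized/implicitly-restarted methods like irlba shine on large matrices.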
I think there are some pretty fast algorithms for online (also known as sequential) k-means.
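One readily available variant is scikit-learn's `MiniBatchKMeans`, whose `partial_fit` lets you update the centroids one small chunk at a time, so no full batch ever needs to be cached; a minimal sketch with synthetic data standing in for a stream:

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.RandomState(0)
km = MiniBatchKMeans(n_clusters=5, random_state=0)

# Simulate a stream: each partial_fit call updates the centroids
# using only the current chunk of instances
for _ in range(100):
    chunk = rng.rand(32, 10)  # 32 instances, 10 features per chunk
    km.partial_fit(chunk)

# The fitted model can assign clusters to new points as they arrive
labels = km.predict(rng.rand(4, 10))
print(labels.shape)  # (4,)
```

This is the mini-batch variant rather than strict one-point-at-a-time sequential k-means, but chunk size 1 reduces it to the latter.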