I'm trying to replicate the results of this paper using Theano. The problem at the moment is that all the Theano tutorials I can find cover only MNIST classifiers, which isn't much use for unsupervised image retrieval.
I have the following idea for how to approach the implementation:
First, I have to train a stack of interconnected RBMs. When the RBMs are sufficiently trained, I will take the weight matrices and bias vectors from those RBMs and use them to construct a deep autoencoder. This autoencoder will then be trained in the traditional way, using back-propagation.
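To make the unrolling step concrete, here is a rough NumPy sketch (not Theano) of how trained RBM parameters could be turned into an autoencoder. The function names `unroll` and `forward`, the `(W, b_hid, b_vis)` layout, and the use of sigmoid units in every layer are assumptions made for illustration, not something taken from the paper. The decoder simply reuses the transposed encoder weights and the visible biases as its starting point.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def unroll(rbm_params):
    """Build encoder/decoder layers from a stack of trained RBMs.

    rbm_params: list of (W, b_hid, b_vis) tuples, bottom RBM first,
    where W has shape (n_visible, n_hidden). This layout is an assumption.
    """
    # Encoder: each RBM's weights and hidden biases, bottom to top.
    encoder = [(W.copy(), b_hid.copy()) for W, b_hid, _ in rbm_params]
    # Decoder: transposed weights and visible biases, top to bottom.
    decoder = [(W.T.copy(), b_vis.copy()) for W, _, b_vis in reversed(rbm_params)]
    return encoder, decoder

def forward(x, encoder, decoder):
    """Return (code, reconstruction) for a batch x of shape (n_samples, n_visible)."""
    h = x
    for W, b in encoder:
        h = sigmoid(h @ W + b)
    code = h
    for W, b in decoder:
        h = sigmoid(h @ W + b)
    return code, h
```

The copies matter: after unrolling, back-propagation would update the encoder and decoder weights separately, so they should not share memory with each other or with the RBMs.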
Is my train of thought correct, or did I miss something essential?
Basically, you want to use a layer-wise approach to train your deep autoencoder: train one layer at a time, and then fine-tune all the layers together at the end.
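A minimal sketch of the layer-wise part, in plain NumPy rather than Theano, might look like the following. It assumes binary RBMs trained with one step of contrastive divergence (CD-1) on the full batch. The names `train_rbm` and `pretrain_stack`, the learning rate, and the epoch count are all made up for illustration. The final fine-tuning pass with back-propagation is left out.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=10, lr=0.1, rng=None):
    """Train one binary RBM with full-batch CD-1. Returns (W, b_hid, b_vis)."""
    rng = np.random.default_rng(0) if rng is None else rng
    n_samples, n_visible = data.shape
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_hid = np.zeros(n_hidden)
    b_vis = np.zeros(n_visible)
    for _ in range(epochs):
        # Positive phase: hidden probabilities given the data, then a sample.
        p_h0 = sigmoid(data @ W + b_hid)
        h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
        # Negative phase: one Gibbs step back to the visibles and up again.
        p_v1 = sigmoid(h0 @ W.T + b_vis)
        p_h1 = sigmoid(p_v1 @ W + b_hid)
        # CD-1 update: data statistics minus reconstruction statistics.
        W += lr * (data.T @ p_h0 - p_v1.T @ p_h1) / n_samples
        b_hid += lr * (p_h0 - p_h1).mean(axis=0)
        b_vis += lr * (data - p_v1).mean(axis=0)
    return W, b_hid, b_vis

def pretrain_stack(data, layer_sizes, **kwargs):
    """Greedy layer-wise pretraining: each RBM is trained on the
    hidden activations of the one below it."""
    params = []
    layer_input = data
    for n_hidden in layer_sizes:
        W, b_hid, b_vis = train_rbm(layer_input, n_hidden, **kwargs)
        params.append((W, b_hid, b_vis))
        # Feed the hidden probabilities upward as the next layer's data.
        layer_input = sigmoid(layer_input @ W + b_hid)
    return params
```

For example, `pretrain_stack(X, [256, 64])` on an array `X` with values in [0, 1] would return the parameters of two RBMs. In practice you would use mini-batches and more epochs; this version only shows the order in which the layers are trained.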
You can follow this Stanford UFLDL tutorial, which is a great introduction to deep learning. It covers stacked autoencoders built from ordinary autoencoders rather than RBMs, but it should be enough to give you some ideas.
You can also use this GitHub repository as a reference: https://github.com/johnny5550822/Ho-UFLDL-tutorial