# Solved – the difference between stride and subsample in convolutional neural networks

Is there any difference between stride and subsample in convolutional neural networks?


## Remark

Any strided convolutional or subsampling layer downsamples its input, so referring to "subsampling" in the context of a convolutional layer usually just means using a stride greater than one. In Keras 1.2.2, for example, the `strides` argument of `Convolution2D` was called `subsample` (in Keras 2 the layer is `Conv2D` and the argument is simply `strides`).
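To make the downsampling effect concrete, here is a minimal single-channel "valid" convolution in NumPy (an illustrative sketch, not any particular library's API — `conv2d_single` is a made-up name). The stride alone shrinks the output:

```python
import numpy as np

def conv2d_single(x, w, stride=1):
    """Valid 2-D cross-correlation of a single-channel image x with kernel w."""
    kH, kW = w.shape
    oH = (x.shape[0] - kH) // stride + 1
    oW = (x.shape[1] - kW) // stride + 1
    out = np.zeros((oH, oW))
    for i in range(oH):
        for j in range(oW):
            patch = x[i * stride:i * stride + kH, j * stride:j * stride + kW]
            out[i, j] = np.sum(patch * w)
    return out

x = np.arange(36.0).reshape(6, 6)
w = np.ones((3, 3))
print(conv2d_single(x, w, stride=1).shape)  # (4, 4)
print(conv2d_single(x, w, stride=2).shape)  # (2, 2) -- stride 2 halves each dimension
```

The same kernel, applied with stride 2 instead of 1, produces a 2×2 map instead of 4×4: the downsampling comes from the stride, not from the kernel.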

## On Subsampling, Mean Pooling, and Convolutional layers

If by "subsample" you mean a subsampling layer (such as Torch's `SpatialSubSampling`), then subsampling is a generalization of mean pooling with learnable weights. It may or may not use striding. Here is the output of `SpatialSubSampling`:

$$\text{output}[i][j][k] = \text{bias}[k] + \text{weight}[k] \sum_{s=1}^{kW} \sum_{t=1}^{kH} \text{input}[dW\cdot(i-1)+s][dH\cdot(j-1)+t][k]$$

The output has the same number of channels $k$ as the input.

And here is the output of `SpatialAveragePooling`:

$$\text{output}[i][j][k] = \frac{1}{kW\cdot kH} \sum_{s=1}^{kW} \sum_{t=1}^{kH} \text{input}[dW\cdot(i-1)+s][dH\cdot(j-1)+t][k]$$

The only difference between the two is that subsampling scales each channel by a learnable weight (and adds a learnable bias), whereas average pooling uses the fixed factor $1/(kW\cdot kH)$. In both cases, each input channel maps only to the same channel in the output.
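Under these formulas, average pooling is just subsampling with the weight frozen at $1/(kW\cdot kH)$ and the bias at zero. A minimal NumPy sketch of this (channels-last layout, square window, stride equal to the window size; the function name `subsample` is illustrative, not Torch's API):

```python
import numpy as np

def subsample(x, weight, bias, k, d):
    """Per-channel subsampling: sum each k x k window, scale the sum by
    weight[c], and add bias[c]. x is channels-last, shape (H, W, C)."""
    H, W, C = x.shape
    oH, oW = (H - k) // d + 1, (W - k) // d + 1
    out = np.empty((oH, oW, C))
    for i in range(oH):
        for j in range(oW):
            out[i, j] = bias + weight * x[i*d:i*d+k, j*d:j*d+k].sum(axis=(0, 1))
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4, 3))
k = 2

# Average pooling: weight frozen at 1/(k*k), bias at zero -- nothing learnable.
avg = subsample(x, np.full(3, 1.0 / (k * k)), np.zeros(3), k=k, d=k)

# Reference: direct mean over non-overlapping 2x2 windows.
ref = x.reshape(2, k, 2, k, 3).mean(axis=(1, 3))
print(np.allclose(avg, ref))  # True
```

With any other (learned) weight or bias, the same function computes a general subsampling layer.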

Compare this with the output of `SpatialConvolution`, where every input channel contributes to every output channel:

$$\text{output}[i][j][k] = \text{bias}[k] + \sum_l \sum_{s=1}^{kW} \sum_{t=1}^{kH} \text{weight}[s][t][l][k] \cdot \text{input}[dW\cdot(i-1)+s][dH\cdot(j-1)+t][l]$$

So subsampling is not, in general, equivalent to strided convolution: subsampling keeps channels separate, while convolution mixes them.
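Conversely, subsampling can be written as a strided convolution only in the special case where the kernel is constant over the window and diagonal across channels, i.e. $\text{weight}[s][t][l][k] = w_k\,\delta_{lk}$. A quick NumPy check of that special case (illustrative code, channels-last layout; `spatial_conv` is a made-up name, not Torch's API):

```python
import numpy as np

def spatial_conv(x, w, b, d):
    """Full strided convolution: x (H, W, Cin), w (kH, kW, Cin, Cout),
    bias b (Cout,), stride d, 'valid' padding."""
    H, W, Cin = x.shape
    kH, kW, _, Cout = w.shape
    oH, oW = (H - kH) // d + 1, (W - kW) // d + 1
    out = np.empty((oH, oW, Cout))
    for i in range(oH):
        for j in range(oW):
            patch = x[i * d:i * d + kH, j * d:j * d + kW]
            out[i, j] = b + np.tensordot(patch, w, axes=3)  # sum over kH, kW, Cin
    return out

rng = np.random.default_rng(1)
x = rng.standard_normal((6, 6, 2))
k, d = 2, 2
sub_w = np.array([0.5, -1.0])   # per-channel subsampling weight
sub_b = np.array([0.1, 0.2])    # per-channel subsampling bias

# Subsampling output computed directly from its formula.
oH = oW = (6 - k) // d + 1
sub = np.empty((oH, oW, 2))
for i in range(oH):
    for j in range(oW):
        sub[i, j] = sub_b + sub_w * x[i*d:i*d+k, j*d:j*d+k].sum(axis=(0, 1))

# The same result as a strided convolution whose kernel is spatially constant
# and channel-diagonal: w[s, t, l, c] = sub_w[c] if l == c else 0.
w = np.zeros((k, k, 2, 2))
for c in range(2):
    w[:, :, c, c] = sub_w[c]
print(np.allclose(sub, spatial_conv(x, w, sub_b, d)))  # True
```

An unconstrained convolution kernel has no reason to stay diagonal across channels, which is why the equivalence does not hold in general.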
