Inference Time in Neural Networks

Is inference time directly proportional to both the number of operations in a network and the number of parameters? Or is it directly proportional to the number of operations and inversely proportional to the number of parameters?

There's no reason time should be proportional to the number of parameters. For example, consider a fully connected one-layer network that computes $y = \sigma(w^\top x)$ for $x, w \in \mathbb{R}^n$, and a simple RNN that computes $y_0 = 0$, $y_i = \sigma(y_{i-1} + w x_i)$. Both involve roughly $n$ multiplications and additions, but the RNN has only a single parameter, while the fully connected network has $n$.
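To make this concrete, here is a minimal NumPy sketch of the two models, using $\tanh$ as a stand-in for $\sigma$ (the specific nonlinearity, the value $n = 1000$, and the weight initializations are illustrative assumptions, not from the original post):

```python
import numpy as np

n = 1000
rng = np.random.default_rng(0)
x = rng.standard_normal(n)

# Fully connected layer y = sigma(w^T x): n parameters, ~n multiply-adds.
w_fc = rng.standard_normal(n)
y_fc = np.tanh(w_fc @ x)

# Simple RNN y_i = sigma(y_{i-1} + w * x_i): one parameter, also ~n multiply-adds.
w_rnn = 0.5
y = 0.0
for xi in x:
    y = np.tanh(y + w_rnn * xi)

# Same order of operation count, very different parameter counts.
print("FC parameters:", w_fc.size, "| RNN parameters:", 1)
```

Both loops perform on the order of $n$ multiply-adds, yet the parameter counts differ by a factor of $n$.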

As for whether time is proportional to the number of operations, that depends on what you mean by "time". When computer scientists talk about time, they often mean a theoretical measure of how long a computation takes, and that measure is usually defined as the number of operations. In that sense it is tautological that run time is proportional to the number of operations.

On the other hand, if you care about runtime in real life, then there's no straightforward relationship between the number of operations and time. The fully connected network described above is easily parallelized, with the dot product effectively done in a single "cycle", whereas the RNN output must be computed sequentially, one step depending on the last, taking $n$ cycles.
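A rough sketch of the effect (timings are illustrative and machine-dependent; NumPy's vectorized dot product stands in for "parallel" hardware, and a Python loop stands in for the forced sequential dependency):

```python
import time
import numpy as np

n = 200_000
rng = np.random.default_rng(0)
x = rng.standard_normal(n)
w = rng.standard_normal(n)

# Fully connected: one vectorized dot product; all n multiply-adds
# can be done in parallel (here, inside optimized BLAS code).
t0 = time.perf_counter()
y_fc = np.tanh(w @ x)
t_fc = time.perf_counter() - t0

# RNN: each step needs the previous output, so the n steps run sequentially.
t0 = time.perf_counter()
y = 0.0
for xi in x:
    y = np.tanh(y + 0.5 * xi)
t_rnn = time.perf_counter() - t0

print(f"fully connected: {t_fc:.5f}s | sequential RNN: {t_rnn:.5f}s")
```

Both compute roughly $n$ multiply-adds, but the sequential dependency in the RNN prevents the kind of parallel execution the dot product enjoys.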
