Dropout, or dilution, is a technique for regularizing neural networks, developed by Hinton et al. and published in the paper “Improving neural networks by preventing co-adaptation of feature detectors”.
The core idea behind dropout is to randomly set the outputs of some units in a neural network to \(0\) during the training phase, which has the same effect as setting all of the outgoing weights of those units to \(0\).
Dropout adds a hyperparameter, the “keep probability”. This is the probability that a unit is left undisturbed, that is, that its output will not be set to \(0\).
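The masking step can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the names `dropout`, `activations`, `keep_prob`, and `rng` are hypothetical, and the sketch assumes the mask is applied independently to each value in a list.

```python
import random

def dropout(activations, keep_prob, rng):
    """Keep each value with probability keep_prob; otherwise set it to 0."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    # mask[i] is 1 with probability keep_prob and 0 otherwise.
    mask = [1 if rng.random() < keep_prob else 0 for _ in activations]
    # Multiplying by the mask zeroes the dropped entries.
    return [a * m for a, m in zip(activations, mask)], mask

rng = random.Random(0)  # seeded so the example is repeatable
hidden = [0.5, -1.2, 3.0, 0.7, 2.1, -0.4]  # made-up values for illustration
dropped, mask = dropout(hidden, keep_prob=0.5, rng=rng)
print(mask)
print(dropped)
```

A fresh mask is drawn on every call, so each training step would see a different pattern of zeros. With `keep_prob=1.0` the function returns its input unchanged.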
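Dropout can be considered analogous to model averaging: each training step uses a different randomly thinned version of the network, and because these versions share their weights, the process simulates training many similar neural networks on the same data.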
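The averaging view can be illustrated with a small experiment. The sketch below is an assumption-laden toy, not the method from the paper: it uses a single linear unit, hypothetical names (`masked_output`, `average_over_masks`), and made-up weights and inputs. It averages the unit's output over many random masks and compares the result with the unmasked output scaled by the keep probability. For a linear unit the two agree in expectation; for a nonlinear network the correspondence is only approximate.

```python
import random

def masked_output(weights, inputs, mask):
    """Output of one linear unit when some of its inputs are dropped."""
    return sum(w * x * m for w, x, m in zip(weights, inputs, mask))

def average_over_masks(weights, inputs, keep_prob, n_samples, rng):
    """Average the unit's output over n_samples randomly thinned copies."""
    total = 0.0
    for _ in range(n_samples):
        mask = [1 if rng.random() < keep_prob else 0 for _ in inputs]
        total += masked_output(weights, inputs, mask)
    return total / n_samples

rng = random.Random(0)  # seeded so the example is repeatable
weights = [0.2, -0.5, 1.0, 0.3]  # made-up values for illustration
inputs = [1.0, 2.0, 0.5, -1.0]
keep_prob = 0.8

sampled = average_over_masks(weights, inputs, keep_prob, 20000, rng)
scaled = keep_prob * sum(w * x for w, x in zip(weights, inputs))
print(sampled, scaled)  # the two values should be close
```

Each sampled mask plays the role of one "thinned" model, so the sampled average stands in for an ensemble prediction, while the scaled output is a single pass through the full unit.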