Question

What is The Lottery Ticket Hypothesis?

Answer 1

A randomly-initialized, dense neural network contains a subnetwork that is initialized such that - when trained in isolation - it can match the test accuracy of the original network after training for at most the same number of iterations.
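
Stated more formally (a sketch in the \(f(x; m \odot \theta)\) notation used in the other answers; the symbols \(j, a, j', a'\) are introduced here for illustration): let the dense network \(f(x; \theta)\) with initialization \(\theta_0\) reach its minimum validation loss at iteration \(j\) with test accuracy \(a\), and let the masked network \(f(x; m \odot \theta)\), initialized at \(m \odot \theta_0\) with the mask \(m\) held fixed, reach its minimum validation loss at iteration \(j'\) with test accuracy \(a'\). The hypothesis then predicts:

\[
\exists\, m \in \{0,1\}^{|\theta|} \quad \text{such that} \quad j' \le j, \qquad a' \ge a, \qquad \|m\|_0 \ll |\theta|
\]

That is, the subnetwork trains at least as fast, reaches at least the same accuracy, and uses far fewer parameters.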

Answer 2

When the parameters of the winning tickets are randomly reinitialized (\(f(x; m \odot \theta'_0)\)), these winning tickets no longer match the performance of the original network, offering evidence that these smaller networks do not train effectively unless they are appropriately initialized.
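
The difference between the winning ticket and its randomly reinitialized control can be sketched in a few lines of NumPy. This is only an illustration: the mask, the weight vector, and the choice of a zero-mean Gaussian for the fresh sample \(\theta'_0\) are assumptions made for the example, not details taken from the experiments.

```python
import numpy as np

def winning_ticket_init(mask, theta0):
    """Initialization of the winning ticket: m * theta_0 (original values kept)."""
    return mask * theta0

def random_reinit_control(mask, theta0, rng):
    """Control: same mask m, but fresh weights theta_0'.

    Assumption: theta_0' is drawn from a zero-mean Gaussian whose scale
    matches the empirical std of theta_0.
    """
    theta0_prime = rng.normal(scale=theta0.std(), size=theta0.shape)
    return mask * theta0_prime

rng = np.random.default_rng(0)
theta0 = rng.normal(scale=0.1, size=8)           # hypothetical original init
mask = np.array([1, 0, 1, 1, 0, 0, 1, 0.0])      # hypothetical pruning mask

ticket = winning_ticket_init(mask, theta0)
control = random_reinit_control(mask, theta0, rng)
```

Both vectors share the same sparsity pattern; only the values of the surviving weights differ, which is what isolates the effect of the initialization.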

Answer 3

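Iterative pruning with resetting: train the network, prune a fraction of the lowest-magnitude weights, reset the remaining parameters to their initial values, and repeat. The network is therefore retrained from its original initialization after every pruning step.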
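
The loop can be sketched as follows. This is a minimal toy, not the original experimental setup: the `train` function is a hypothetical stand-in (masked gradient descent on a least-squares problem), and the 20% per-round pruning fraction, the number of rounds, and the data are all assumptions chosen for the example.

```python
import numpy as np

def train(theta, mask, X, y, steps=200, lr=0.1):
    """Toy stand-in for training: gradient descent on least squares, mask fixed."""
    theta = theta * mask
    for _ in range(steps):
        grad = X.T @ (X @ theta - y) / len(y)
        theta = theta - lr * grad * mask  # pruned weights stay at zero
    return theta

def iterative_magnitude_pruning(theta0, X, y, rounds=5, prune_frac=0.2):
    """Return (mask, mask * theta0) after repeated train / prune / reset rounds."""
    mask = np.ones_like(theta0)
    for _ in range(rounds):
        # Reset: every round restarts from the ORIGINAL initialization theta0.
        trained = train(theta0, mask, X, y)
        alive = np.flatnonzero(mask)
        n_prune = int(prune_frac * alive.size)
        if n_prune == 0:
            break
        # Prune the surviving weights with the smallest trained magnitude.
        order = np.argsort(np.abs(trained[alive]))
        mask[alive[order[:n_prune]]] = 0.0
    return mask, mask * theta0

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 50))
true_w = np.zeros(50)
true_w[:5] = rng.normal(size=5)                  # only 5 features matter
y = X @ true_w
theta0 = rng.normal(scale=0.1, size=50)          # hypothetical initialization

mask, ticket_init = iterative_magnitude_pruning(theta0, X, y)
```

The key design point is that pruning decisions use the trained weights, while the surviving weights are always rewound to their values in `theta0` before the next round.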

Answer 4

When randomly reinitialized, a winning ticket learns more slowly and achieves lower test accuracy, suggesting that initialization is important to its success.

Experiments have shown that the winning ticket weights move further than other weights. This suggests that the benefit of the initialization is connected to the optimization algorithm, dataset, and model. For example, the winning ticket initialization might land in a region of the loss landscape that is particularly amenable to optimization by the chosen optimization algorithm.

Answer 5

The authors only considered vision-centric classification tasks on smaller datasets (MNIST, CIFAR-10). Iterative pruning is computationally intensive, requiring training a network 15 or more times consecutively for multiple trials. This pruning procedure is also the only method they used to find winning tickets.
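
To get a feel for the cost, the arithmetic can be sketched as below. The 20% per-round pruning fraction and the 5 trials are assumptions for illustration; the text above only states "15 or more" training runs over "multiple trials", and the count assumes roughly one full training run per pruning round.

```python
def remaining_fraction(rounds, prune_frac=0.2):
    """Fraction of weights left after `rounds` rounds of pruning `prune_frac` each."""
    return (1.0 - prune_frac) ** rounds

def total_training_runs(rounds, trials):
    """Approximate number of full training runs: one per round, per trial."""
    return rounds * trials

frac = remaining_fraction(15)          # about 0.035, i.e. roughly 3.5% of weights
runs = total_training_runs(15, 5)      # 75 full training runs
print(f"{frac:.1%} of weights remain after 15 rounds; {runs} training runs")
```

Under these assumptions, reaching a network with about 3.5% of its weights costs dozens of full training runs, which is why the method was limited to small datasets.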