Learning rate is a small positive number, typically between 0.0 and 1.0, that controls how much a neural network's weights change during training. The model needs to learn to minimize loss, which is the error between its predicted value for an input and the real label for that input. To do so, the model repeatedly adjusts its weights in the direction that reduces the loss, and the learning rate scales the size of each adjustment.
The formula for updating weights is as follows:

$$w_{new} = w_{old} - \eta \cdot \frac{\partial L}{\partial w}$$

Here, $w$ is a weight, $\eta$ is the learning rate, and $\frac{\partial L}{\partial w}$ is the gradient of the loss $L$ with respect to that weight.
As shown, the weights are updated toward their optimum values with every epoch, and the learning rate determines how fast the model learns and adapts. If one chooses a value that is too big, the model will train faster, but it may overshoot the minimum and arrive at weights that are not optimal; the loss may even oscillate or diverge. If the value is too small, training converges more reliably but can take many more epochs. A minimal sketch of this update rule follows.
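To make the update rule concrete, here is a minimal sketch of gradient descent on a one-parameter linear model. The toy data, variable names, and chosen learning rate are illustrative assumptions, not from the original article.

```python
# Minimal gradient descent sketch: fit y = w * x to toy data.
# The dataset and the learning rate value here are illustrative.

# Toy dataset: y is roughly 3 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 5.9, 9.2, 11.8]

w = 0.0      # initial weight
eta = 0.01   # learning rate (the hyperparameter discussed above)

for epoch in range(100):
    # Gradient of the mean squared error loss with respect to w:
    # L = mean((w * x - y)^2)  =>  dL/dw = mean(2 * (w * x - y) * x)
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    # The update rule from the formula above: w_new = w_old - eta * dL/dw
    w = w - eta * grad

print(f"learned w = {w:.3f}")  # approaches ~3, the slope of the toy data
```

With `eta = 0.01` the weight moves steadily toward the optimum; making it much larger on this toy problem causes the updates to overshoot and the loss to grow instead of shrink.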
Moreover, the learning rate is known as a hyperparameter because we set it ourselves for each model rather than learning it from the data. Generally, the symbol $\eta$ (eta) is used to represent the learning rate.
The optimal learning rate is usually found through trial and error, because it cannot be predicted analytically for a given model beforehand. It is advisable to start with a value such as 0.1 or 0.01. After trying different learning rates, one can analyze how fast or slow the model learns over the epochs, as sketched below.
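As one possible way to run that comparison, the sketch below reuses the toy problem from earlier and records the loss per epoch for several candidate learning rates. The candidate values and helper function are illustrative assumptions.

```python
# Compare how quickly the toy model from above learns under
# different learning rates; the candidate values are illustrative.

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 5.9, 9.2, 11.8]

def train(eta, epochs=50):
    """Return the mean squared error after each epoch for a given eta."""
    w, history = 0.0, []
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= eta * grad
        loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
        history.append(loss)
    return history

for eta in (0.1, 0.01, 0.001):
    losses = train(eta)
    print(f"eta={eta}: loss after 5 epochs={losses[4]:.4f}, "
          f"after 50 epochs={losses[-1]:.4f}")
```

Printing the loss at a few checkpoints makes the trade-off visible: the larger rate drops the loss quickly, while the smallest one is still far from converged after 50 epochs.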