In this Answer, we’ll see how we can backpropagate the loss in a neural network and update the parameters to make the model learn from the data. Consider the following neural network with a cross-entropy loss function:
Note that every layer in layers 2–5 has a weight matrix $W^{[l]}$, and each node in these layers has an associated bias $b^{[l]}$. The output of these layers is given by computing the sigmoid of the forward propagation equation:

$$a^{[l]} = \sigma\left(z^{[l]}\right) = \sigma\left(W^{[l]} a^{[l-1]} + b^{[l]}\right)$$

Where,

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

and $a^{[l-1]}$ is the output of the previous layer.
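To make the forward propagation concrete, here is a minimal NumPy sketch of one such layer; the array shapes and the random initialization are illustrative assumptions, not taken from the figure:

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic sigmoid: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative shapes: a layer with 4 inputs and 3 output nodes.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))       # weight matrix W^[l]
b = np.zeros((3, 1))              # one bias per node, b^[l]
a_prev = rng.normal(size=(4, 1))  # activations from layer l-1

z = W @ a_prev + b                # forward propagation equation
a = sigmoid(z)                    # layer output a^[l]
```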
The binary cross-entropy loss is given as follows:

$$L = -\frac{1}{m} \sum_{i=1}^{m} \left[ y_i \log \hat{y}_i + \left(1 - y_i\right) \log\left(1 - \hat{y}_i\right) \right]$$

Where, $y_i$ is the true label of the $i$-th example, $\hat{y}_i = a^{[5]}$ is the network's prediction for it, and $m$ is the number of training examples.
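A direct NumPy translation of this loss might look as follows; the epsilon clipping is an implementation detail added for numerical stability, not part of the formula above:

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy over m examples."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(y_pred)
                    + (1.0 - y_true) * np.log(1.0 - y_pred))

# Example: two predictions against their true labels.
y_true = np.array([1.0, 0.0])
y_pred = np.array([0.9, 0.2])
print(binary_cross_entropy(y_true, y_pred))  # ~0.164
```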
We need to perform the parameter updates using the following equation:

$$W^{[l]} := W^{[l]} - \alpha \frac{\partial L}{\partial W^{[l]}}$$

and,

$$b^{[l]} := b^{[l]} - \alpha \frac{\partial L}{\partial b^{[l]}}$$

Where, $\alpha$ is the learning rate.
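In code, each update is a single line per parameter. A sketch with placeholder gradients (the shapes and values below are purely illustrative):

```python
import numpy as np

learning_rate = 0.1  # alpha, an illustrative value

# Placeholder parameters and gradients for one layer (illustrative shapes).
W = np.ones((3, 4))
b = np.zeros((3, 1))
dW = np.full((3, 4), 0.05)   # stands in for dL/dW^[l]
db = np.full((3, 1), 0.01)   # stands in for dL/db^[l]

# Gradient descent step: each parameter moves against its gradient,
# scaled by the learning rate.
W -= learning_rate * dW      # W^[l] := W^[l] - alpha * dL/dW^[l]
b -= learning_rate * db      # b^[l] := b^[l] - alpha * dL/db^[l]
```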
Therefore, we will backpropagate each parameter’s share in the cost/loss and update it by the fraction determined by the learning rate.
Let's obtain the derivatives $\left(\frac{\partial L}{\partial W^{[5]}}, \frac{\partial L}{\partial b^{[5]}}\right)$ for layer five.

For the last layer, the chain rule gives:

$$\frac{\partial L}{\partial W^{[5]}} = \frac{\partial L}{\partial a^{[5]}} \cdot \frac{\partial a^{[5]}}{\partial z^{[5]}} \cdot \frac{\partial z^{[5]}}{\partial W^{[5]}}$$

where,

$$\frac{\partial L}{\partial a^{[5]}} = -\frac{y}{a^{[5]}} + \frac{1 - y}{1 - a^{[5]}}$$

and,

$$\frac{\partial z^{[5]}}{\partial W^{[5]}} = a^{[4]}$$

because,

$$\sigma'(z) = \sigma(z)\left(1 - \sigma(z)\right)$$

we have,

$$\frac{\partial a^{[5]}}{\partial z^{[5]}} = a^{[5]}\left(1 - a^{[5]}\right)$$

Therefore, we get the equation for computing $\frac{\partial L}{\partial W^{[5]}}$:

$$\frac{\partial L}{\partial W^{[5]}} = \left(a^{[5]} - y\right)\left(a^{[4]}\right)^T$$

Likewise for $\frac{\partial L}{\partial b^{[5]}}$, because:

$$\frac{\partial z^{[5]}}{\partial b^{[5]}} = 1$$

we get,

$$\frac{\partial L}{\partial b^{[5]}} = a^{[5]} - y$$

This multiple $\frac{\partial L}{\partial a^{[5]}} \cdot \frac{\partial a^{[5]}}{\partial z^{[5]}} = a^{[5]} - y$ is referred to as $\delta$ (delta), and we will refer to $\frac{\partial L}{\partial z^{[5]}}$ as $\delta^{[5]}$.
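Translating the last-layer result into NumPy, a sketch for a single training example (all shapes are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
a4 = rng.uniform(size=(3, 1))   # activations of layer 4
W5 = rng.normal(size=(1, 3))    # weights of the output layer
b5 = np.zeros((1, 1))
y = np.array([[1.0]])           # true label

a5 = sigmoid(W5 @ a4 + b5)      # network output

delta5 = a5 - y                 # delta^[5] = dL/dz^[5]
dW5 = delta5 @ a4.T             # dL/dW^[5] = delta^[5] (a^[4])^T
db5 = delta5                    # dL/db^[5] = delta^[5]
```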
Similarly, we can find $\frac{\partial L}{\partial W^{[4]}}$ by the equation below:

$$\frac{\partial L}{\partial W^{[4]}} = \frac{\partial L}{\partial z^{[5]}} \cdot \frac{\partial z^{[5]}}{\partial a^{[4]}} \cdot \frac{\partial a^{[4]}}{\partial z^{[4]}} \cdot \frac{\partial z^{[4]}}{\partial W^{[4]}}$$

where,

$$\frac{\partial z^{[5]}}{\partial a^{[4]}} = W^{[5]}$$

and,

$$\frac{\partial a^{[4]}}{\partial z^{[4]}} = a^{[4]}\left(1 - a^{[4]}\right)$$

and,

$$\frac{\partial z^{[4]}}{\partial W^{[4]}} = a^{[3]}$$

so $\frac{\partial L}{\partial W^{[4]}}$ becomes:

$$\frac{\partial L}{\partial W^{[4]}} = \left[\left(W^{[5]}\right)^T \delta^{[5]} \odot a^{[4]}\left(1 - a^{[4]}\right)\right]\left(a^{[3]}\right)^T$$

Likewise for $\frac{\partial L}{\partial b^{[4]}}$:

where,

$$\frac{\partial z^{[4]}}{\partial b^{[4]}} = 1$$

so,

$$\frac{\partial L}{\partial b^{[4]}} = \left(W^{[5]}\right)^T \delta^{[5]} \odot a^{[4]}\left(1 - a^{[4]}\right)$$

We will refer to the multiple $\frac{\partial L}{\partial z^{[4]}} = \left(W^{[5]}\right)^T \delta^{[5]} \odot a^{[4]}\left(1 - a^{[4]}\right)$ as $\delta^{[4]}$.
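The same recursion in NumPy, again with illustrative shapes; `a3`, `a4`, `W5`, and `delta5` stand in for the quantities derived above:

```python
import numpy as np

rng = np.random.default_rng(2)
a3 = rng.uniform(size=(4, 1))   # activations of layer 3
a4 = rng.uniform(size=(3, 1))   # activations of layer 4
W5 = rng.normal(size=(1, 3))    # output-layer weights
delta5 = np.array([[0.3]])      # delta^[5] from the previous step

# delta^[4] = (W^[5])^T delta^[5] (element-wise *) a^[4](1 - a^[4])
delta4 = (W5.T @ delta5) * a4 * (1.0 - a4)

dW4 = delta4 @ a3.T             # dL/dW^[4] = delta^[4] (a^[3])^T
db4 = delta4                    # dL/db^[4] = delta^[4]
```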
We need to take into account the contributions to the error by each node in a layer to be able to find the gradient updates of that node's weights. This contribution for a node $j$ of the second-to-last layer is:

$$\frac{\partial L}{\partial a_j^{[4]}} = \sum_k \frac{\partial L}{\partial z_k^{[5]}} \cdot \frac{\partial z_k^{[5]}}{\partial a_j^{[4]}}$$

where,

$$\frac{\partial z_k^{[5]}}{\partial a_j^{[4]}} = W_{kj}^{[5]}$$

and,

$$\frac{\partial L}{\partial z_k^{[5]}} = \delta_k^{[5]}$$

Therefore,

$$\frac{\partial L}{\partial a_j^{[4]}} = \sum_k W_{kj}^{[5]}\, \delta_k^{[5]}$$

To compute $\delta_j^{[4]}$:

$$\delta_j^{[4]} = \left(\sum_k W_{kj}^{[5]}\, \delta_k^{[5]}\right) a_j^{[4]}\left(1 - a_j^{[4]}\right)$$

We can then compute:

$$\frac{\partial L}{\partial W_{ji}^{[4]}} = \delta_j^{[4]}\, a_i^{[3]}, \qquad \frac{\partial L}{\partial b_j^{[4]}} = \delta_j^{[4]}$$
The gradients for the remaining hidden layers are computed in the same way, propagating the deltas backward layer by layer, as the sketch below shows.
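Here is a minimal, self-contained sketch of the full backward pass under these formulas; the layer sizes, initialization scale, and activation caching are assumptions made for the example, not the network from the figure:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(3)
sizes = [4, 5, 5, 3, 1]            # illustrative node counts for layers 1-5
Ws = [rng.normal(size=(n_out, n_in)) * 0.5
      for n_in, n_out in zip(sizes[:-1], sizes[1:])]
bs = [np.zeros((n, 1)) for n in sizes[1:]]

x = rng.normal(size=(sizes[0], 1))
y = np.array([[1.0]])

# Forward pass, caching every activation for the backward pass.
activations = [x]
for W, b in zip(Ws, bs):
    activations.append(sigmoid(W @ activations[-1] + b))

# Backward pass: start from delta^[5] and propagate backward.
delta = activations[-1] - y        # delta^[5] = a^[5] - y
grads = []
for l in range(len(Ws) - 1, -1, -1):
    dW = delta @ activations[l].T  # dL/dW for this layer
    db = delta                     # dL/db for this layer
    grads.append((dW, db))
    if l > 0:
        a = activations[l]
        delta = (Ws[l].T @ delta) * a * (1.0 - a)  # recursion for delta^[l]

grads.reverse()                    # grads[l] now matches Ws[l] and bs[l]
```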
Unlock your potential: Neural network series, all in one place!
To continue your exploration of neural networks, check out our series of Answers below:
What are artificial neural networks?
Learn how artificial neural networks (ANNs), inspired by the human brain, perform tasks like classification and prediction through interconnected layers and neurons.
Why do we use neural networks?
Learn how neural networks offer high approximation and representational power, enabling valuable data utilization and excelling in tasks like automated image classification.
Training of a neural network using PyTorch
Learn how artificial neural networks mimic brain functions to process data, and how PyTorch simplifies building and training them using layers, weights, loss functions, and backpropagation.
How neural language models work in ChatGPT
Learn how ChatGPT uses transformer architecture with a focus on the decoder, leveraging vast data and attention mechanisms to generate coherent responses.
Benefits and limitations of neural machine translation in ChatGPT
Learn how ChatGPT's neural machine translation offers efficient, accurate language translations, while acknowledging its limitations due to its novelty.
What are Graph Neural Networks?
Learn how Graph Neural Networks (GNNs) handle non-Euclidean data using graphs, excelling in clustering, visualization, prediction, NLP, molecule structures, cybersecurity, and social network analysis.
What is a neural network-based approach for graph embeddings?
Learn how graph embeddings use neural networks like GCNs to represent graph data as vectors, enabling efficient analysis and tasks like node classification and link prediction.
How to avoid overfitting in neural networks
Learn how to use cross-validation, regularization, dropout, early stopping, and data augmentation to effectively avoid overfitting in machine learning models.
How to do backpropagation in a neural network
Learn how to calculate gradients using backpropagation to update neural network parameters and improve learning from data.
PyTorch cheatsheet: Neural network layers
PyTorch provides diverse neural network layers, enabling the design and training of complex models for tasks like image classification, sequence modeling, and reinforcement learning.