What is forward-mode differentiation?

Forward-mode differentiation is a method used in automatic differentiation. It computes the numerical value of a derivative by carrying out elementary derivative operations alongside the evaluation of the function itself. At each step, the chain rule is used to update the derivative values.

Let's now understand this concept with an example.

Example

In order to break down functions into elementary steps, evaluation traces are constructed. These traces can be thought of as a record of the individual steps taken to obtain the final results. Let's take the following function as an example:

$$f(x, y) = \cos(x) + x \cdot e^{y}$$

To construct the evaluation trace, we will substitute some variables inside the function at each step. To start with, let's substitute $x$ with $w_1$ and $y$ with $w_2$ in the above equation.

The equation becomes:

$$f(x, y) = \cos(w_1) + w_1 \cdot e^{w_2}$$

Let's now substitute $\cos(w_1)$ with $w_3$ and $e^{w_2}$ with $w_4$ in the above equation.

The equation now becomes:

$$f(x, y) = w_3 + w_1 \cdot w_4$$

Finally, we substitute $w_1 \cdot w_4$ with $w_5$ in the above equation, which then becomes:

$$f(x, y) = w_3 + w_5$$

Let's now evaluate the function when $x = \frac{\pi}{2}$ and $y = 1$, and record all the intermediate values in the table below.

Table 1
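As a rough sketch, the evaluation trace above can be reproduced in a few lines of Python. The function name `evaluation_trace` is an assumption made for illustration, and so is the definition of `w6` as the final sum `w3 + w5`, which is not spelled out in the substitutions above.

```python
import math


def evaluation_trace(x, y):
    """Return the intermediate values of f(x, y) = cos(x) + x * e^y."""
    w1 = x               # input x
    w2 = y               # input y
    w3 = math.cos(w1)    # cos(w1)
    w4 = math.exp(w2)    # e^(w2)
    w5 = w1 * w4         # w1 * w4
    w6 = w3 + w5         # final value f(x, y); naming it w6 is an assumption
    return {"w1": w1, "w2": w2, "w3": w3, "w4": w4, "w5": w5, "w6": w6}


trace = evaluation_trace(math.pi / 2, 1.0)
for name, value in trace.items():
    print(f"{name} = {value:.4f}")
```

At $x = \frac{\pi}{2}$ and $y = 1$, `w3` is zero up to floating-point rounding, so the final value is essentially `w5`, about 4.27.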

Setting the initial conditions

Let's set the initial conditions for the derivatives:

  • $w_1'$: $\frac{\partial w_1}{\partial x} = 1$

  • $w_2'$: $\frac{\partial w_2}{\partial x} = 0$

By setting the seed values for the derivatives of the input variables (in this case, $w_1'$ and $w_2'$), we establish the starting point for the differentiation process. These initial conditions act as the base values from which the derivatives are computed and propagated forward through the computational graph.
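The seeds can be thought of as a pair of numbers that selects which input we differentiate with respect to. The hypothetical helper below (`make_seed` is not part of the original text) returns the seeds used here for $x$ and, as an assumed extension, the mirrored seeds one would use for $y$.

```python
def make_seed(wrt):
    """Return the seed pair (dw1, dw2) for differentiating with respect to `wrt`.

    Hypothetical helper for illustration only.
    """
    if wrt == "x":
        return (1.0, 0.0)  # dw1/dx = 1, dw2/dx = 0 (the seeds used in this example)
    if wrt == "y":
        return (0.0, 1.0)  # dw1/dy = 0, dw2/dy = 1 (assumed extension)
    raise ValueError(f"unknown input variable: {wrt!r}")


dw1, dw2 = make_seed("x")
print(dw1, dw2)
```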

Computing the partial derivative

Let's suppose we want to compute the partial derivative of $f$ with respect to $x$ at $x = \frac{\pi}{2}$ and $y = 1$. We can approach this task by considering one intermediate variable at a time. Note that we are interested only in the numerical value of the derivative, not in a symbolic expression for it. For each $w_i$, we calculate $\frac{\partial w_i}{\partial x}$.

Let's try to calculate the partial derivative of $w_3$.

Note: We will use two notations interchangeably for the partial derivative of $w_3$ with respect to $x$: $w_3'$ or $\frac{\partial w_3}{\partial x}$.

$$\frac{\partial w_3}{\partial x} = \frac{\partial \cos(w_1)}{\partial x}$$

$$\frac{\partial w_3}{\partial x} = -w_1' \cdot \sin(w_1)$$

Substituting $w_1$ with $x$:

$$\frac{\partial w_3}{\partial x} = -\frac{\partial x}{\partial x} \cdot \sin(x)$$

$$\frac{\partial w_3}{\partial x} = -1 \cdot \sin(x)$$

Substituting $x$ with $\frac{\pi}{2}$:

$$\frac{\partial w_3}{\partial x} = -1 \cdot \sin\left(\frac{\pi}{2}\right)$$

$$\frac{\partial w_3}{\partial x} = -1$$
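The hand calculation above can be checked numerically. The sketch below applies the same chain-rule step to `w3 = cos(w1)` and compares it against a central finite difference. The step size `h` is an arbitrary choice for illustration.

```python
import math

w1 = math.pi / 2   # w1 = x
dw1 = 1.0          # seed: dw1/dx = 1

# Chain rule for w3 = cos(w1): dw3/dx = -sin(w1) * dw1/dx
dw3 = -math.sin(w1) * dw1

# Central finite difference of cos at w1, used only as a sanity check
h = 1e-6
finite_diff = (math.cos(w1 + h) - math.cos(w1 - h)) / (2 * h)

print("chain rule:       ", dw3)
print("finite difference:", finite_diff)
```

The two numbers should agree to several decimal places. Only the first is what forward mode computes. The finite difference is an approximation included purely for comparison.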

The partial derivatives of $w_4$, $w_5$, and $w_6$ (where $w_6$ denotes the final sum $w_3 + w_5$) are provided in the table below:

Table 2

For each intermediate variable, we calculate its derivative by applying the standard derivative rules. Remember that the derivative of each intermediate variable depends only on the values and derivatives of the variables that precede it in the trace.
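Putting the steps together, a minimal sketch of the whole forward pass might look like the following. Each step carries a value and its derivative side by side. The function name `forward_mode`, the `dw` naming, and the definition of `w6` as `w3 + w5` are assumptions for illustration rather than a definitive implementation.

```python
import math


def forward_mode(x, y, dx=1.0, dy=0.0):
    """Evaluate f(x, y) = cos(x) + x * e^y together with its derivative.

    (dx, dy) are the seed values; the defaults (1, 0) give df/dx.
    """
    w1, dw1 = x, dx                      # input x and its seed
    w2, dw2 = y, dy                      # input y and its seed

    w3 = math.cos(w1)
    dw3 = -math.sin(w1) * dw1            # d/dx cos(w1) = -sin(w1) * w1'

    w4 = math.exp(w2)
    dw4 = w4 * dw2                       # d/dx e^(w2) = e^(w2) * w2'

    w5 = w1 * w4
    dw5 = dw1 * w4 + w1 * dw4            # product rule

    w6 = w3 + w5                         # final value f(x, y)
    dw6 = dw3 + dw5                      # sum rule
    return w6, dw6


value, df_dx = forward_mode(math.pi / 2, 1.0)
print("f(x, y) =", value)
print("df/dx   =", df_dx)
```

With the default seeds, the derivative comes out as $e - 1 \approx 1.718$, which agrees with differentiating $f$ by hand: $\frac{\partial f}{\partial x} = -\sin(x) + e^{y}$. Passing `dx=0.0, dy=1.0` instead would propagate the derivative with respect to $y$.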

Copyright ©2025 Educative, Inc. All rights reserved