The sigmoid activation function is a widely used mathematical function in the field of machine learning and artificial neural networks. It is a type of activation function that maps any input value to a range between 0 and 1.
The formula for the sigmoid function is:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

Where,

- $x$ is the input value
- $e$ is Euler's number (approximately 2.718)
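For example, $\sigma(0) = \frac{1}{1 + e^{0}} = 0.5$, while $\sigma(2) = \frac{1}{1 + e^{-2}} \approx 0.88$ and $\sigma(-2) \approx 0.12$; large positive inputs approach 1, and large negative inputs approach 0.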
Let's first see an implementation of the sigmoid function in plain Python using NumPy.
```python
import numpy as np

# Define the sigmoid function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# List of input values
input_values = [-2.0, -1.0, 0.0, 1.0, 2.0]

# An empty list to store output values
output_values = []

# Apply sigmoid function to the list of input values
for input_value in input_values:
    output_value = sigmoid(input_value)
    output_values.append(output_value)

# Print the results
print("Input values: ", input_values)
print("Output values after Sigmoid function: ", output_values)
```
- Lines 4–5: We define the sigmoid function using the above-mentioned mathematical formula.
- Line 8: We define a list of `input_values` that we'll pass through the sigmoid function.
- Line 11: We initialize an empty list, `output_values`, to store the output of the sigmoid function.
- Lines 14–16: We pass each input value from the `input_values` list through the sigmoid function and append the result to the `output_values` list.
- Lines 19–20: Finally, we print out both the input and output lists.
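Running this snippet prints the inputs and their sigmoid outputs, approximately the following (values rounded here for readability):

```
Input values:  [-2.0, -1.0, 0.0, 1.0, 2.0]
Output values after Sigmoid function:  [0.1192, 0.2689, 0.5, 0.7311, 0.8808]
```

Note that PyTorch provides the same elementwise function out of the box, so as a quick sketch, the loop above can be replaced with a single tensor operation:

```python
import torch

# torch.sigmoid applies 1 / (1 + e^(-x)) elementwise to a tensor
input_tensor = torch.tensor([-2.0, -1.0, 0.0, 1.0, 2.0])
print(torch.sigmoid(input_tensor))  # tensor([0.1192, 0.2689, 0.5000, 0.7311, 0.8808])
```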
Now let's see the implementation of the sigmoid activation function in a neural network using PyTorch. The neural network defined in the code below consists of two fully connected layers, with a ReLU activation between them and a sigmoid activation at the output. The input layer has 64 nodes, the hidden layer has 128 nodes, and the output layer has 2 nodes followed by the sigmoid activation function. The model is trained using binary cross-entropy loss and optimized with the Adam optimizer. The training loop runs for 10 epochs, performing forward and backward passes to update the model parameters.
```python
import torch
import torch.nn as nn

# Define the neural network architecture
class Neural_Network(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(Neural_Network, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)  # Fully connected layer 1
        self.relu = nn.ReLU()  # ReLU activation function
        self.fc2 = nn.Linear(hidden_size, output_size)  # Fully connected layer 2
        self.sigmoid = nn.Sigmoid()  # Sigmoid activation function

    def forward(self, x):
        out = self.fc1(x)  # Apply the first fully connected layer
        out = self.relu(out)  # Apply the ReLU activation function
        out = self.fc2(out)  # Apply the second fully connected layer
        out = self.sigmoid(out)  # Apply the sigmoid activation function
        return out

# Define network parameters
input_size = 64  # Number of input features
hidden_size = 128  # Number of neurons in the hidden layer
output_size = 2  # Number of output classes

# Input data
input_data = torch.rand(32, input_size)  # 32 is the batch size
target = torch.randint(0, 2, (32, output_size), dtype=torch.float32)  # Random binary target values

# Create an instance of the Neural_Network model
model = Neural_Network(input_size, hidden_size, output_size)

# Define the loss function (binary cross-entropy loss) and optimizer (Adam)
criterion = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

# Example training loop
n_epochs = 10  # Define the number of training epochs
for epoch in range(n_epochs):
    # Forward pass
    outputs = model(input_data)
    loss = criterion(outputs, target)  # Compute the loss

    # Backward pass and optimization
    optimizer.zero_grad()  # Clear gradients
    loss.backward()  # Backpropagate to compute gradients
    optimizer.step()  # Update the model parameters

    # Print the loss for each epoch
    print(f'Epoch [{epoch + 1}/{n_epochs}], Loss: {loss.item()}')
```
- Line 11: We create an instance of the sigmoid activation function using the `nn.Sigmoid()` module, and store it as an attribute of the `Neural_Network` class named `self.sigmoid`.
- Line 17: We apply the sigmoid activation function to the output of the second fully connected layer. It squashes the raw outputs of that layer into a range between 0 and 1, transforming them into probabilities so the model can make predictions for binary classification tasks, and it introduces nonlinearity into the network.
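Since the sigmoid outputs lie in (0, 1), a common follow-up step after training (not shown in the code above, and sketched here assuming the `model` and `input_data` defined earlier) is to threshold the probabilities at 0.5 to obtain hard class labels:

```python
# Convert sigmoid probabilities into hard 0/1 predictions by thresholding at 0.5
with torch.no_grad():                            # No gradients needed for inference
    probabilities = model(input_data)            # Values in (0, 1) thanks to the sigmoid
    predictions = (probabilities > 0.5).float()  # 1.0 where probability > 0.5, else 0.0
print(predictions[:5])                           # Predicted labels for the first five samples
```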
For more details on how to build a neural network, take a look at this Educative Answer.