The sigmoid activation function is a widely used mathematical function in the field of machine learning and artificial neural networks. It is a type of activation function that maps any input value to a range between 0 and 1.
The formula for the sigmoid function is:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

Where,

- $x$ is the input value
- $e$ is Euler's number (approximately 2.718)
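For example, $\sigma(0) = \frac{1}{1 + e^{0}} = 0.5$, while $\sigma(2) = \frac{1}{1 + e^{-2}} \approx 0.88$ and $\sigma(-2) \approx 0.12$; large positive inputs approach 1, and large negative inputs approach 0.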
Let's first see an implementation of the sigmoid function in plain Python using NumPy.
```python
import numpy as np

# Define the sigmoid function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# List of input values
input_values = [-2.0, -1.0, 0.0, 1.0, 2.0]

# An empty list to store output values
output_values = []

# Apply sigmoid function to the list of input values
for input_value in input_values:
    output_value = sigmoid(input_value)
    output_values.append(output_value)

# Print the results
print("Input values: ", input_values)
print("Output values after Sigmoid function: ", output_values)
```
- Lines 4–5: We define the sigmoid function using the above-mentioned mathematical formula.
- Line 8: We define a list of `input_values` that we'll pass through the sigmoid function.
- Line 11: We initialize an empty list, `output_values`, to store the output of the sigmoid function.
- Lines 14–16: We pass each input value from the `input_values` list through the sigmoid function and append the result to the `output_values` list.
- Lines 19–20: Finally, we print out both the input and output lists.
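Running this snippet prints the inputs and their sigmoid outputs, approximately the following (values rounded here for readability):

```
Input values:  [-2.0, -1.0, 0.0, 1.0, 2.0]
Output values after Sigmoid function:  [0.1192, 0.2689, 0.5, 0.7311, 0.8808]
```

Note that PyTorch provides the same elementwise function out of the box, so as a quick sketch, the loop above can be replaced with a single tensor operation:

```python
import torch

# torch.sigmoid applies 1 / (1 + e^(-x)) elementwise to a tensor
input_tensor = torch.tensor([-2.0, -1.0, 0.0, 1.0, 2.0])
print(torch.sigmoid(input_tensor))  # tensor([0.1192, 0.2689, 0.5000, 0.7311, 0.8808])
```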
Now let's see the implementation of the sigmoid activation function in a neural network using PyTorch. The neural network defined in the code below consists of two fully connected layers, with a ReLU activation between them and a sigmoid activation at the output. The input layer has 64 nodes, the hidden layer has 128 nodes, and the output layer has 2 nodes followed by the sigmoid activation function. The model is trained using binary cross-entropy loss and optimized with the Adam optimizer. The training loop runs for 10 epochs, performing forward and backward passes to update the model parameters.
```python
import torch
import torch.nn as nn

# Define the neural network architecture
class Neural_Network(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(Neural_Network, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)  # Fully connected layer 1
        self.relu = nn.ReLU()  # ReLU activation function
        self.fc2 = nn.Linear(hidden_size, output_size)  # Fully connected layer 2
        self.sigmoid = nn.Sigmoid()  # Sigmoid activation function

    def forward(self, x):
        out = self.fc1(x)  # Apply the first fully connected layer
        out = self.relu(out)  # Apply the ReLU activation function
        out = self.fc2(out)  # Apply the second fully connected layer
        out = self.sigmoid(out)  # Apply the sigmoid activation function
        return out

# Define network parameters
input_size = 64  # Number of input features
hidden_size = 128  # Number of neurons in the hidden layer
output_size = 2  # Number of output classes

# Input data
input_data = torch.rand(32, input_size)  # 32 is the batch size
target = torch.randint(0, 2, (32, output_size), dtype=torch.float32)  # Random binary target values

# Create an instance of the Neural_Network model
model = Neural_Network(input_size, hidden_size, output_size)

# Define the loss function (binary cross-entropy loss) and optimizer (Adam)
criterion = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

# Example training loop
n_epochs = 10  # Define the number of training epochs
for epoch in range(n_epochs):
    # Forward pass
    outputs = model(input_data)
    loss = criterion(outputs, target)  # Compute the loss

    # Backward pass and optimization
    optimizer.zero_grad()  # Clear gradients
    loss.backward()  # Backpropagate to compute gradients
    optimizer.step()  # Update the model parameters

    # Print the loss for each epoch
    print(f'Epoch [{epoch + 1}/{n_epochs}], Loss: {loss.item()}')
```
- Line 11: We create an instance of the sigmoid activation function using the `nn.Sigmoid()` module, and store it as an attribute of the `Neural_Network` class named `self.sigmoid`.
- Line 17: We apply the sigmoid activation function to the output of the second fully connected layer. It squashes the raw outputs of that layer into a range between 0 and 1, transforming them into probabilities so the model can make predictions for binary classification tasks, and it introduces nonlinearity into the network.
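Since the sigmoid outputs lie in (0, 1), a common follow-up step after training (not shown in the code above, and sketched here assuming the `model` and `input_data` defined earlier) is to threshold the probabilities at 0.5 to obtain hard class labels:

```python
# Convert sigmoid probabilities into hard 0/1 predictions by thresholding at 0.5
with torch.no_grad():                            # No gradients needed for inference
    probabilities = model(input_data)            # Values in (0, 1) thanks to the sigmoid
    predictions = (probabilities > 0.5).float()  # 1.0 where probability > 0.5, else 0.0
print(predictions[:5])                           # Predicted labels for the first five samples
```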
For more details on how to build a neural network, take a look at this Educative Answer.