What are the different activation functions in Keras?

Keras is an open-source library for deep learning that lets researchers and developers build neural networks at a high level of abstraction by hiding low-level operations.

In neural networks, an activation function is a mathematical function applied to a neuron's output in a layer. Activation functions introduce non-linearity into the network, which allows it to learn complex, non-linear relationships in the data. To use activation functions, Keras must first be installed.

Activation functions can be added to a neural network in two ways: as a separate activation layer or through the activation parameter of a layer.

Syntax

The syntax for adding an activation function through the activation parameter is:

model.add(layers.Dense(64, activation='name_of_the_function'))

Another way to add an activation function is by adding an activation layer:

from tensorflow.keras import layers
from tensorflow.keras import activations
model.add(layers.Dense(64))
model.add(layers.Activation(activations.function_name))
Adding activation by adding an activation layer

Some commonly used activation functions are described below.

Sigmoid

The sigmoid activation function squashes a neuron's output into the range between 0 and 1. It is commonly used in the output layer for binary classification problems.

The mathematical equation for the sigmoid activation function is as follows:

sigmoid(x) = 1 / (1 + e^(-x))

Here, x is the output of the neuron. After passing through the activation function, it is ready to serve as the input to the next layer.

The syntax to add the sigmoid activation function to the model is:

model.add(layers.Dense(64, activation='sigmoid'))
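As a quick check of the sigmoid definition, here is a minimal NumPy sketch (NumPy stands in for the Keras backend here; the function name is our own):

```python
import numpy as np

def sigmoid(x):
    # 1 / (1 + e^(-x)), as in the sigmoid formula
    return 1.0 / (1.0 + np.exp(-x))

outputs = sigmoid(np.array([-4.0, 0.0, 4.0]))
print(outputs)  # every value lies in (0, 1); sigmoid(0) is exactly 0.5
```

Note that large negative inputs map close to 0 and large positive inputs map close to 1, which is why sigmoid outputs are often read as probabilities.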

Softmax

It is an activation function used in the network's output layer to convert the values into a probability distribution. Each value in the distribution lies between 0 and 1, and all the values sum to 1. Softmax is used in multi-class classification problems.

The mathematical equation is as follows:

softmax(x)_i = e^(x_i) / Σ_{k=1}^{n} e^(x_k)

Here:

  • \vec{x} is the input vector.

  • e^(x_i) is the exponential of the i-th element of the input vector.

  • e^(x_k) is the exponential of the k-th element, summed over all elements in the denominator.

  • n is the number of classes.

The syntax to add the function in a neural layer is:

model.add(layers.Dense(64, activation='softmax'))
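The formula can be sketched in a few lines of NumPy to confirm the probability-distribution property (the max-subtraction is a standard numerical-stability trick, not part of the mathematical definition):

```python
import numpy as np

def softmax(x):
    # e^(x_i) / sum_k e^(x_k); subtracting the max avoids overflow
    e = np.exp(x - np.max(x))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(probs)        # each value lies in (0, 1)
print(probs.sum())  # the values sum to 1
```

The largest logit receives the largest probability, so the argmax of the logits is preserved.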

Rectified linear unit (ReLU)

It is a commonly used activation function that replaces every negative value in a tensor with zero and leaves positive values unchanged. In Keras, the threshold can also be changed from its default of zero.

The mathematical equation for ReLU is:

relu(x) = max(0, x)

The syntax to add the function in a neural layer is:

model.add(layers.Dense(64, activation='relu'))
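A minimal NumPy sketch of the same rule, including the adjustable threshold mentioned above (Keras's relu accepts a similar threshold argument; this standalone function is our own):

```python
import numpy as np

def relu(x, threshold=0.0):
    # values at or below the threshold become 0; the rest pass through
    return np.where(x > threshold, x, 0.0)

x = np.array([-3.0, 0.5, 2.0])
print(relu(x))                 # negatives zeroed: [0.  0.5 2. ]
print(relu(x, threshold=1.0))  # 0.5 is now below the threshold: [0. 0. 2.]
```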

Hyperbolic tangent (tanh)

Tanh is similar to the sigmoid activation function, except that it maps values to the range between -1 and 1. It is useful when negative values are important to the model, and it is mostly used in the hidden layers of a neural network.

The mathematical equation is:

tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))

The syntax to add the function in a neural layer is:

model.add(layers.Dense(64, activation='tanh'))
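Writing the formula directly in NumPy shows the symmetric (-1, 1) range that distinguishes tanh from sigmoid:

```python
import numpy as np

def tanh(x):
    # (e^x - e^(-x)) / (e^x + e^(-x)); equivalent to np.tanh
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

outputs = tanh(np.array([-2.0, 0.0, 2.0]))
print(outputs)  # symmetric around 0, every value in (-1, 1)
```

Because tanh(-x) = -tanh(x), negative inputs keep their sign, which is exactly the property the text above highlights.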

Leaky ReLU

The ReLU activation function suffers from the dying ReLU problem. During training, especially with large learning rates, some neurons' weights can be updated such that their inputs are always negative. ReLU then keeps producing zero for those neurons, so they stop learning; this is called the dying ReLU problem.

Leaky ReLU is a variant of ReLU that addresses the dying ReLU problem by allowing a small, non-zero output for negative inputs.

Its mathematical equation is:

leaky_relu(x) = x if x > 0, and αx otherwise (where α is a small constant, e.g. 0.02)

The syntax to add the function in a neural layer is:

from tensorflow.keras.layers import LeakyReLU
model.add(layers.Dense(64))
model.add(LeakyReLU(alpha=0.02))
Adding a LeakyReLU layer after the dense layer
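The piecewise definition can be sketched in NumPy to see the "leak" for negative inputs (the function name and α value here are illustrative):

```python
import numpy as np

def leaky_relu(x, alpha=0.02):
    # negative inputs are scaled by alpha instead of being zeroed out
    return np.where(x > 0, x, alpha * x)

x = np.array([-5.0, 0.0, 3.0])
print(leaky_relu(x))  # -5.0 becomes -0.1 instead of 0, so its gradient is not lost
```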

Conclusion

Keras is a deep learning library built on top of TensorFlow that simplifies building efficient deep learning models. Activation functions introduce the non-linearity these models need to learn complex patterns and produce accurate results.

Copyright ©2025 Educative, Inc. All rights reserved