Anomaly detection with autoencoders

In this Answer, we’ll explore the fascinating field of anomaly detection using PyTorch.

Introduction to anomaly detection

Anomaly detection, also known as outlier detection, is a crucial aspect of machine learning where the goal is to identify instances or patterns in a dataset that deviate significantly from the norm or expected behavior. This deviation from the norm is often called an anomaly or outlier. Anomalies can represent rare events, errors, fraud, faults, or any other unusual occurrences that are not typical in the dataset.
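
To make this concrete before we turn to autoencoders, here is a toy illustration (our own addition, not part of the original walkthrough): flagging values that sit far from the rest of a dataset using a simple standard-deviation rule.

import numpy as np

# A mostly well-behaved dataset with one injected outlier
values = np.array([10.1, 9.8, 10.3, 9.9, 10.0, 45.0, 10.2])

# Flag points more than two standard deviations from the mean
mean, std = values.mean(), values.std()
outliers = np.abs(values - mean) > 2 * std
print(values[outliers])  # prints [45.]

Real detectors are rarely this simple, which is where learned models such as autoencoders come in.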

Leveraging autoencoders for anomaly detection

Autoencoders, a type of neural network, offer a powerful technique for anomaly detection. The fundamental idea behind using autoencoders for this task is their ability to learn a compact representation of the input data. An autoencoder consists of an encoder that compresses input data into a lower-dimensional representation (encoding) and a decoder that reconstructs the input data from this encoding. Because the network is trained to reconstruct typical data accurately, anomalous inputs tend to be reconstructed poorly, and this elevated reconstruction error is what flags them as outliers.
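
The scripts later in this Answer import an Autoencoder class from Autoencoder.py, whose contents are not reproduced here. As a rough guide, a minimal fully connected implementation consistent with the Autoencoder(encoder_dim, decoder_dim) constructor used below might look like this (a sketch under that assumption, not the Answer's exact code):

import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, encoder_dim, decoder_dim):
        super().__init__()
        in_dim, hidden_dim, latent_dim = encoder_dim     # e.g., (1, 64, 32)
        latent_dim_, hidden_dim_, out_dim = decoder_dim  # e.g., (32, 64, 1)
        # Encoder: compress the input into a lower-dimensional encoding
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, latent_dim),
        )
        # Decoder: reconstruct the input from the encoding
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim_, hidden_dim_),
            nn.ReLU(),
            nn.Linear(hidden_dim_, out_dim),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))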

Dataset visualization

We’ll be working with a synthetic time series dataset. Visualization is a crucial first step as it helps us understand the structure of our data and any patterns that may exist. Our goal is to identify anomalies in this time series data.

import numpy as np
import matplotlib.pyplot as plt
# Generate synthetic time series data
timesteps = 1000
time = np.arange(0, timesteps, 1)
regular_data = np.sin(0.02 * time) # Regular sine wave pattern
# Introduce an anomaly at a single time step
anomaly_index = 800
regular_data[anomaly_index] = 2.5
# Plot the synthetic time series data
plt.plot(time, regular_data)
plt.xlabel('Time')
plt.ylabel('Value')
plt.show()

In the generated time series data, we can observe a regular sine wave pattern, which represents the expected behavior. The anomaly, introduced at time step 800, appears as a sharp deviation from this regular pattern.
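
The scripts below also import generate_synthetic_data, time, and anomaly_index from synthetic_data_generator.py, which isn't shown in full. A sketch that simply packages the generation code above into that module could be:

# synthetic_data_generator.py (assumed contents, mirroring the code above)
import numpy as np

timesteps = 1000
time = np.arange(0, timesteps, 1)
anomaly_index = 800

def generate_synthetic_data():
    # Regular sine wave with a single anomalous spike at anomaly_index
    data = np.sin(0.02 * time)
    data[anomaly_index] = 2.5
    return data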

Designing the autoencoder and training

Now, let’s dive into the process of training the autoencoder. If you’re interested in a detailed understanding of how autoencoders are designed, feel free to read Introduction to autoencoders using PyTorch.

main.py
import torch
import torch.nn as nn
import torch.optim as optim
from Autoencoder import Autoencoder # Assume that the autoencoder class is defined in Autoencoder.py
from synthetic_data_generator import generate_synthetic_data

# Load the synthetic dataset and convert it to a (timesteps, 1) float tensor
regular_data = generate_synthetic_data()  # assumed to return a 1-D NumPy array
regular_data = torch.tensor(regular_data, dtype=torch.float32).unsqueeze(1)

# Define the autoencoder architecture
input_dim = 1
hidden_dim = 64
latent_dim = 32
encoder_dim = (input_dim, hidden_dim, latent_dim)
decoder_dim = (latent_dim, hidden_dim, input_dim)
autoencoder = Autoencoder(encoder_dim, decoder_dim)

# Define the loss function and optimizer
criterion = nn.MSELoss()
optimizer = optim.Adam(autoencoder.parameters(), lr=0.001)

# Train the autoencoder
num_epochs = 50
for epoch in range(num_epochs):
    # Forward pass
    outputs = autoencoder(regular_data)
    loss = criterion(outputs, regular_data)

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch + 1}/{num_epochs}], Loss: {loss.item():.4f}')

# Save the trained model
torch.save(autoencoder.state_dict(), "anomaly_detector.pth")

Code explanation

  • Lines 1–5: Import necessary libraries and modules.

  • Lines 8–9: Load the synthetic time series dataset and convert it into a (timesteps, 1) float tensor.

  • Lines 12–17: Define the autoencoder architecture and instantiate the Autoencoder class.

  • Lines 20–21: Define mean squared error (MSE) loss and set up the Adam optimizer.

  • Lines 24–39: Train the autoencoder for the specified number of epochs, update the parameters, monitor the loss, and save the trained weights.

Testing the autoencoder and visualizing the results

Let’s put our trained autoencoder to the test by feeding it the regular time series data (which contains the injected anomaly) and visualizing the reconstruction errors. Anomalies will likely produce higher reconstruction errors, allowing us to identify and visualize them.

main.py
import torch
import matplotlib.pyplot as plt
import torch.nn as nn
import torch.optim as optim
from Autoencoder import Autoencoder
from synthetic_data_generator import generate_synthetic_data
from synthetic_data_generator import time
from synthetic_data_generator import anomaly_index

# Load the synthetic dataset as a (timesteps, 1) float tensor
regular_data = torch.tensor(generate_synthetic_data(), dtype=torch.float32).unsqueeze(1)

# Define the autoencoder architecture (must match the trained model)
input_dim = 1
hidden_dim = 64
latent_dim = 32
encoder_dim = (input_dim, hidden_dim, latent_dim)
decoder_dim = (latent_dim, hidden_dim, input_dim)
autoencoder = Autoencoder(encoder_dim, decoder_dim)

# Load the trained model
autoencoder.load_state_dict(torch.load("anomaly_detector.pth"))
autoencoder.eval()

# Predict on the regular data
predicted_data = autoencoder(regular_data)

# Calculate the per-timestep reconstruction error
reconstruction_error = torch.mean((regular_data - predicted_data)**2, dim=1).detach().numpy()

# Plot the regular data and reconstruction errors
plt.figure(figsize=(10, 6))

plt.subplot(2, 1, 1)
plt.plot(time, regular_data.numpy(), label='Regular Data')
plt.scatter(anomaly_index, regular_data[anomaly_index].item(), color='red', label='Anomaly')
plt.title('Regular Time-Series Data with Anomaly')
plt.xlabel('Time')
plt.ylabel('Value')
plt.legend()

plt.subplot(2, 1, 2)
plt.plot(time, reconstruction_error, label='Reconstruction Error', color='orange')
plt.axvline(x=anomaly_index, color='red', linestyle='--', label='Anomaly Index')
plt.title('Reconstruction Error over Time')
plt.xlabel('Time')
plt.ylabel('Reconstruction Error')
plt.legend()

plt.tight_layout()
plt.show()

Note: In the visualizations, we’ll see the original time series data, marked with the introduced anomaly. The second subplot displays the reconstruction errors over time. Anomalies, such as the one introduced, will likely exhibit higher reconstruction errors, making them stand out in the plot.

Code explanation

  • Lines 7–8: Import time and anomaly_index from the synthetic_data_generator module.

  • Line 22: Load pretrained autoencoder weights, enabling model reusability.

  • Line 23: Set autoencoder to evaluation mode for consistent inference behavior.

  • Line 26: Predict on regular time series data using the trained autoencoder.

  • Line 29: Calculate reconstruction error by comparing original data with predictions.

  • Lines 32–51: Plot regular time series data and reconstruction errors, marking anomalies.
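
The subplots above rely on visual inspection. To turn the reconstruction error into an automatic decision, a common heuristic (our addition, not part of the original walkthrough) is to flag any time step whose error exceeds the mean error by a few standard deviations:

import numpy as np

# Continues from the script above: reconstruction_error is a 1-D NumPy array
threshold = reconstruction_error.mean() + 3 * reconstruction_error.std()
anomalies = np.where(reconstruction_error > threshold)[0]
print(f'Threshold: {threshold:.6f}')
print(f'Anomalous time steps: {anomalies}')

The multiplier (3 here) controls the trade-off between sensitivity and false alarms; in practice, it's tuned on held-out data known to be normal.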

Applications of anomaly detection

Anomaly detection has widespread applications beyond our example. Here are a few real-world scenarios:

  • Healthcare: Implementing anomaly detection in medical data can help identify unusual patterns in patient records, facilitating early disease detection.

  • Educational apps: In kids' learning apps, this approach can be used to detect anomalies in writing. If a child writes a letter incorrectly, the system can generate an alert, promoting effective learning.

  • Daily life security: Imagine using anomaly detection in your home security system. If someone unknown to the household enters, the system can trigger an alert.

In conclusion, anomaly detection with autoencoders is a powerful tool with diverse applications, contributing to enhanced security, healthcare, and educational experiences. Dive into the code, explore, and unleash the potential of anomaly detection in your projects!

