Introduction to graph neural networks (GNN)

Graph neural networks (GNN)

A graph neural network (GNN) is a type of neural network that operates directly on the graph structure.

Design pipeline

The pipeline consists of the following four steps:

  • Identify the graph structure

  • Determine the type and scale of the graph

  • Design the loss function

  • Build the model using computational modules

The typical architecture of GNN is illustrated as follows:

GNN architecture

Details of each design step are provided below.

Identifying the graph structure

First, we have to determine the graph structure in the application. Typically, there are two possibilities:

  • In structural scenarios, the graph structure is explicit in the applications, such as applications on molecules, physical systems, and knowledge graphs.

  • In non-structural scenarios, graphs are implicit. We must first construct the graph from the task, such as generating a fully-connected "word" graph for text or a scene graph for an image.

After we obtain the graph, the subsequent design process aims to determine the best GNN model for this specific graph.
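For a non-structural scenario, the graph must first be built from the raw data. As a minimal sketch (the helper name and representation are illustrative, not from any specific library), a fully-connected "word" graph for a piece of text can be constructed as an adjacency list:

```python
# Minimal sketch: turn a sentence into a fully-connected "word" graph,
# stored as an adjacency list {word: set of all other words}.
def build_word_graph(sentence):
    words = sentence.split()
    # Every word is connected to every other word (no self-loops).
    return {w: set(words) - {w} for w in words}

graph = build_word_graph("graphs model relations")
```

Real systems typically use learned or syntactic edges (e.g., dependency parses) instead of full connectivity, but the construction step is the same in spirit.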

Determining the type and scale of the graph

More complex graph types can encode richer information about nodes and their connections.

Graphs are typically classified as:

  • Directed/undirected graphs

  • Static/dynamic graphs

  • Homogeneous/heterogeneous graphs

Note: These categories are orthogonal, which means they can be combined. For instance, one can deal with a dynamic directed heterogeneous graph. There are various other graph types developed for specific purposes, such as signed graphs and hypergraphs.

There is no clear benchmark of graph scale that separates a "small" graph from a "large" one. The criterion keeps evolving with the advancement of computation devices, for example, the speed and memory of GPUs.
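To make the directed/undirected distinction concrete, here is a toy example (plain nested lists, purely illustrative) storing the same edge list as a directed and as an undirected adjacency matrix:

```python
# The same three-node edge list stored two ways.
edges = [(0, 1), (1, 2)]
n = 3

directed = [[0] * n for _ in range(n)]
undirected = [[0] * n for _ in range(n)]
for u, v in edges:
    directed[u][v] = 1      # one direction only
    undirected[u][v] = 1    # both directions: the matrix is symmetric
    undirected[v][u] = 1
```

An undirected adjacency matrix is always symmetric; a directed one generally is not, and that asymmetry is exactly the extra information a directed graph carries.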

Designing the loss function

The loss function depends on the task type and the training setting. There are typically three types of graph learning tasks:

  • Node-level tasks emphasize nodes and include node classification, node regression, and node clustering. Node classification attempts to classify nodes into different groups, whereas node regression predicts a continuous value for each node. Node clustering attempts to partition nodes into distinct groups, with similar nodes placed in the same group.

  • Edge-level tasks include edge classification and link prediction, which require the model to categorize edge types or predict whether an edge exists between two given nodes.

  • Graph-level tasks include graph classification, regression, and matching, which require the model to learn graph representations.

From a supervision aspect, we may classify graph learning tasks into three distinct training settings:

  • Supervised setting

  • Semi-supervised setting

  • Unsupervised setting

We can design a specific loss function for the task based on the task type and training setting. For instance, the cross-entropy loss can be employed for the labeled nodes in the training set of a semi-supervised classification task at the node level.
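The semi-supervised example above can be sketched in a few lines: cross-entropy is computed only over the nodes carried by a label mask. All inputs here are made-up toy values, and real implementations would use a framework's built-in loss:

```python
import math

# Cross-entropy averaged over labeled nodes only (semi-supervised setting).
def masked_cross_entropy(probs, labels, labeled_mask):
    total, count = 0.0, 0
    for p, y, labeled in zip(probs, labels, labeled_mask):
        if labeled:
            total += -math.log(p[y])  # -log p(true class)
            count += 1
    return total / count

probs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]  # per-node class probabilities
labels = [0, 1, 0]                             # ground-truth classes
mask = [True, True, False]                     # only the first two nodes are labeled
loss = masked_cross_entropy(probs, labels, mask)
```

The unlabeled third node contributes nothing to the loss, yet it still influences training through message passing, which is what makes the setting semi-supervised.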

Building the model using computational modules

Building the model requires computational modules. Some examples of commonly used computational modules include:

  • The propagation module passes information between nodes so that the aggregated data captures both feature and topological information. Convolution and recurrent operators are typically employed in propagation modules to aggregate information from neighbors, while the skip connection operation is used to gather information from earlier node representations and mitigate the over-smoothing problem.

  • When graphs are large, sampling modules are typically required to carry out propagation efficiently. The sampling module is frequently coupled with the propagation module.

  • Pooling modules are essential for extracting information from nodes when we need representations of high-level subgraphs or graphs.

These computational modules are often combined to build a typical GNN model.
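As a hedged sketch of the propagation module, one aggregation step can be written as each node averaging its neighbors' feature vectors (a convolution-style aggregation). The graph and features below are toy values, and a real GNN layer would also apply learned weights and a nonlinearity:

```python
# One propagation step: each node's new features are the mean of its
# neighbors' features. adj: {node: [neighbors]}, feats: {node: [floats]}.
def propagate(adj, feats):
    new_feats = {}
    for v, nbrs in adj.items():
        dim = len(feats[v])
        new_feats[v] = [sum(feats[u][i] for u in nbrs) / len(nbrs)
                        for i in range(dim)]
    return new_feats

adj = {0: [1, 2], 1: [0], 2: [0]}
feats = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [2.0, 2.0]}
out = propagate(adj, feats)  # node 0 receives the mean of nodes 1 and 2
```

Stacking several such steps lets information travel multiple hops, which is also why too many steps cause the over-smoothing the skip connection is meant to counteract.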

Algorithm

The algorithm described below is presented in the initial proposal of GNN and is typically referred to as the original GNN.

  • In the node classification problem, each node v is identified by its features x_v and linked with a ground-truth label t_v. The goal is to use the labeled nodes in a partially labeled graph G to predict the labels of the unlabeled nodes. The model learns to represent each node with a d-dimensional vector (state) h_v containing information about its neighbors. Specifically, it is expressed as the following:

h_v = f(x_v, x_co[v], h_ne[v], x_ne[v])

Here, h_ne[v] represents the states (embeddings) of the neighboring nodes of v, x_ne[v] refers to the features of those neighboring nodes, and x_co[v] denotes the features of the edges connected to v.

  • The transition function f is responsible for projecting these inputs into a d-dimensional space. Because we are searching for a unique solution for h_v, we can apply the Banach fixed-point theorem and rewrite the equation above as an iteratively updating process:

H^(t+1) = F(H^t, X)

This operation is also known as message passing or neighborhood aggregation. Here, H and X represent the concatenation of all the states h and all the features x, respectively.

  • The GNN output is computed by passing the state h_v and the feature x_v to an output function g:

o_v = g(h_v, x_v)

  • Both f and g can be implemented as feed-forward fully-connected neural networks. The loss, which can be optimized by gradient descent, sums the squared error over the p supervised nodes:

loss = sum_{i=1}^{p} (t_i - o_i)^2
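The fixed-point iteration above can be sketched numerically. In this toy example (all names and values illustrative), a damped neighbor-averaging update stands in for the learned transition function f; because it is a contraction, the states converge to a unique fixed point, as the Banach theorem guarantees:

```python
# Toy fixed-point iteration for scalar node states:
#   h_v = alpha * x_v + (1 - alpha) * mean(h_u for neighbors u)
# The damped averaging is a stand-in for the learned transition function f.
def iterate_states(adj, x, alpha=0.5, tol=1e-6, max_iters=1000):
    h = dict(x)  # initialize states from node features
    for _ in range(max_iters):
        new_h = {v: alpha * x[v] +
                    (1 - alpha) * sum(h[u] for u in nbrs) / len(nbrs)
                 for v, nbrs in adj.items()}
        if max(abs(new_h[v] - h[v]) for v in h) < tol:
            return new_h  # states stopped changing: fixed point reached
        h = new_h
    return h

adj = {0: [1], 1: [0]}   # two nodes connected to each other
x = {0: 1.0, 1: 0.0}     # scalar node features
h = iterate_states(adj, x)
```

For this two-node graph the fixed point can be solved by hand (h_0 = 2/3, h_1 = 1/3), which the iteration recovers; the original GNN learns f instead of fixing it, but the convergence mechanism is the same.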

Real-time applications

Many real-time applications of GNN have appeared since its inception. Here are a few of the most notable:

  • GNNs can help with a variety of natural language processing tasks, including sentiment classification, text classification, and sequence labeling. They are also used in social network analysis to forecast similar posts and give users appropriate content recommendations.

  • Computer vision is a broad field that has grown rapidly in recent years thanks to deep learning, in areas such as image classification and object detection, where convolutional neural networks are the most commonly used models. Even though GNN applications in computer vision are still in their infancy, they hold immense potential in the coming years.

  • Another scientific application of GNNs is predicting pharmaceutical adverse effects and disease categorization. GNNs are also being used to investigate the structure of chemical and molecular graphs.

In addition to those mentioned earlier, GNNs have a wide range of applications and have also been explored in areas such as recommender systems and social network research.


Copyright ©2025 Educative, Inc. All rights reserved