
Introduction to Residual Networks

Last Updated : 07 Apr, 2025

Recent years have seen tremendous progress in image processing and recognition, and deep neural networks have become deeper and more complex. Adding more layers to a neural network can make it more powerful for image-related tasks, but beyond a point it can also cause the network to lose accuracy. That is where Residual Networks (ResNets) come into play.

Deep learning practitioners add many layers in order to extract increasingly abstract features from complex images: the first layers may detect edges, while later layers may detect recognizable shapes, such as the tires of a car. However, if we add more than about 30 layers, the network's performance suffers and its accuracy drops. This contradicts the intuition that adding layers should always make a neural network better. The degradation is not due to overfitting, which could be addressed with dropout and regularization techniques; it is mainly caused by the well-known vanishing gradient problem.

Core Idea of Residual Networks

Residual Networks were introduced to tackle the vanishing gradient problem. The key innovation behind ResNet is the concept of residual learning, which allows the network to bypass some layers using skip connections or identity connections.

The underlying idea is captured by the following equation:

y = F(x) + x

In this equation:

  • F(x) represents the output from a series of convolutional layers.
  • x is the input to the residual block.

This addition is what gives rise to the term "residual learning": the stacked layers only need to learn the residual F(x) = y - x rather than the full mapping. ResNet, the first popular architecture built on this idea, enabled very deep networks: the ResNet-152 model has 152 layers and won the 2015 ILSVRC ImageNet classification competition. Remarkably, it did so with fewer parameters than the earlier VGG-19 model, demonstrating the efficiency of residual learning.
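
As a minimal sketch of this idea (a toy example, not the actual ResNet implementation), the residual branch F can be any small stack of layers; the block's output is simply the branch output plus the original input:

```python
import torch
import torch.nn as nn

# Toy residual branch F(x): two linear layers stand in for the
# convolutional layers used in a real ResNet.
branch = nn.Sequential(
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Linear(64, 64),
)

x = torch.randn(8, 64)   # a batch of 8 input vectors
y = branch(x) + x        # y = F(x) + x  -- the residual (skip) connection
print(y.shape)           # torch.Size([8, 64])
```

Because the addition requires F(x) and x to have the same shape, the branch in this toy example keeps the feature dimension unchanged; the next section describes how ResNet handles the case where dimensions change.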

Architecture of Residual Networks

A residual network consists of residual units or blocks connected by skip connections (also called identity connections). These skip connections allow the output of an earlier layer to be added directly to the output of a later layer, bypassing the layers in between. The key points are:

  • The output from the previous layer is added to the output of the subsequent layer in the residual block.
  • The "skip" can span one, two, or even three layers.
  • Since the convolution operations may change the dimensions of the feature map (spatial size or number of channels), a 1x1 convolution layer is applied on the skip path so that the input matches the block's output before the addition.

Each residual block typically consists of:

  1. A 3x3 convolution layer followed by a batch normalization layer and a ReLU activation.
  2. Another 3x3 convolution layer and a batch normalization layer.
  3. The skip connection bypasses both convolution layers and adds the input to the block's output just before the final ReLU activation (see the sketch after this list).
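
A rough PyTorch sketch of such a block is given below. It follows the structure listed above but is written for illustration, not taken from any official implementation; the class name ResidualBlock and its arguments are our own choices. The 1x1 convolution on the skip path is used only when the stride or channel count changes the dimensions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResidualBlock(nn.Module):
    """Basic residual block: 3x3 conv + BN + ReLU, 3x3 conv + BN, then skip + ReLU."""

    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3,
                               stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)

        # 1x1 convolution on the skip path, applied only when the
        # spatial size or the number of channels changes.
        self.projection = None
        if stride != 1 or in_channels != out_channels:
            self.projection = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1,
                          stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )

    def forward(self, x):
        identity = x if self.projection is None else self.projection(x)
        out = F.relu(self.bn1(self.conv1(x)))   # 3x3 conv -> BN -> ReLU
        out = self.bn2(self.conv2(out))         # 3x3 conv -> BN
        out = out + identity                    # skip connection adds the input
        return F.relu(out)                      # ReLU after the addition
```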

These residual blocks are stacked to form a deeper residual network. The efficiency of the ResNet architecture has been demonstrated through extensive comparison with other Convolutional Neural Network (CNN) architectures: among the CNNs tested, ResNet achieved the lowest top-5 error rate of 3.57% on the ImageNet classification task, surpassing other architectures and even human-level performance in some cases.
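
For illustration, stacking the ResidualBlock sketched above into one stage of a deeper network might look like the following (the block counts and channel widths here are arbitrary choices, not the published ResNet-34/50 configuration):

```python
import torch
import torch.nn as nn

# One stage of a residual network: the first block downsamples and widens
# the feature map (triggering the 1x1 projection), the rest keep its shape.
stage = nn.Sequential(
    ResidualBlock(64, 128, stride=2),
    ResidualBlock(128, 128),
    ResidualBlock(128, 128),
)

x = torch.randn(1, 64, 56, 56)   # dummy feature map: N x C x H x W
print(stage(x).shape)            # torch.Size([1, 128, 28, 28])
```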

ResNet vs Other Architectures

When comparing a 34-layer ResNet with VGG-19 and a 34-layer plain network, the ResNet consistently outperforms the others in accuracy, even as the number of layers increases. This is a significant improvement over traditional deep networks, where adding more layers can cause performance to degrade.

Figure: Comparison of the 34-layer ResNet architecture with VGG-19 and a 34-layer plain network.

In fact, ResNet demonstrated a substantial advantage in handling the vanishing gradient and exploding gradient problems that typically arise when deepening a network. By adding identity shortcuts or skip connections, ResNet facilitates smoother gradient flow during backpropagation, which helps in training very deep networks.
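
To make the gradient argument concrete: differentiating y = F(x) + x with respect to x gives dF/dx + 1, so the identity path always contributes a gradient of 1 even if the branch's gradient is tiny. The toy check below (with a deliberately contrived branch, chosen only for illustration) confirms this with autograd:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)

# Contrived residual branch with a very small gradient: dF/dx = 0.02 * x.
def branch(x):
    return 0.01 * x ** 2

y = branch(x) + x   # residual form: y = F(x) + x
y.backward()

# dy/dx = dF/dx + 1 = 0.04 + 1. The "+ 1" contributed by the identity
# path keeps the gradient from vanishing even when dF/dx is near zero.
print(x.grad)       # tensor(1.0400)
```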

Practical Uses of ResNet

Although a ResNet model with thousands of layers is not commonly used in practice due to computational constraints and diminishing returns, ResNet has revolutionized deep learning by providing a framework for building highly effective deep networks. ResNet architectures have proven to be powerful for both image classification and object detection, and their ability to scale without losing performance makes them an ideal choice for a variety of applications.

The graphs below compare the accuracy of a plain network with that of a residual network. Note that the 34-layer plain network's accuracy starts to saturate earlier than that of the corresponding ResNet.

