Guozhen AI: Global AI field notes and model intelligence

English translation

ResNet Architecture Explained: Deep Residual Networks

Category: Neural Networks

Lesson #7

Detailed Architecture Diagram of ResNet

The key insight of ResNet is to give gradients a shorter path to flow backward through the network. Skip connections are not mere embellishments: they determine whether deep networks can be trained stably. This article focuses on architecture. It first clarifies the data flow, core modules, and output layers, and only then turns to the underlying formulas and implementation code.

Hands-on Verification Diagram for ResNet Architecture

For each residual block, I check whether the input and output channel dimensions match and, if they don't, whether a projection branch is implemented on the shortcut. A mismatch here typically raises a shape error at the element-wise addition, as early as the first forward pass.
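The check described above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the article's own verification script: the tensor shapes (64 to 128 channels, 56×56 input, stride 2) and the variable names are assumptions chosen for the example.

```python
import torch
import torch.nn as nn

# Assumed shapes: a block that doubles the channels (64 -> 128) and halves
# the spatial size (stride 2).
x = torch.randn(1, 64, 56, 56)
main_path = nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1, bias=False)
out = main_path(x)  # shape (1, 128, 28, 28)

# Without a projection, the identity branch keeps the input shape, so the
# element-wise addition fails with a shape error.
try:
    _ = out + x
    mismatch_detected = False
except RuntimeError:
    mismatch_detected = True

# Projection shortcut: a strided 1x1 convolution plus batch norm that maps
# the identity branch to the same shape as the main path.
projection = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=1, stride=2, bias=False),
    nn.BatchNorm2d(128),
)
identity = projection(x)
result = out + identity

print(mismatch_detected, tuple(result.shape))
```

A module built like `projection` is what would typically be passed as the `downsample` argument of a residual block whose stride or channel count changes.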

In the previous article on training techniques for BERT, we discussed how the BERT model leverages its architecture and self-supervised learning to extract rich features from massive text corpora, which enables strong performance across diverse downstream tasks. Next, we turn to ResNet, a widely adopted deep learning architecture in computer vision, and analyze its network structure and operating principles.

Introduction to ResNet

ResNet (Residual Network) is a deep convolutional neural network proposed by Kaiming He et al. in 2015. It won the ILSVRC 2015 image classification task on ImageNet. Its breakthrough is residual learning, which makes it practical to train extremely deep networks (e.g., 152 layers or more).

ResNet Structural Decision Card

When studying ResNet’s architecture, begin by examining residual blocks, skip connections, identity mappings, and changes in layer depth. Only by understanding how information bypasses complex layers can you grasp why ResNet excels in deep-network scenarios.

Network Architecture

The core idea of ResNet is to use skip connections to address the vanishing gradient and degradation problems that arise when training very deep neural networks. In plain CNNs without shortcuts, adding layers beyond a certain depth often lowers training accuracy, not just test accuracy, so the drop cannot be attributed to overfitting. ResNet overcomes this limitation via the following structural design:

Neural Network Reading Roadmap Card

Before reading “Detailed Architecture of ResNet”, first trace the visual path in the diagram—from problem statement to final result. After reading, cross-check against the main text to confirm whether you could reconstruct the architecture yourself.

Residual Block

The fundamental building unit of ResNet is the residual block. Each block comprises two or three convolutional layers, plus a skip connection linking input to output. Its mathematical formulation is:

$$\mathcal{H}(x) = \mathcal{F}(x) + x$$

Here, $\mathcal{H}(x)$ denotes the block's output, $\mathcal{F}(x)$ represents the transformation performed by the convolutional layers, and $x$ is the input to the block. This formulation lets the network learn the residual $\mathcal{F}(x) = \mathcal{H}(x) - x$ (i.e., the difference between the desired output and the input), rather than learning the full mapping directly.
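One way to see why this helps training: differentiating the formula with respect to the input gives the Jacobian of the convolutional branch plus an identity term, so the gradient always has a direct route past the layers. The toy comparison below is only a sketch under stated assumptions. It uses small fully connected tanh layers instead of convolutions, and the function name `input_grad_norm`, the depth of 50, and the width of 16 are all hypothetical choices for illustration.

```python
import torch
import torch.nn as nn


def input_grad_norm(depth: int, use_skip: bool, width: int = 16) -> float:
    """Return the gradient norm at the input of a toy stack of tanh layers."""
    torch.manual_seed(0)  # same weights and input for both variants
    layers = [nn.Linear(width, width) for _ in range(depth)]
    x = torch.randn(1, width, requires_grad=True)

    h = x
    for layer in layers:
        f = torch.tanh(layer(h))      # F(h): the learned transformation
        h = f + h if use_skip else f  # H(h) = F(h) + h when the skip is on

    h.sum().backward()
    return x.grad.norm().item()


plain = input_grad_norm(depth=50, use_skip=False)
residual = input_grad_norm(depth=50, use_skip=True)
print(f"plain stack:    {plain:.3e}")
print(f"residual stack: {residual:.3e}")
```

With default initialization, the gradient reaching the input of the plain stack is expected to be far smaller than in the residual stack. The exact numbers depend on the seed and PyTorch version, so treat them as qualitative only.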

Key Implementation Code for Residual Blocks

A simple PyTorch implementation of a basic ResNet residual block is shown below:

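import torch
import torch.nn as nn

class BasicBlock(nn.Module):
    # Ratio of the block's output channels to out_channels (1 for the basic block).
    expansion = 1

    def __init__(self, in_channels, out_channels, stride=1, downsample=None):
        super().__init__()
        # First 3x3 conv; a stride > 1 reduces the spatial resolution here.
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)
        # Second 3x3 conv keeps both the resolution and the channel count.
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        # Optional projection applied to the shortcut when shapes differ.
        self.downsample = downsample

    def forward(self, x):
        identity = x

        # Main path F(x): conv -> BN -> ReLU -> conv -> BN
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)

        # Match the shortcut to the main path's shape if needed.
        if self.downsample is not None:
            identity = self.downsample(x)

        # Skip connection F(x) + x, followed by the final activation.
        out += identity
        out = self.relu(out)

        return out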

Network Depth Variants

ResNet models come in multiple depths: ResNet-18, ResNet-34, ResNet-50, ResNet-101, and ResNet-152. ResNet-18 and ResNet-34 are built from the basic two-layer block, while the deeper variants (ResNet-50 and above) adopt a bottleneck design to reduce computational cost and parameter count. In these versions, each residual block consists of three layers: a 1×1 convolution that reduces the channel count, a 3×3 convolution, and another 1×1 convolution that restores it.
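The bottleneck design can be sketched as follows. This is a minimal illustration rather than a reference implementation. The class name `Bottleneck`, the `mid_channels` parameter, and the helper `conv_weight_count` are assumptions made for the example, and placing the stride on the 3×3 convolution is one common convention (the original paper places it on the first 1×1 convolution). The per-stage block counts are the ImageNet configurations from the original ResNet paper.

```python
import torch
import torch.nn as nn


class Bottleneck(nn.Module):
    expansion = 4  # output channels = mid_channels * expansion

    def __init__(self, in_channels, mid_channels, stride=1, downsample=None):
        super().__init__()
        out_channels = mid_channels * self.expansion
        # 1x1 conv reduces the channel count.
        self.conv1 = nn.Conv2d(in_channels, mid_channels, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(mid_channels)
        # 3x3 conv works on the reduced representation.
        self.conv2 = nn.Conv2d(mid_channels, mid_channels, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(mid_channels)
        # 1x1 conv restores (expands) the channel count.
        self.conv3 = nn.Conv2d(mid_channels, out_channels, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample

    def forward(self, x):
        identity = x if self.downsample is None else self.downsample(x)
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        return self.relu(out + identity)


def conv_weight_count(module: nn.Module) -> int:
    """Count convolution weights only (batch norm parameters are ignored)."""
    return sum(m.weight.numel() for m in module.modules() if isinstance(m, nn.Conv2d))


block = Bottleneck(in_channels=256, mid_channels=64)
y = block(torch.randn(2, 256, 14, 14))

# For comparison: two plain 3x3 convolutions at the full 256 channels.
two_3x3 = nn.Sequential(
    nn.Conv2d(256, 256, kernel_size=3, padding=1, bias=False),
    nn.Conv2d(256, 256, kernel_size=3, padding=1, bias=False),
)
print(conv_weight_count(block))    # 69632
print(conv_weight_count(two_3x3))  # 1179648

# (conv layers per block, blocks per stage) for each variant.
CONFIGS = {
    "ResNet-18": (2, [2, 2, 2, 2]),
    "ResNet-34": (2, [3, 4, 6, 3]),
    "ResNet-50": (3, [3, 4, 6, 3]),
    "ResNet-101": (3, [3, 4, 23, 3]),
    "ResNet-152": (3, [3, 8, 36, 3]),
}
# Depth = conv layers inside the blocks + stem conv + final fully connected layer.
depths = {name: per_block * sum(blocks) + 2 for name, (per_block, blocks) in CONFIGS.items()}
print(depths)
```

At 256 channels, the bottleneck block uses roughly 17 times fewer convolution weights than two full-width 3×3 convolutions, which is what makes the 50-layer and deeper variants affordable. The depth formula also explains the model names: the count includes only weighted layers on the main path, not the projection shortcuts.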

ResNet Architecture Deep-Dive Application Recap Card

If you haven’t fully internalized “Detailed Architecture of ResNet”, revisit this card and walk through its four actionable steps.

ResNet Architecture Deep-Dive Application Check Card

When reviewing “Detailed Architecture of ResNet”, avoid jumping straight into large-scale projects. Instead, start with a single, simple example to verify whether your mental model of the core workflow is clear.

Summary

By introducing residual learning and skip connections, the ResNet architecture significantly alleviates the degradation and vanishing gradient problems in training deep networks. This makes much greater depth practical: ResNet delivered state-of-the-art results on ImageNet when it was introduced and remains a common backbone across numerous vision tasks.

The next article will analyze the strengths and limitations of ResNet, exploring its real-world behavior and potential avenues for improvement. By comparing BERT and ResNet, we gain deeper insight into how deep learning models adapt to distinct application domains.
