ResNet Architecture Explained: Deep Residual Networks
The key insight of ResNet is to give gradients a shorter path to flow backward through the network. Skip connections are not mere embellishments: they determine whether deep networks can be trained stably. This article focuses on architecture. First clarify the data flow, core modules, and output layers, and only then revisit the underlying formulas or implementation code.
When reviewing a ResNet implementation, first check whether the input and output channel counts (and spatial sizes) match for each residual block and, if they do not, whether a projection branch is implemented on the skip path. A mismatch here typically raises a shape error at the residual addition as soon as the first forward pass runs.
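The shape check described above can be done on paper before any tensor is created. The following is a minimal sketch in plain Python, assuming a basic block whose main branch is two 3×3 convolutions with padding 1 and whose first convolution carries the stride. The helper names `conv_out_size` and `needs_projection` are hypothetical and introduced only for this illustration.

```python
def conv_out_size(size, kernel, stride, padding):
    """Spatial output size of a convolution (standard floor formula, dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def needs_projection(in_shape, out_channels, stride):
    """Return (needs_projection, main_branch_shape) for a basic residual block.

    in_shape is (channels, height, width). Assumption: the main branch is
    two 3x3 convolutions with padding 1, and only the first one is strided.
    """
    _, height, width = in_shape
    out_h = conv_out_size(height, kernel=3, stride=stride, padding=1)
    out_w = conv_out_size(width, kernel=3, stride=stride, padding=1)
    # The second 3x3 convolution (stride 1, padding 1) keeps the spatial size.
    out_shape = (out_channels, out_h, out_w)
    # The identity skip only works when both shapes are exactly equal.
    return out_shape != tuple(in_shape), out_shape


# Same width, stride 1: the plain identity skip is enough.
print(needs_projection((64, 56, 56), out_channels=64, stride=1))
# Wider and strided: the skip path needs a projection to match.
print(needs_projection((64, 56, 56), out_channels=128, stride=2))
```

When the first element of the result is true, the skip path must change the channel count and the spatial size in the same way as the main branch, which is the job of the projection branch.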
In the previous article on training techniques for BERT, we discussed how the BERT model leverages its unique architecture and self-supervised learning to extract rich features from massive text corpora—enabling strong performance across diverse downstream tasks. Next, we delve into ResNet, a widely adopted deep learning architecture in computer vision, analyzing its network structure and operational principles.
Introduction to ResNet
ResNet (Residual Network) is a deep convolutional neural network proposed by Kaiming He et al. in 2015, and it won the ImageNet (ILSVRC) 2015 classification challenge. Its breakthrough lies in introducing residual learning, which makes it practical to train very deep networks (e.g., 152 layers on ImageNet).
When studying ResNet’s architecture, begin by examining residual blocks, skip connections, identity mappings, and changes in layer depth. Only by understanding how information bypasses complex layers can you grasp why ResNet excels in deep-network scenarios.
Network Architecture
The core idea of ResNet is to address the vanishing gradient and degradation problems encountered when training very deep neural networks by introducing skip connections. In plain (non-residual) CNNs, increasing depth beyond a certain point often lowers training accuracy, not just test accuracy, so the problem is not simply overfitting. ResNet overcomes this limitation via the following structural design:
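To make the gradient argument concrete, here is a toy sketch in plain Python. It assumes each "layer" is a scalar function whose local derivative is a constant `layer_slope`, which is a deliberate oversimplification and not a model of a real convolutional layer. The function names are hypothetical.

```python
def plain_gradient(layer_slope, depth):
    """d(output)/d(input) for `depth` stacked scalar layers x -> layer_slope * x."""
    grad = 1.0
    for _ in range(depth):
        grad *= layer_slope  # chain rule: multiply the local derivatives
    return grad


def residual_gradient(layer_slope, depth):
    """Same stack, but every layer is x -> x + layer_slope * x (a skip connection)."""
    grad = 1.0
    for _ in range(depth):
        grad *= 1.0 + layer_slope  # the identity path contributes the constant 1
    return grad


depth = 50
slope = 0.1
print("plain stack:   ", plain_gradient(slope, depth))     # shrinks toward zero
print("residual stack:", residual_gradient(slope, depth))  # stays away from zero
```

The only point of the toy is that the identity term keeps each factor near 1 instead of near 0. In a real network, normalization layers and initialization also shape how gradients scale with depth.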
Residual Block
The fundamental building unit of ResNet is the residual block. Each block comprises two or three convolutional layers, plus a skip connection linking input to output. Its mathematical formulation is:
$$y = F(x) + x$$
Here, $y$ denotes the block's output, $F(x)$ represents the transformation performed by the convolutional layers, and $x$ is the input to the block. This formulation allows the network to learn the residual $F(x) = y - x$ (i.e., the difference between desired output and input), rather than directly learning the full mapping.
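The residual formulation (block output = transformation of the input + the input itself) can be sketched without any deep learning library. The example below uses plain Python lists in place of tensors. `residual_block`, `zero_transform`, and `small_correction` are hypothetical names used only for illustration.

```python
def residual_block(x, transform):
    """Return transform(x) + x element-wise for a list of floats."""
    fx = transform(x)
    if len(fx) != len(x):
        # Mirrors the shape requirement of the skip connection.
        raise ValueError("transform must preserve the input length")
    return [f + xi for f, xi in zip(fx, x)]


def zero_transform(x):
    """A transformation that has learned nothing: F(x) = 0."""
    return [0.0 for _ in x]


def small_correction(x):
    """A transformation that nudges the input by 10%."""
    return [0.1 * xi for xi in x]


x = [1.0, -2.0, 3.0]
print(residual_block(x, zero_transform))    # same values as x
print(residual_block(x, small_correction))  # x plus a small correction
```

If the transformation outputs zeros, the block reduces to the identity mapping, so an extra block does not have to make the network worse. The layers only need to learn a correction on top of the input.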
Key Implementation Code for Residual Blocks
A simple PyTorch implementation of a basic ResNet residual block is shown below:
import torch
import torch.nn as nn


class BasicBlock(nn.Module):
    # Output channels = out_channels * expansion (1 for the basic block).
    expansion = 1

    def __init__(self, in_channels, out_channels, stride=1, downsample=None):
        super().__init__()
        # The first 3x3 conv carries the stride, so it may shrink the feature map.
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)
        # The second 3x3 conv keeps both the spatial size and the channel count.
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        # Optional projection applied to the skip path when shapes differ.
        self.downsample = downsample

    def forward(self, x):
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        # Project the input so it matches the main branch before the addition.
        if self.downsample is not None:
            identity = self.downsample(x)

        out += identity
        out = self.relu(out)
        return out
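The `downsample` argument is where a projection branch would be supplied. The sketch below assumes the common choice of a 1×1 strided convolution followed by batch normalization. It builds the main branch and the projection separately, not through the class above, so that it runs on its own. The channel counts and input size are arbitrary illustration values.

```python
import torch
import torch.nn as nn

in_channels, out_channels, stride = 64, 128, 2

# Main branch of a basic block: two 3x3 convs, the first one strided.
main_branch = nn.Sequential(
    nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1, bias=False),
    nn.BatchNorm2d(out_channels),
    nn.ReLU(inplace=True),
    nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1, bias=False),
    nn.BatchNorm2d(out_channels),
)

# Projection for the skip path: 1x1 conv with the same stride, then BN.
# A module like this is what one would pass as `downsample`.
projection = nn.Sequential(
    nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride, bias=False),
    nn.BatchNorm2d(out_channels),
)

x = torch.randn(2, in_channels, 56, 56)

# Without the projection, the addition fails because the shapes differ.
try:
    _ = main_branch(x) + x
except RuntimeError as err:
    print("identity skip fails:", type(err).__name__)

# With the projection, both branches agree and the sum is well defined.
out = torch.relu(main_branch(x) + projection(x))
print(tuple(out.shape))
```

The projection is only needed in blocks that change the channel count or the spatial resolution. Every other block can keep the parameter-free identity skip.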
Network Depth Variants
ResNet models come in multiple depths: ResNet-18, ResNet-34, ResNet-50, ResNet-101, and ResNet-152. ResNet-18 and ResNet-34 use the basic two-layer block, while the deeper variants (ResNet-50 and above) adopt a bottleneck design to reduce computational cost and parameter count. In these versions, each residual block consists of three layers: a 1×1 convolution that reduces the channel count, a 3×3 convolution that operates at the reduced width, and another 1×1 convolution that restores the channel count.
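The naming and the cost argument can both be checked with simple arithmetic. The sketch below, in plain Python, uses the per-stage block counts from the original ResNet paper, counts only convolution weights (no batch-norm parameters or biases), and compares a basic block and a bottleneck block at the same 256-channel width, which is an illustrative assumption and not a pairing taken from a specific model. The helper names are hypothetical.

```python
def conv_params(in_channels, out_channels, kernel_size):
    """Weight count of a bias-free 2D convolution."""
    return in_channels * out_channels * kernel_size * kernel_size


def basic_block_params(channels):
    """Two 3x3 convolutions at full width."""
    return 2 * conv_params(channels, channels, 3)


def bottleneck_params(channels, reduction=4):
    """1x1 reduce, 3x3 at reduced width, 1x1 restore."""
    mid = channels // reduction
    return (conv_params(channels, mid, 1)
            + conv_params(mid, mid, 3)
            + conv_params(mid, channels, 1))


# (blocks per stage, convolutions per block) for each variant.
VARIANTS = {
    "ResNet-18": ([2, 2, 2, 2], 2),
    "ResNet-34": ([3, 4, 6, 3], 2),
    "ResNet-50": ([3, 4, 6, 3], 3),
    "ResNet-101": ([3, 4, 23, 3], 3),
    "ResNet-152": ([3, 8, 36, 3], 3),
}


def weighted_layers(blocks, convs_per_block):
    """Count block convolutions plus the stem convolution and the final FC layer."""
    return sum(blocks) * convs_per_block + 2


for name, (blocks, convs) in VARIANTS.items():
    print(name, weighted_layers(blocks, convs))

print("basic block at 256 channels:     ", basic_block_params(256))
print("bottleneck block at 256 channels:", bottleneck_params(256))
```

The layer count in each name comes out of the block counts directly, and at equal width the bottleneck block needs far fewer convolution weights than two full-width 3×3 convolutions.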
When reviewing this architecture, avoid jumping straight into large-scale projects. Instead, start with a single, simple example, such as passing one tensor through one residual block and checking its output shape, to verify that your mental model of the data flow is clear.
Summary
By introducing residual learning and skip connections, the ResNet architecture significantly alleviates key challenges in training deep networks—enabling greater depth and delivering state-of-the-art performance across numerous vision tasks.
The next article will analyze the strengths and limitations of ResNet, exploring its real-world behavior and potential avenues for improvement. By comparing BERT and ResNet, we gain deeper insight into how deep learning models adapt to distinct application domains.