[Illustrated] PyTorch Transposed Convolution

In deep learning models, convolution layers are among the most commonly used basic operations, so understanding convolution well is essential. Convolution is a kind of linear transformation, and in particular a sparsely connected one (unlike a fully connected layer, which is a densely connected linear transformation).

A convolution operation involves two tensors:

  • the input tensor
  • the weight tensor of the linear transformation (also called the convolution kernel, or filter)

In PyTorch, convolution operations fall into two classes: ordinary convolution and transposed convolution. Each comes in three variants: 1-D, 2-D, and 3-D. Convolution & transposed convolution share a common parent class, _ConvNd. This class is private, and its code lives in the file torch/nn/modules/conv.py.
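The shared parent class can be checked directly. A minimal sketch (the layer sizes here are arbitrary; _ConvNd is a private class, so its import path follows the source file mentioned above and may differ across PyTorch versions):

```python
import torch.nn as nn
from torch.nn.modules.conv import _ConvNd  # private class, defined in torch/nn/modules/conv.py

# Both the ordinary and the transposed 2-D convolution modules
# inherit from the shared _ConvNd parent class.
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3)
deconv = nn.ConvTranspose2d(in_channels=8, out_channels=3, kernel_size=3)

print(isinstance(conv, _ConvNd))    # True
print(isinstance(deconv, _ConvNd))  # True
```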

The _ConvNd parent class

# _ConvNd parent class

class _ConvNd(in_channels, out_channels, kernel_size, stride, padding, 
			  dilation, transposed, output_padding, 
			  groups, bias, padding_mode)

  • stride: controls the stride for the cross-correlation.
  • padding: controls the amount of implicit zero-padding on both sides for `dilation * (kernel_size - 1) - padding` number of points.
  • output_padding: controls the additional size added to one side of the output shape (used only by the transposed convolutions).
  • dilation: controls the spacing between the kernel points; also known as the à trous algorithm. It is harder to describe in words, but animated visualizations of dilated convolution make it clear.
  • groups: controls the connections between inputs and outputs. in_channels and out_channels must both be divisible by groups.
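How these parameters interact can be seen in the output-size formula for an ordinary convolution along one spatial dimension (this is the formula given in the PyTorch Conv2d documentation); a minimal pure-Python sketch:

```python
def conv_out_size(in_size, kernel_size, stride=1, padding=0, dilation=1):
    # out = floor((in + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1
    return (in_size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

print(conv_out_size(32, kernel_size=3))                        # 30: no padding shrinks the map
print(conv_out_size(32, kernel_size=3, padding=1))             # 32: "same"-style padding for k=3
print(conv_out_size(32, kernel_size=3, stride=2, padding=1))   # 16: stride 2 halves the map
print(conv_out_size(32, kernel_size=3, dilation=2, padding=2)) # 32: dilation widens the kernel's reach
```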

Convolution

# nn.Conv2d convolution

class torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')
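A minimal usage sketch (the layer sizes are arbitrary illustrations): with stride 2 and padding 1, a 3×3 convolution halves each spatial dimension.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, stride=2, padding=1)
x = torch.randn(1, 3, 32, 32)  # (batch, channels, height, width)
y = conv(x)
print(y.shape)  # torch.Size([1, 16, 16, 16])
```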


Transposed convolution

# nn.ConvTranspose2d transposed convolution (a.k.a. "deconvolution")

class torch.nn.ConvTranspose2d(
		in_channels   	: int, 
		out_channels  	: int, 
		kernel_size	  	: Union[T, Tuple[T, T]], 
		stride		  	: Union[T, Tuple[T, T]] = 1, 
		padding		  	: Union[T, Tuple[T, T]] = 0, 
		output_padding	: Union[T, Tuple[T, T]] = 0, 
		groups		  	: int = 1, 
		bias		  	: bool = True, 
		dilation  	  	: int = 1, 
		padding_mode  	: str = 'zeros')
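A sketch of how the transposed convolution reverses the shape change of a convolution with the same hyperparameters (layer sizes are arbitrary). Its output size follows `out = (in - 1)*stride - 2*padding + dilation*(kernel_size - 1) + output_padding + 1`; note that only the spatial shape is restored, not the values.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1)
deconv = nn.ConvTranspose2d(32, 16, kernel_size=3, stride=2, padding=1, output_padding=1)

x = torch.randn(1, 16, 32, 32)
down = conv(x)     # -> (1, 32, 16, 16): stride 2 halves the spatial size
up = deconv(down)  # -> (1, 16, 32, 32): output_padding=1 resolves the stride ambiguity
print(down.shape, up.shape)
```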


