FCOS Paper and Source Code Explained (Part 2)


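Part 1 of this series excerpted and roughly translated the sections of the paper that describe the structure of the FCOS algorithm. This part walks through the FCOS source code.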

The FCOS Project

The FCOS project describes model training as follows:
Training
The following command line will train FCOS_imprv_R_50_FPN_1x on 8 GPUs with Synchronous Stochastic Gradient Descent (SGD):

python -m torch.distributed.launch \
    --nproc_per_node=8 \
    --master_port=$((RANDOM + 10000)) \
    tools/train_net.py \
    --config-file configs/fcos/fcos_imprv_R_50_FPN_1x.yaml \
    DATALOADER.NUM_WORKERS 2 \
    OUTPUT_DIR training_dir/fcos_imprv_R_50_FPN_1x

The key point is that this command invokes tools/train_net.py.
Locate that file in the repository; the code reading starts from train_net.py.

FCOS Code

tools/train_net.py

The key line in main() hands control to train():

model = train(cfg, args.local_rank, args.distributed)

train() begins by calling build_detection_model():

model = build_detection_model(cfg)

build_detection_model() is imported from fcos_core.modeling.detector. Following it leads to fcos_core.modeling.detector.generalized_rcnn.GeneralizedRCNN, a subclass of torch.nn.Module that sets three instance attributes:

self.backbone = build_backbone(cfg)
self.rpn = build_rpn(cfg, self.backbone.out_channels)
self.roi_heads = build_roi_heads(cfg, self.backbone.out_channels)

build_backbone() is imported from fcos_core.modeling.backbone
build_rpn() is imported from fcos_core.modeling.rpn.rpn
build_roi_heads() is imported from fcos_core.modeling.roi_heads.roi_heads

build_backbone()

Start with build_backbone().
Its key line is:

return registry.BACKBONES[cfg.MODEL.BACKBONE.CONV_BODY](cfg)

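The default value can be found in fcos_core/config/defaults.py:
cfg.MODEL.BACKBONE.CONV_BODY → _C.MODEL.BACKBONE.CONV_BODY = "R-50-C4"
This is only the default; a YAML file passed with --config-file may override it. The walkthrough here follows the default.
registry is imported from fcos_core.modeling and leads to fcos_core.utils.registry.Registry
The Registry class carries the following docstring: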

A helper class for managing registering modules, it extends a dictionary
    and provides a register function.

    Eg. creating a registry:
        some_registry = Registry({"default": default_module})

    There're two ways of registering new modules:
    1): normal way is just calling register function:
        def foo():
            ...
        some_registry.register("foo_module", foo)
    2): used as decorator when declaring the module:
        @some_registry.register("foo_module")
        @some_registry.register("foo_module_nickname")
        def foo():
            ...

    Access of module is just like using a dictionary, eg:
        f = some_registry["foo_module"]
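
The behaviour described in this docstring can be sketched in a few lines of plain Python. This is a minimal sketch written for this walkthrough, not the code in fcos_core.utils.registry; the names BACKBONES (a local stand-in for registry.BACKBONES), build_toy_backbone, and the dictionary used in place of the real config object are all hypothetical.

```python
class Registry(dict):
    """A dict that can also fill itself through a method call or a decorator."""

    def register(self, name, module=None):
        # Direct call: some_registry.register("foo_module", foo)
        if module is not None:
            self[name] = module
            return None

        # Decorator use: @some_registry.register("foo_module")
        def decorator(fn):
            self[name] = fn
            return fn  # hand fn back unchanged so decorators can be stacked

        return decorator


BACKBONES = Registry()  # hypothetical stand-in for registry.BACKBONES


@BACKBONES.register("R-50-C4")
@BACKBONES.register("R-50-C5")
def build_toy_backbone(cfg):
    # Hypothetical builder; the real one constructs a ResNet module.
    return "backbone for " + cfg["CONV_BODY"]


cfg = {"CONV_BODY": "R-50-C4"}  # simplified stand-in for the config object
backbone = BACKBONES[cfg["CONV_BODY"]](cfg)  # look up by name, then call
print(backbone)
```

Because the decorator returns the function unchanged, stacking several register calls simply stores the same builder under several keys, which is how one builder can serve multiple backbone names.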

Just above build_backbone() in the same file we find:

@registry.BACKBONES.register("R-50-C4")
@registry.BACKBONES.register("R-50-C5")
@registry.BACKBONES.register("R-101-C4")
@registry.BACKBONES.register("R-101-C5")
def build_resnet_backbone(cfg):
    body = resnet.ResNet(cfg)
    model = nn.Sequential(OrderedDict([("body", body)]))
    model.out_channels = cfg.MODEL.RESNETS.BACKBONE_OUT_CHANNELS
    return model

So, with the default CONV_BODY, build_backbone() → build_resnet_backbone()

resnet.ResNet()

That is, fcos_core.modeling.backbone.resnet.ResNet()

class ResNet(nn.Module):
    def __init__(self, cfg):
        super(ResNet, self).__init__()

        # If we want to use the cfg in forward(), then we should make a copy
        # of it and store it for later use:
        # self.cfg = cfg.clone()

        # Translate string names to implementations
        stem_module = _STEM_MODULES[cfg.MODEL.RESNETS.STEM_FUNC]
        stage_specs = _STAGE_SPECS[cfg.MODEL.BACKBONE.CONV_BODY]
        transformation_module = _TRANSFORMATION_MODULES[cfg.MODEL.RESNETS.TRANS_FUNC]

The constructor first resolves stem_module, stage_specs, and transformation_module.

stem_module
→_STEM_MODULES[cfg.MODEL.RESNETS.STEM_FUNC]
→StemWithFixedBatchNorm(BaseStem), norm_func=FrozenBatchNorm2d
→BaseStem

self.conv1 = Conv2d(
    3, out_channels, kernel_size=7, stride=2, padding=3, bias=False
 )
self.bn1 = norm_func(out_channels)

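Conv2d leads to fcos_core.layers.misc.Conv2d(). It behaves like torch.nn.Conv2d, except that for an empty input it computes the output shape directly and returns an empty tensor of that shape.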

class Conv2d(torch.nn.Conv2d):
    def forward(self, x):
        if x.numel() > 0:
            return super(Conv2d, self).forward(x)
        # get output shape

        output_shape = [
            (i + 2 * p - (di * (k - 1) + 1)) // d + 1
            for i, p, di, k, d in zip(
                x.shape[-2:], self.padding, self.dilation, self.kernel_size, self.stride
            )
        ]
        output_shape = [x.shape[0], self.weight.shape[0]] + output_shape
        return _NewEmptyTensorOp.apply(x, output_shape)
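
The output_shape expression above is the usual convolution size formula applied to each spatial dimension. As a worked example, the following sketch evaluates it for the stem convolution (kernel 7, stride 2, padding 3). The helper name conv_output_size and the 800 x 1333 input size are assumptions chosen for illustration, not values taken from the FCOS code.

```python
def conv_output_size(size, kernel, stride=1, padding=0, dilation=1):
    """Spatial output size of a convolution along one dimension."""
    # A dilated kernel covers dilation * (kernel - 1) + 1 input positions.
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


# Stem convolution: kernel_size=7, stride=2, padding=3.
stem_h = conv_output_size(800, kernel=7, stride=2, padding=3)
stem_w = conv_output_size(1333, kernel=7, stride=2, padding=3)
print(stem_h, stem_w)  # 400 667

# A 3x3 convolution with padding equal to its dilation keeps the size at stride 1.
same = conv_output_size(50, kernel=3, stride=1, padding=2, dilation=2)
print(same)  # 50
```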

FrozenBatchNorm2d leads to fcos_core.layers.batch_norm.FrozenBatchNorm2d()

class FrozenBatchNorm2d(nn.Module):
    def __init__(self, n):
        super(FrozenBatchNorm2d, self).__init__()
        # Buffers are saved with the module but are not trainable parameters.
        self.register_buffer("weight", torch.ones(n))
        self.register_buffer("bias", torch.zeros(n))
        self.register_buffer("running_mean", torch.zeros(n))
        self.register_buffer("running_var", torch.ones(n))

    def forward(self, x):
        scale = self.weight * self.running_var.rsqrt()
        bias = self.bias - self.running_mean * scale
        scale = scale.reshape(1, -1, 1, 1)
        bias = bias.reshape(1, -1, 1, 1)
        return x * scale + bias

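register_buffer: a method of torch.nn.Module.
It is typically used to register a buffer that should not be considered a model parameter. Here this means weight, bias, running_mean, and running_var are stored with the module but are not returned by parameters(), so an optimizer does not update them.
rsqrt(): returns a new tensor with the reciprocal of the square root of each element of the input.
I do not fully understand this computation yet; I will add an explanation once I do.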
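
One possible reading of the forward pass, offered as a sketch rather than as the authors' stated intent: since all four statistics are fixed, batch normalization collapses into a per-channel affine map. Writing normalization as (x - mean) / sqrt(var) * weight + bias and expanding gives x * scale + shift with scale = weight / sqrt(var) and shift = bias - mean * scale, which matches the two lines in forward(). The check below uses plain Python scalars for a single channel; the function names and sample numbers are made up for illustration, and no epsilon is added because the forward() shown above does not add one.

```python
import math


def frozen_bn_channel(x, weight, bias, running_mean, running_var):
    """Folded form used in forward(): one multiply and one add per value."""
    scale = weight / math.sqrt(running_var)  # weight * rsqrt(running_var)
    shift = bias - running_mean * scale
    return [v * scale + shift for v in x]


def batch_norm_channel(x, weight, bias, running_mean, running_var):
    """Unfolded form: normalize with fixed statistics, then scale and shift."""
    return [
        (v - running_mean) / math.sqrt(running_var) * weight + bias for v in x
    ]


x = [0.5, -1.0, 2.0]  # arbitrary activations of one channel
params = dict(weight=1.5, bias=0.1, running_mean=0.2, running_var=4.0)

folded = frozen_bn_channel(x, **params)
unfolded = batch_norm_channel(x, **params)

# Both forms agree up to floating-point rounding.
agree = all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(folded, unfolded))
print(agree)
```

The reshape(1, -1, 1, 1) calls in the original only arrange the per-channel scale and bias so that they broadcast over a batch of feature maps.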

stage_specs
→_STAGE_SPECS[cfg.MODEL.BACKBONE.CONV_BODY]
→ResNet50StagesTo4

ResNet50StagesTo4 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, False), (2, 4, False), (3, 6, True))
)

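stage_specs therefore defines the parameters of each stage: index (the stage number), block_count (the number of residual blocks in the stage), and return_features (whether the stage's feature map is returned). The StageSpec type is defined just above it: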

StageSpec = namedtuple(
    "StageSpec",
    [
        "index",  # Index of the stage, e.g. 1, 2, ..., 5
        "block_count",  # Number of residual blocks in the stage
        "return_features",  # True => return the last feature map from this stage
    ],
)
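
To make the role of these fields concrete, the following self-contained sketch rebuilds the two definitions quoted above and reads the fields back. The summary variables total_blocks and returned are illustrative names introduced here, not part of the FCOS code.

```python
from collections import namedtuple

StageSpec = namedtuple("StageSpec", ["index", "block_count", "return_features"])

ResNet50StagesTo4 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, False), (2, 4, False), (3, 6, True))
)

# Fields are read by name, which keeps the stage-building loop readable.
total_blocks = sum(spec.block_count for spec in ResNet50StagesTo4)
returned = [spec.index for spec in ResNet50StagesTo4 if spec.return_features]
print(total_blocks, returned)  # 13 [3]
```

In this spec only the last listed stage is marked to return its feature map.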

transformation_module
→_TRANSFORMATION_MODULES[cfg.MODEL.RESNETS.TRANS_FUNC]
→BottleneckWithFixedBatchNorm
with defaults num_groups=1, stride_in_1x1=True, stride=1, dilation=1, dcn_config=None
→Bottleneck, norm_func=FrozenBatchNorm2d
The class has two methods, __init__ and forward. forward resembles that of other models and is skipped here.

class Bottleneck(nn.Module):
    def __init__(
        self,
        # omit
    ):
        super(Bottleneck, self).__init__()
        self.downsample = None
        if in_channels != out_channels:
            down_stride = stride if dilation == 1 else 1
            self.downsample = nn.Sequential(
                Conv2d(
                    in_channels, out_channels,
                    kernel_size=1, stride=down_stride, bias=False
                ),
                norm_func(out_channels),
            )
            for modules in [self.downsample,]:
                for l in modules.modules():
                    if isinstance(l, Conv2d):
                        nn.init.kaiming_uniform_(l.weight, a=1)

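downsample: when the input and output channel counts differ, a 1x1 convolution followed by a normalization layer projects the input so that it can be added to the block's output.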
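
The downsample convolution is initialized with nn.init.kaiming_uniform_(l.weight, a=1). As documented for PyTorch, this draws weights uniformly from [-bound, bound] with bound = gain * sqrt(3 / fan_in) and, for the default leaky_relu setting, gain = sqrt(2 / (1 + a^2)). With a=1 the gain is 1, so the bound depends only on fan_in. The sketch below just evaluates that formula; the helper name and the channel count 256 are assumptions for illustration.

```python
import math


def kaiming_uniform_bound(in_channels, kernel_size, a=1.0, groups=1):
    """Bound of the uniform range used by Kaiming uniform init (fan_in mode)."""
    # fan_in of a conv weight: input channels per group times kernel area.
    fan_in = (in_channels // groups) * kernel_size * kernel_size
    gain = math.sqrt(2.0 / (1.0 + a * a))  # equals 1.0 when a == 1
    return gain * math.sqrt(3.0 / fan_in)


bound_1x1 = kaiming_uniform_bound(in_channels=256, kernel_size=1)
bound_3x3 = kaiming_uniform_bound(in_channels=256, kernel_size=3)
print(round(bound_1x1, 4), round(bound_3x3, 4))
```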

        if dilation > 1:
            stride = 1 # reset to be 1
        stride_1x1, stride_3x3 = (stride, 1) if stride_in_1x1 else (1, stride)

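With the default stride=1, both stride_1x1 and stride_3x3 are 1. When a larger stride is passed in, stride_in_1x1 decides whether the 1x1 or the 3x3 convolution carries it.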
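
The stride logic can be isolated into a small function to see how the arguments interact. This is a sketch that mirrors the lines quoted above; the function name split_stride is hypothetical.

```python
def split_stride(stride=1, dilation=1, stride_in_1x1=True):
    """Return (stride_1x1, stride_3x3) the way Bottleneck.__init__ derives them."""
    if dilation > 1:
        stride = 1  # a dilated block does not reduce resolution
    return (stride, 1) if stride_in_1x1 else (1, stride)


print(split_stride())                               # (1, 1)
print(split_stride(stride=2))                       # (2, 1)
print(split_stride(stride=2, stride_in_1x1=False))  # (1, 2)
print(split_stride(stride=2, dilation=2))           # (1, 1)
```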

        self.conv1 = Conv2d(
            in_channels,
            bottleneck_channels,
            kernel_size=1,
            stride=stride_1x1,
            bias=False,
        )
        self.bn1 = norm_func(bottleneck_channels)

This defines the first convolution layer (1x1) and its normalization layer.

        with_dcn = dcn_config.get("stage_with_dcn", False)
        if with_dcn:
            # omit
        else:
            self.conv2 = Conv2d(
                bottleneck_channels,
                bottleneck_channels,
                kernel_size=3,
                stride=stride_3x3,
                padding=dilation,
                bias=False,
                groups=num_groups,
                dilation=dilation
            )
            nn.init.kaiming_uniform_(self.conv2.weight, a=1)

        self.bn2 = norm_func(bottleneck_channels)

        self.conv3 = Conv2d(
            bottleneck_channels, out_channels, kernel_size=1, bias=False
        )
        self.bn3 = norm_func(out_channels)

        for l in [self.conv1, self.conv3,]:
            nn.init.kaiming_uniform_(l.weight, a=1)

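This defines the second (3x3) and third (1x1) convolution layers; conv1 and conv3 are then initialized together.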
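
To get a feel for the 1x1, 3x3, 1x1 layout, the weight counts of the three convolutions can be worked out by hand. The sketch below does the arithmetic for the non-DCN path; all three layers use bias=False, so only weights are counted, and the downsample branch and normalization layers are left out. The function name and the channel sizes 256, 64, 256 are assumptions for illustration.

```python
def bottleneck_conv_params(in_channels, bottleneck_channels, out_channels, num_groups=1):
    """Weight counts of conv1 (1x1), conv2 (3x3), and conv3 (1x1)."""
    conv1 = in_channels * bottleneck_channels * 1 * 1
    # A grouped conv sees only in_channels / groups inputs per output channel.
    conv2 = (bottleneck_channels // num_groups) * bottleneck_channels * 3 * 3
    conv3 = bottleneck_channels * out_channels * 1 * 1
    return conv1, conv2, conv3


counts = bottleneck_conv_params(256, 64, 256)
print(counts, sum(counts))  # (16384, 36864, 16384) 69632
```

The 1x1 layers first shrink and then restore the channel count, so the more expensive 3x3 convolution runs on the narrower bottleneck_channels.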
