Baidu PaddlePaddle Competition Notes: Node Classification on a Paper Citation Network

2024-03-08 14:09:34

Baidu PaddlePaddle: Node Classification on a Paper Citation Network

This post only records my first competition. If I come up with new ideas later and they work, I will update it.

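The competition baseline provides five models: GCN, GAT, APPNP, SGC, and GCNII. In my first casual training runs, only GCN and SGC performed well.
SGC did very well and quickly reached 0.72+. The Baidu graph neural network training camp introduced a way to apply residual connections to GAT. What struck me when using it was that RES+GAT trains slowly at first but has staying power: the longer I trained, the better it got, eventually reaching 0.73+.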
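At this point, as a beginner, I hit a bottleneck. I kept changing dropout, the number of layers, and so on, with no obvious effect. This suggests that simply adding a few layers often changes little. And because the effect of a change often only shows up after many epochs, blind tuning is extremely time-consuming.
I tried label smoothing along the way, but I can't tell whether it helped. I was too impatient while tuning and didn't control variables, so I couldn't isolate the effect of any single change.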
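For reference, label smoothing replaces the one-hot target with a slightly softened distribution. The following is a minimal NumPy sketch of the idea, not the competition code; the function names and the value `epsilon=0.1` are illustrative assumptions.

```python
import numpy as np

def smooth_labels(labels, num_class, epsilon=0.1):
    """Return soft targets: (1 - epsilon) * one_hot + epsilon / num_class."""
    one_hot = np.eye(num_class, dtype=np.float64)[labels]
    return (1.0 - epsilon) * one_hot + epsilon / num_class

def soft_cross_entropy(logits, soft_targets):
    """Mean cross-entropy between softmax(logits) and the soft targets."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(soft_targets * log_prob).sum(axis=1).mean())

# Toy example: two nodes, three classes (values are made up).
labels = np.array([0, 2])
targets = smooth_labels(labels, num_class=3, epsilon=0.1)
logits = np.array([[2.0, 0.5, 0.1], [0.2, 0.1, 1.5]])
loss = soft_cross_entropy(logits, targets)
```

To judge whether it actually helps, change only `epsilon` between runs and keep everything else, including the random seed, fixed.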
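Early in training, a learning rate that is too large makes it hard for the model to learn. Many models do better with a smaller learning rate.
My understanding of the hyperparameters: a larger hidden dimension can store more information. More layers give the model more capacity to fit and transform the data. Dropout forces each individual neuron to be more capable on its own. Regularization shrinks the overall weight magnitudes, which in effect lets more of the weights take part in training and reduces overfitting.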
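To make the dropout and regularization remarks concrete, here is a small NumPy sketch of "upscale in train" dropout (the mode the code at the end of this post passes as `dropout_implementation='upscale_in_train'`) and of an L2 penalty. The helper names and the coefficient are assumptions for illustration, not Paddle APIs.

```python
import numpy as np

def dropout_upscale_in_train(x, p, training, rng):
    """Inverted dropout: during training, zero each unit with probability p
    and scale the survivors by 1 / (1 - p); at inference, return x unchanged."""
    if not training or p == 0.0:
        return x
    keep_mask = rng.random(x.shape) >= p
    return x * keep_mask / (1.0 - p)

def l2_penalty(weights, coeff):
    """L2 regularization term: coeff * sum of squared weights."""
    return coeff * sum(float((w ** 2).sum()) for w in weights)

rng = np.random.default_rng(0)
x = np.ones((4, 5))
train_out = dropout_upscale_in_train(x, p=0.2, training=True, rng=rng)
eval_out = dropout_upscale_in_train(x, p=0.2, training=False, rng=rng)
penalty = l2_penalty([np.array([1.0, -2.0]), np.array([[0.5]])], coeff=0.01)
```

Because surviving activations are scaled up during training, no rescaling is needed at inference time.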
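The choice of optimizer is another thing to consider.
On managing results and run parameters: record them in a table. If you only edit values in a text file, it is easy to forget which parameters a given run actually used.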
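One low-effort way to keep such a table is to append a row to a CSV file after every run. This is a sketch using only the standard library; `log_run` and the column names are hypothetical, and it assumes every run logs the same set of keys.

```python
import csv
import os

def log_run(path, config, metrics):
    """Append one row (hyperparameters + results) to a CSV file.

    The header is written only when the file is new or empty, so all
    runs are assumed to use the same keys in the same order.
    """
    row = {**config, **metrics}
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(row.keys()))
        if is_new:
            writer.writeheader()
        writer.writerow(row)
```

For example, calling `log_run("runs.csv", {"model": "GAT", "dropout": 0.3}, {"val_acc": val_acc})` at the end of each run keeps parameters and results side by side, and the file opens directly in any spreadsheet tool.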
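"Number of layers" means different things in these models. In some it is the number of layers that transform the node features, in others it is the number of graph neural network layers, and the two should be kept apart. From reading the source code I learned that parameters such as k_hop actually describe the number of graph neural network layers.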
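To illustrate what `k_hop` controls, below is a dense NumPy sketch of the propagation rule from the APPNP paper, H(k+1) = (1 - alpha) * A_hat * H(k) + alpha * H(0). It is written from the paper's formula under stated assumptions (dense symmetric adjacency, no edge dropout); PGL's `conv.appnp` may differ in details such as normalization. The defaults mirror the config defaults in the code at the end of this post.

```python
import numpy as np

def appnp_propagate(adj, h0, alpha=0.1, k_hop=20):
    """Personalized-PageRank style propagation as described for APPNP.

    adj: dense (N, N) symmetric adjacency matrix without self-loops.
    h0:  (N, C) per-node outputs of the feature-transforming layers.
    """
    a_tilde = adj + np.eye(adj.shape[0])              # add self-loops
    deg_inv_sqrt = 1.0 / np.sqrt(a_tilde.sum(axis=1))
    a_hat = deg_inv_sqrt[:, None] * a_tilde * deg_inv_sqrt[None, :]
    h = h0
    for _ in range(k_hop):                            # one graph hop per step
        h = (1.0 - alpha) * (a_hat @ h) + alpha * h0
    return h

# Path graph 0 - 1 - 2: node 2 is two hops away from node 0.
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
h0 = np.eye(3)
one_hop = appnp_propagate(adj, h0, k_hop=1)
two_hop = appnp_propagate(adj, h0, k_hop=2)
```

With `k_hop=1`, node 0 receives nothing from node 2; with `k_hop=2` it does. So `k_hop` sets how far information travels over the graph, independently of how many fully connected layers process the features.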
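My next attempts were RESAPPNP, RESGCNII, stacking several GAT layers, and GAT+APPNP.
RESAPPNP trains fast and scores well: train_acc can reach 0.86+, but val_acc only reaches 0.74+. No amount of parameter tuning fixed this. Even with regularization it still overfit later in training. Reducing the number of layers helped, but then the score was mediocre.
GCNII needs a huge amount of memory, and I could not see where it was better. Reportedly it allows stacking many graph neural network layers. I need to read the paper later.
Stacking multiple GAT layers is the best model I have used so far: it does not overfit, scores well, and trains fast. I don't know why.
GAT+APPNP seems to suppress overfitting a little early on but still overfits later. The score is okay.
A wild idea: throw all the data into training to get a higher score. The score does go up, but without a good validation set it is hard to know when to stop and easy to overfit. So you should keep track of the epoch count and stop after a fixed number of epochs.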
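One way to pick that fixed number is to first train with a validation split, note the epoch with the best validation score, and then retrain on all the data for that many epochs. The sketch below only shows the control flow; `train_one_epoch` and `evaluate` are hypothetical callables standing in for the real training and validation steps.

```python
def find_best_epoch(train_one_epoch, evaluate, max_epochs):
    """Train with a held-out validation set and return the 1-based epoch
    with the highest validation score, together with that score."""
    best_epoch, best_score = 0, float("-inf")
    for epoch in range(1, max_epochs + 1):
        train_one_epoch()
        score = evaluate()
        if score > best_score:
            best_epoch, best_score = epoch, score
    return best_epoch, best_score

def train_fixed_epochs(train_one_epoch, num_epochs):
    """Retrain (for example on train + validation data) for a fixed budget."""
    for _ in range(num_epochs):
        train_one_epoch()
```

The epoch count found this way is only a rough guide, since training on more data can shift where overfitting starts.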
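I have not tried ensembling.
I ran into a strange case where the training accuracy went down the longer I trained.
A high validation score does not guarantee a high test score, so the checkpoint with the best validation score is not necessarily the best one.
The number of submissions is limited. You can open a secondary account for extra submissions and training.
Tools that helped me a lot: local debugging, the source code, and papers.
In short, I have not reached 0.75+ yet. Without a clear idea, randomly changing parameters and models is most likely a waste of time.
I have not read enough papers and have no competition experience, so I lack effective ideas.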

A typical example of my haphazardly modified code:

    # Imports assumed from the aliases used below (pgl, L, conv).
    import pgl
    import paddle.fluid.layers as L
    import pgl.layers.conv as conv


    # The class line was missing from the excerpt; the name is assumed
    # from the docstring.
    class APPNP(object):
        """Implement of APPNP"""

        def __init__(self, config, num_class):
            self.num_class = num_class
            self.num_layers = config.get("num_layers", 1)
            self.hidden_size = config.get("hidden_size", 128)
            self.dropout = config.get("dropout", 0.2)
            self.alpha = config.get("alpha", 0.1)
            self.k_hop = config.get("k_hop", 20)
            self.edge_dropout = config.get("edge_dropout", 0.01)
            self.feat_dropout = config.get("feat_drop", 0.3)
            self.attn_dropout = config.get("attn_drop", 0.3)

        def forward(self, graph_wrapper, feature, phase):
            # Edge dropout is only applied during training.
            if phase == "train":
                edge_dropout = self.edge_dropout
            else:
                edge_dropout = 0

            # Project the input to hidden_size so the residual add below works.
            feature = L.fc(feature, self.hidden_size, name="linear")

            # Residual feature blocks: dropout -> fc -> skip add -> relu -> layer norm.
            for i in range(self.num_layers):
                res_feature = feature
                feature = L.dropout(
                    feature,
                    self.dropout,
                    dropout_implementation='upscale_in_train')
                feature = L.fc(feature, self.hidden_size, act=None, name="lin%s" % i)
                feature = res_feature + feature
                feature = L.relu(feature)
                # Pass the name by keyword: the second positional argument of
                # L.layer_norm is `scale`, not `name`.
                feature = L.layer_norm(feature, name="ln_%s" % i)

            feature = L.dropout(
                feature,
                self.dropout,
                dropout_implementation='upscale_in_train')

            # Two stacked GAT layers. The original reused the leftover loop
            # index `i` for both names, so both layers got the same name;
            # distinct names are used here.
            ngw = pgl.sample.edge_drop(graph_wrapper, edge_dropout)
            for j in range(2):
                feature = conv.gat(ngw,
                                   feature,
                                   16,
                                   activation="elu",
                                   name="gat_layer_%s" % j,
                                   num_heads=8,
                                   feat_drop=self.feat_dropout,
                                   attn_drop=self.attn_dropout)

            # Map to class logits, then propagate them over the graph with APPNP.
            feature = L.fc(feature, self.num_class, act=None, name="output")

            feature = conv.appnp(graph_wrapper,
                                 feature=feature,
                                 edge_dropout=edge_dropout,
                                 alpha=self.alpha,
                                 k_hop=self.k_hop)
            return feature
