Diffusion-Convolutional Neural Networks (DCNN)

Paper: https://arxiv.org/abs/1511.02136

In summary:

DCNN uses random-walk transition matrices as the aggregation function; each node's representation stacks the aggregation results of the transition matrices at every hop.

The random-walk transition matrices are defined here as \(P^* = \{P^1, \dots, P^h\}\), where \(P = D^{-1}A\) and \(h\) is the number of neighborhood hops considered.
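To make this concrete, here is a minimal sketch on a hypothetical 3-node path graph (my own toy example, not from the paper), showing \(P = D^{-1}A\) and its powers:

import torch

# hypothetical 3-node path graph: 0 - 1 - 2
A = torch.tensor([[0., 1., 0.],
                  [1., 0., 1.],
                  [0., 1., 0.]])
D_inv = torch.diag(A.sum(dim=1) ** -1)  # inverse degree matrix D^{-1}
P = D_inv @ A                           # 1-hop transition matrix; each row sums to 1
P2 = torch.linalg.matrix_power(P, 2)    # 2-hop transition probabilities P^2
print(P, P2, sep='\n')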

The model treats each node's 1..h-hop neighborhoods separately; with sum aggregation, the i-th hop aggregate is \(P^i X\).

Each node of \(P^i X\) is then filtered with the same shared weights: \(W_i \odot P^i X \in \mathbb{R}^{N \times F}\), where \(F\) is the feature dimension and \(\odot\) is element-wise multiplication.

Finally the \(H \times F\) per-node features are flattened and passed through a fully connected layer \((H \cdot F,\ C)\), where \(C\) is the number of classes.
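Putting it together, with \(N\) nodes, \(F\) input features, \(H\) hops and \(C\) classes, the shapes through the forward pass are

\(X \in \mathbb{R}^{N \times F} \;\to\; \{P^i X\}_{i=1}^{H} \in \mathbb{R}^{N \times H \times F} \;\to\; W \odot {\cdot} \in \mathbb{R}^{N \times H \times F} \;\to\; \text{flatten} \in \mathbb{R}^{N \times HF} \;\to\; \text{FC} \in \mathbb{R}^{N \times C}\)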

Implementation

import torch
import torch.nn as nn

class DCNN(nn.Module):
    def __init__(self, num_jump, in_channels, out_channels):
        super().__init__()
        self.num_jump = num_jump  # number of hops H
        # one weight vector per hop, shared by every node
        self.W = nn.Parameter(torch.FloatTensor(num_jump, in_channels))
        self.linear = nn.Linear(num_jump * in_channels, out_channels)
        nn.init.xavier_normal_(self.W)

    def forward(self, X, A):
        # transition matrix P = D^{-1} A
        D = torch.diag(A.sum(dim=0) ** -1)
        # stack P^1, ..., P^H; note matrix_power, not element-wise **
        P = torch.stack([torch.linalg.matrix_power(D @ A, i + 1)
                         for i in range(self.num_jump)], dim=1)  # (N, H, N)
        o = (P @ X) * self.W  # diffuse then scale per hop: (N, H, F)
        return self.linear(o.reshape(o.shape[0], -1))
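A quick smoke test on a hypothetical 4-node cycle graph (the values are arbitrary, only the shapes matter):

N, F, H, C = 4, 8, 2, 3  # nodes, input features, hops, classes
X = torch.rand(N, F)
A = torch.tensor([[0., 1., 0., 1.],
                  [1., 0., 1., 0.],
                  [0., 1., 0., 1.],
                  [1., 0., 1., 0.]])
net = DCNN(H, F, C)
print(net(X, A).shape)  # torch.Size([4, 3])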

Training

from torch_geometric.datasets import Planetoid
import matplotlib.pyplot as plt

def edge_index_to_adj(num_nodes, edge_index):
    # dense symmetric adjacency matrix from a COO edge index
    adj = torch.zeros((num_nodes, num_nodes), dtype=torch.float32, device=edge_index.device)
    adj[edge_index[0], edge_index[1]] = 1
    adj[edge_index[1], edge_index[0]] = 1
    return adj

def accuracy(y_hat, y):
    y_hat = y_hat.argmax(dim=1)
    return (y_hat == y).sum().item() / y.shape[0]

def test(net, cora_data, mask):
    net.eval()
    adj = edge_index_to_adj(cora_data.num_nodes, cora_data.edge_index)
    with torch.no_grad():
        y_hat = net(cora_data.x, adj)
    return accuracy(y_hat[mask], cora_data.y[mask])

def train(net: DCNN, cora_data, device):
    epochs = 200
    lr, weight_decay = 0.01, 5e-4
    loss_fn = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(net.parameters(), lr=lr, weight_decay=weight_decay)
    # move model and data to the device first, so adj is built on the same device
    net, cora_data = [i.to(device) for i in [net, cora_data]]
    adj = edge_index_to_adj(cora_data.num_nodes, cora_data.edge_index)
    print(adj.device)

    loss_list = []
    val_list = []
    for e in range(epochs):
        net.train()
        y_hat = net(cora_data.x, adj)
        loss = loss_fn(y_hat[cora_data.train_mask], cora_data.y[cora_data.train_mask])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        val_acc = test(net, cora_data, cora_data.val_mask)
        loss_list.append(loss.item())
        val_list.append(val_acc)
        if e % 50 == 0 or e == epochs - 1:
            print(e, ' ', loss.item(), ' ', val_acc)
    plt.plot(loss_list)
    plt.plot(val_list)
    return net

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
dataset = Planetoid(root='./data/Cora', name='Cora')
cora_data = dataset[0]
net = DCNN(2, cora_data.num_features, dataset.num_classes)
train(net, cora_data, device)

# Output
cuda:0
0 1.9457284212112427 0.328
50 0.13217289745807648 0.722
100 0.07239208370447159 0.718
150 0.05417220667004585 0.708
199 0.04562362655997276 0.7
Test-set accuracy:
test(net, cora_data, cora_data.test_mask)
0.756

Experimental results from the paper

Note, however, that the paper's training split is not the fixed 140 nodes we used here:

We also provide learning curves for the CORA and Pubmed datasets. In this experiment, the validation and test set each contain 10% of the nodes, and the amount of training data is varied between 10% and 100% of the remaining nodes.
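To reproduce that setup we would have to re-split Cora ourselves instead of using the fixed Planetoid masks. A minimal sketch under the quoted assumptions (the helper name and the random seed are my own, not from the paper's code):

def make_masks(num_nodes, train_frac, seed=0):
    # 10% validation, 10% test; train_frac (0.1-1.0) of the remaining nodes for training
    g = torch.Generator().manual_seed(seed)
    perm = torch.randperm(num_nodes, generator=g)
    n_hold = num_nodes // 10
    val_idx, test_idx = perm[:n_hold], perm[n_hold:2 * n_hold]
    rest = perm[2 * n_hold:]
    train_idx = rest[:int(train_frac * rest.shape[0])]
    masks = []
    for idx in (train_idx, val_idx, test_idx):
        m = torch.zeros(num_nodes, dtype=torch.bool)
        m[idx] = True
        masks.append(m)
    return masks  # train_mask, val_mask, test_mask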