Gated Graph Sequence Neural NetworksGated Graph Sequence Neural NetworksLi, Yujia and Tarlow, Daniel and Brockschmidt, Marc and Zemel, Richard S.2015

Paper summaryhlarochelleThis paper presents a feed-forward neural network architecture for processing graphs as inputs, inspired from previous work on Graph Neural Networks.
In brief, the architecture of the GG-NN corresponds to $T$ steps of GRU-like (gated recurrent units) updates, where T is a hyper-parameter. At each step, a vector representation is computed for all nodes in the graph, where a node's representation at step t is computed from the representation of nodes at step $t-1$. Specifically, the representation of a node will be updated based on the representation of its neighbors in the graph. Incoming and outgoing edges in the graph are treated differently by the neural network, by using different parameter matrices for each. Moreover, if edges have labels, separate parameters can be learned for the different types of edges (meaning that edge labels determine the configuration of parameter sharing in the model). Finally, GG-NNs can incorporate node-level attributes, by using them in the initialization (time step 0) of the nodes' representations.
GG-NNs can be used to perform a variety of tasks on graphs. The per-node representations can be used to make per-node predictions by feeding them to a neural network (shared across nodes). A graph-level predictor can also be obtained using a soft attention architecture, where per-node outputs are used as scores into a softmax in order to pool the representations across the graph, and feed this graph-level representation to a neural network. The attention mechanism can be conditioned on a "question" (e.g. on a task to predict the shortest path in a graph, the question would be the identity of the beginning and end nodes of the path to find), which is fed to the node scorer of the soft attention mechanism. Moreover, the authors describe how to chain GG-NNs to go beyond predicting individual labels and predict sequences.
Experiments on several datasets are presented. These include tasks where a single output is required (on a few bAbI tasks) as well as tasks where a sequential output is required, such as outputting the shortest path or the Eulerian circuit of a graph. Moreover, experiments on a much more complex and interesting program verification task are presented.

This paper presents a feed-forward neural network architecture for processing graphs as inputs, inspired from previous work on Graph Neural Networks.
In brief, the architecture of the GG-NN corresponds to $T$ steps of GRU-like (gated recurrent units) updates, where T is a hyper-parameter. At each step, a vector representation is computed for all nodes in the graph, where a node's representation at step t is computed from the representation of nodes at step $t-1$. Specifically, the representation of a node will be updated based on the representation of its neighbors in the graph. Incoming and outgoing edges in the graph are treated differently by the neural network, by using different parameter matrices for each. Moreover, if edges have labels, separate parameters can be learned for the different types of edges (meaning that edge labels determine the configuration of parameter sharing in the model). Finally, GG-NNs can incorporate node-level attributes, by using them in the initialization (time step 0) of the nodes' representations.
GG-NNs can be used to perform a variety of tasks on graphs. The per-node representations can be used to make per-node predictions by feeding them to a neural network (shared across nodes). A graph-level predictor can also be obtained using a soft attention architecture, where per-node outputs are used as scores into a softmax in order to pool the representations across the graph, and feed this graph-level representation to a neural network. The attention mechanism can be conditioned on a "question" (e.g. on a task to predict the shortest path in a graph, the question would be the identity of the beginning and end nodes of the path to find), which is fed to the node scorer of the soft attention mechanism. Moreover, the authors describe how to chain GG-NNs to go beyond predicting individual labels and predict sequences.
Experiments on several datasets are presented. These include tasks where a single output is required (on a few bAbI tasks) as well as tasks where a sequential output is required, such as outputting the shortest path or the Eulerian circuit of a graph. Moreover, experiments on a much more complex and interesting program verification task are presented.