Encoding Linguistic Structures with
Graph Convolutional Networks
Diego Marcheggiani
Joint work with Ivan Titov and Joost Bastings
University of Amsterdam
University of Edinburgh
@South England NLP Meetup
Structured (Linguistic) Priors
Sequa makes and repairs jet engines.
[Figure: the sentence annotated with syntactic dependency arcs (ROOT, SBJ, COORD, CONJ, OBJ, NMOD) and semantic roles (creator, creation, repairer, entity repaired)]
“I voted for Palpatine because he was
most aligned with my values,” she said.
2
Sequence to Sequence
[Sutskever et al., 2014]
[Figure: the encoder reads "the black cat"; the decoder is fed "<s> le chat noir" and emits "le chat noir <s>"]
} Language is not (only) a sequence of words
} We have linguistic knowledge
Encode structured linguistic knowledge into NN using Graph Convolutional Networks
5
Outline
} Semantic Role Labeling
} Graph Convolutional Networks (GCN)
} Syntactic GCN for Semantic Role Labeling (SRL)
} SRL Model
} Exploiting Semantics in Neural Machine Translation with GCNs
Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling
Diego Marcheggiani, Ivan Titov. In Proceedings of EMNLP, 2017.
Exploiting Semantics in Neural Machine Translation with Graph Convolutional Networks
Diego Marcheggiani, Joost Bastings, Ivan Titov. In Proceedings of NAACL-HLT, 2018.
6
Semantic Role Labeling
} Predicting the predicate-argument structure of a sentence
} Discover and disambiguate predicates
} Identify arguments and label them with their semantic roles
Sequa makes and repairs jet engines.
make.01: Creator (Sequa), Creation (engines)
repair.01: Repairer (Sequa), Entity repaired (engines)
11
Semantic Role Labeling
} Only the head of an argument is labeled
} Sequence labeling task for each predicate
} Focus on argument identification and labeling
12
Sequa makes and repairs jet engines.
make.01: Creator, Creation
repair.01: Repairer, Entity repaired
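The "sequence labeling task for each predicate" view can be sketched as a small data structure. This is only an illustration: the `Predicate` class, the `role_sequence` helper and the `_` null label are hypothetical names, not part of the original model.

```python
from dataclasses import dataclass, field


@dataclass
class Predicate:
    index: int      # position of the predicate word in the sentence
    sense: str      # disambiguated sense, e.g. "make.01"
    # Maps the index of an argument's head word to its semantic role.
    arguments: dict = field(default_factory=dict)


words = ["Sequa", "makes", "and", "repairs", "jet", "engines", "."]

# Only the head of each argument is labeled ("engines", not "jet engines").
predicates = [
    Predicate(1, "make.01", {0: "Creator", 5: "Creation"}),
    Predicate(3, "repair.01", {0: "Repairer", 5: "Entity repaired"}),
]


def role_sequence(words, predicate, null_label="_"):
    """Return one label per word for a single predicate."""
    return [predicate.arguments.get(i, null_label) for i in range(len(words))]


for p in predicates:
    print(p.sense, role_sequence(words, p))
```

Each predicate yields its own label sequence over the same sentence, which is why the task is solved once per predicate.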
Semantic Role Labeling
13
} Question answering: Narayanan and Harabagiu 2004; Shen and Lapata 2007; Khashabi et al. 2018
} Machine translation: Wu and Fung 2009; Aziz et al. 2011
} Information extraction: Surdeanu et al. 2003; Christensen et al. 2010
Related work
} SRL systems that use syntax with simple NN architectures
} [FitzGerald et al., 2015]
} [Roth and Lapata, 2016]
} Recent models ignore linguistic bias
} [Zhou and Xu, 2014]
} [He et al., 2017]
} [Marcheggiani et al., 2017]
15
Tutorial on Semantic Role
Labeling at EMNLP 2017
Motivations
} Some semantic dependencies are mirrored in the syntactic graph
} Not all of them: the syntax-semantics interface is not trivial
Sequa makes and repairs jet engines.
[Figure: dependency arcs (ROOT, SBJ, COORD, CONJ, OBJ, NMOD) drawn against the semantic roles (creator, creation, repairer, entity repaired)]
17
Outline
} Semantic Role Labeling
} Graph Convolutional Networks (GCN)
} Syntactic GCN for Semantic Role Labeling (SRL)
} SRL Model
} Exploiting Semantics in Neural Machine Translation with GCNs
18
Graph Convolutional Networks (message passing)
[Gori et al., 2005; Scarselli et al., 2009; Kipf and Welling, 2016]
[Figure: an undirected graph, and the update of the blue node from its neighbors]
21
$$h_i = \mathrm{ReLU}\Big( W_0 h_i + \sum_{j \in \mathcal{N}(i)} W_1 h_j \Big)$$
Neighborhood
Self loop
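The update above (a self-loop term plus a sum over the neighborhood, followed by ReLU) can be sketched in a few lines. This is a minimal, dependency-free illustration using plain lists; the helper names, the toy graph and the weight values are assumptions made for the example, not taken from the talk.

```python
def matvec(W, h):
    """Multiply matrix W (a list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def relu(v):
    return [max(0.0, x) for x in v]


def gcn_layer(H, neighbors, W0, W1):
    """One message-passing step: self-loop term plus summed neighbor messages."""
    new_H = []
    for i, h_i in enumerate(H):
        total = matvec(W0, h_i)              # self-loop term W0 h_i
        for j in neighbors[i]:               # neighborhood term, sum of W1 h_j
            msg = matvec(W1, H[j])
            total = [t + m for t, m in zip(total, msg)]
        new_H.append(relu(total))
    return new_H


# Toy undirected path graph 0 - 1 - 2 with 2-dimensional node features.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
H0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W0 = [[1.0, 0.0], [0.0, 1.0]]    # identity: keep the node's own features
W1 = [[0.5, 0.0], [0.0, 0.5]]    # halve each neighbor's features
H1 = gcn_layer(H0, neighbors, W0, W1)
print(H1)
```

Note that the same `W1` is shared by every neighbor here; the syntactic variant later in the talk replaces it with edge-specific parameters.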
GCNs Pipeline
[Kipf and Welling, 2016]
Input X = H^(0) → hidden layer H^(1) → hidden layer H^(2) → … → output Z = H^(n)
} X: initial feature representation of nodes
} Z: representation informed by nodes' neighborhood
Extend GCNs for syntactic dependency trees
23
Outline
} Semantic Role Labeling
} Graph Convolutional Networks (GCN)
} Syntactic GCN for Semantic Role Labeling (SRL)
} SRL Model
} Exploiting Semantics in Neural Machine Translation with GCNs
24
Example
[Marcheggiani and Titov, 2017]
Lane disputed those estimates
Dependency arcs: SBJ (disputed → Lane), OBJ (disputed → estimates), NMOD (estimates → those)
[Figure: one syntactic GCN layer. Each word receives a self-loop message (× W_self^(1)), messages along the dependency arcs (× W_subj^(1), × W_obj^(1), × W_nmod^(1)) and messages in the opposite direction (× W_subj'^(1), × W_obj'^(1), × W_nmod'^(1)). At every word the incoming messages are summed and passed through ReLU(Σ·). A second layer with weights W^(2) is stacked on top of the first.]
Stacking GCNs widens the syntactic neighborhood
31
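The effect of stacking can be checked by tracking which words are able to influence a given word after k layers. The sketch below is only an illustration on the example sentence; the `receptive_field` helper is a hypothetical name, and arcs are treated as usable in both directions, with self-loops, as in the figure.

```python
words = ["Lane", "disputed", "those", "estimates"]
# (head index, dependent index, label)
arcs = [(1, 0, "SBJ"), (1, 3, "OBJ"), (3, 2, "NMOD")]


def receptive_field(n_words, arcs, n_layers):
    """For each word, the set of words whose input can reach it after n_layers."""
    neighbors = {i: {i} for i in range(n_words)}      # self-loops
    for head, dep, _ in arcs:
        neighbors[head].add(dep)                      # along the arc
        neighbors[dep].add(head)                      # opposite direction
    field = {i: {i} for i in range(n_words)}
    for _ in range(n_layers):
        field = {
            i: set().union(*(field[j] for j in neighbors[i]))
            for i in range(n_words)
        }
    return field


for k in (1, 2, 3):
    reached = receptive_field(len(words), arcs, k)[0]
    print(k, sorted(words[i] for i in reached))
```

With one layer "Lane" only sees itself and "disputed"; a second layer adds "estimates", and a third adds "those".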
Syntactic GCNs
[Marcheggiani and Titov, 2017]
$$h_v^{(k+1)} = \mathrm{ReLU}\Big( \sum_{u \in \mathcal{N}(v)} W^{(k)}_{L(u,v)} h_u^{(k)} + b^{(k)}_{L(u,v)} \Big)$$
} $\mathcal{N}(v)$ is the syntactic neighborhood of $v$; the self-loop is included in $\mathcal{N}$
} Each summand $W^{(k)}_{L(u,v)} h_u^{(k)} + b^{(k)}_{L(u,v)}$ is a message; messages are direction and label specific
} Overparametrized: one matrix for each label-direction pair
} $W^{(k)}_{L(u,v)} = V^{(k)}_{dir(u,v)}$
36
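A syntactic GCN layer with direction-specific matrices and label-specific biases might look roughly as follows. This is a sketch under stated assumptions: the direction names ("self", "along", "opposite"), the primed labels for reversed arcs, and all numeric values are hypothetical choices made for illustration.

```python
def matvec(W, h):
    """Multiply matrix W (a list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def relu(v):
    return [max(0.0, x) for x in v]


def syntactic_gcn_layer(H, arcs, V, b):
    """Each word sums messages V[dir(u,v)] h_u + b[L(u,v)] from its neighbors u."""
    n = len(H)
    dim_out = len(V["self"])
    totals = [[0.0] * dim_out for _ in range(n)]
    edges = [(v, v, "self", "self") for v in range(n)]           # self-loops
    for head, dep, label in arcs:
        edges.append((head, dep, "along", label))                 # head -> dependent
        edges.append((dep, head, "opposite", label + "'"))        # dependent -> head
    for u, v, direction, label in edges:
        msg = matvec(V[direction], H[u])
        totals[v] = [t + m + c for t, m, c in zip(totals[v], msg, b[label])]
    return [relu(t) for t in totals]


# "Lane disputed those estimates": (head, dependent, label)
arcs = [(1, 0, "SBJ"), (1, 3, "OBJ"), (3, 2, "NMOD")]
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
V = {
    "self": [[1.0, 0.0], [0.0, 1.0]],
    "along": [[2.0, 0.0], [0.0, 2.0]],
    "opposite": [[3.0, 0.0], [0.0, 3.0]],
}
zero = [0.0, 0.0]
b = {label: zero for label in ["self", "SBJ", "SBJ'", "OBJ", "OBJ'", "NMOD", "NMOD'"]}
out = syntactic_gcn_layer(H, arcs, V, b)
print(out)
```

Only three matrices are learned per layer, one per direction, while the dependency label still enters through the bias.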
Edge-wise Gates
[Marcheggiani and Titov, 2017]
} Not all edges are equally important for the final task
} We should not blindly rely on predicted syntax
} Gates decide the "importance" of each message
} Gates depend on nodes and edges
[Figure: "Lane disputed those estimates" with arcs SBJ, OBJ, NMOD; a gate g sits on every message before the ReLU(Σ·) at each word]
40
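One plausible way to realize such a gate is a scalar sigmoid computed from the sending node's state and edge-specific parameters. The slides only state that gates depend on nodes and edges, so the exact form below, including the names `v_dir` and `b_label`, is an assumption made for illustration.

```python
import math


def matvec(W, h):
    """Multiply matrix W (a list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gate(h_u, v_dir, b_label):
    """Scalar gate in (0, 1): sigmoid of a node-dependent, edge-dependent score.

    v_dir is assumed to be a direction-specific vector and b_label a
    label-specific scalar bias (both hypothetical parameter names).
    """
    score = sum(x * w for x, w in zip(h_u, v_dir)) + b_label
    return 1.0 / (1.0 + math.exp(-score))


def gated_message(h_u, W, g):
    """Scale the message W h_u by the gate value g."""
    return [g * m for m in matvec(W, h_u)]


h_u = [1.0, 2.0]
W = [[1.0, 0.0], [0.0, 1.0]]
open_gate = gate(h_u, [0.0, 0.0], 10.0)      # large bias: message passes almost unchanged
closed_gate = gate(h_u, [0.0, 0.0], -10.0)   # very negative bias: message nearly dropped
print(gated_message(h_u, W, open_gate), gated_message(h_u, W, closed_gate))
```

A gate close to zero lets the model down-weight an edge, for instance one coming from an unreliable predicted parse.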
Outline
} Semantic Role Labeling
} Graph Convolutional Networks (GCN)
} Syntactic GCN for Semantic Role Labeling (SRL)
} SRL Model
} Exploiting Semantics in Neural Machine Translation with GCNs
41
Our Model
} Word representation
} Bidirectional LSTM encoder
} GCN Encoder
} Local role classifier
[Marcheggiani and Titov, 2017]
42
Word Representation
} Pretrained word embeddings
} Word embeddings
} POS tag embeddings
} Predicate lemma embeddings
[Figure: each word of "Lane disputed those estimates" is mapped to its word representation]
[Marcheggiani and Titov, 2017]
43
BiLSTM Encoder
} Encode each word with its left and right context
} Stacked BiLSTM
[Figure: "Lane disputed those estimates" → word representation → J BiLSTM layers]
[Marcheggiani and Titov, 2017]
44
GCNs Encoder
} Syntactic GCNs after BiLSTM encoder
} Add syntactic information
} Skip connections
} Longer dependencies are captured
[Figure: "Lane disputed those estimates" → word representation → J BiLSTM layers → K GCN layers over the dependency arcs nsubj, dobj, nmod]
[Marcheggiani and Titov, 2017]
45
Semantic Role Classifier
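[Figure: "Lane disputed those estimates" → word representation → J BiLSTM layers → K GCN layers (arcs nsubj, dobj, nmod) → classifier. The classifier takes the predicate representation and the candidate argument representation and outputs a role (A1 in the example).]
} Local log-linear classifier
$$p(r \mid t_i, t_p, l) \propto \exp\big( W_{l,r} \, (t_i \circ t_p) \big)$$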
46
Experiments
} Data
} CoNLL-2009 dataset - English and Chinese
} F1 evaluation measure
} Model
} Hyperparameters tuned on English development set
} State-of-the-art predicate disambiguation models
[Marcheggiani and Titov, 2017]
47
Ablation Experiments (Dev set)
[Marcheggiani and Titov, 2017]
SRL w/o predicate disambiguation (F1)
            BiLSTM    GCN
English     82.7      83.3
Chinese     75.2      77.1
48
English Test Set
[Marcheggiani and Titov, 2017]
SRL with predicate disambiguation (F1)
FitzGerald et al. (2015) (global)            87.3
Roth and Lapata (2016) (global)              87.7
Marcheggiani et al. (2017, CoNLL) (local)    87.7
Ours (Bi-LSTM + GCN) (local)                 88.0
49
English Out of Domain
[Marcheggiani and Titov, 2017]
SRL with predicate disambiguation (F1)
FitzGerald et al. (2015) (global)            75.2
Roth and Lapata (2016) (global)              76.1
Marcheggiani et al. (2017, CoNLL) (local)    77.7
Ours (Bi-LSTM + GCN) (local)                 77.2
50
English Test Set (Ensemble)
[Marcheggiani and Titov, 2017]
SRL with predicate disambiguation (F1)
FitzGerald et al. (2015) (ensemble)          87.7
Roth and Lapata (2016) (ensemble)            87.9
Ours (Bi-LSTM + GCN) (ensemble)              89.1
51
Chinese Test Set
[Marcheggiani and Titov, 2017]
SRL with predicate disambiguation (F1)
Zhao et al. (2009) (global)                  77.7
Björkelund et al. (2009) (global)            78.6
Roth and Lapata (2016) (global)              79.4
Ours (Bi-LSTM + GCN) (local)                 82.5
52
Syntactic Graph Convolutional Networks
} Fast and simple
} Can be seamlessly applied to other tasks
Graph Convolutional Encoders for Syntax-aware Machine Translation
Joost Bastings, Ivan Titov, Wilker Aziz, Diego Marcheggiani, Khalil Sima'an.
In Proceedings of EMNLP, 2017.
Improvements on English to German and English to Czech translations
55
Multi-document Question Answering
[De Cao et al., 2018]
• Nodes are entities and edges are co-reference links
• Inference on a graph representing the document collection
57
Syntactic Graph Convolutional Networks
58
Outline
} Semantic Role Labeling
} Graph Convolutional Networks (GCN)
} Syntactic GCN for Semantic Role Labeling (SRL)
} SRL Model
} Exploiting Semantics in Neural Machine Translation with GCNs
61
Motivations [Marcheggiani et al., 2018]
John gave his wonderful wife a nice present .
Giver (John), Entity given to (his wonderful wife), Thing given (a nice present)
John gave a nice present to his wonderful wife .
Giver (John), Thing given (a nice present), Entity given to (his wonderful wife)
SRL helps to generalize over different surface realizations of the same underlying "meaning".
Lost in translation
65
Related work
} Semantics in statistical MT
} [Wu and Fung, 2009]
} [Liu and Gildea, 2010]
} [Aziz et al., 2011]
} ...
} Syntax in neural MT
} [Sennrich and Haddow, 2016]
} [Aharoni and Goldberg, 2017]
} [Bastings et al., 2017]
} ...
} Semantics in neural MT
} ???
[Marcheggiani et al., 2018]
66
Predicate-argument encoding
67
John gave his wonderful wife a nice present
Roles: Giver (A0), Thing given (A1), Entity given to (A2)
[Figure: two stacked semantic GCN layers over the sentence. Every word has a self-loop (W_self); the predicate "gave" is connected to its arguments through the role-specific weights W_A0, W_A1, W_A2, with W_A0', W_A1', W_A2' for the opposite direction.]
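The graph that a semantic GCN operates on can be built directly from the predicate-argument structure. The sketch below is illustrative only: the `frames` layout, the `semantic_edges` helper and the primed labels for the reverse direction are hypothetical, and arguments are attached at their head word.

```python
words = "John gave his wonderful wife a nice present".split()

# predicate index -> {role: index of the argument's head word}
frames = {1: {"A0": 0, "A1": 7, "A2": 4}}   # gave: A0 = John, A1 = present, A2 = wife


def semantic_edges(n_words, frames):
    """Return (source, target, label) edges: self-loops, role edges, reversed role edges."""
    edges = [(i, i, "self") for i in range(n_words)]
    for pred, args in frames.items():
        for role, arg in sorted(args.items()):
            edges.append((pred, arg, role))          # predicate -> argument
            edges.append((arg, pred, role + "'"))    # argument -> predicate
    return edges


edges = semantic_edges(len(words), frames)
for source, target, label in edges:
    if label != "self":
        print(f"{words[source]} -> {words[target]} [{label}]")
```

Both surface realizations of the giving event from the Motivations slide produce the same role edges, only attached at different word positions.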
Our Model
} Standard sequence2sequence with attention
} Semantic GCN encoder on top of a bidirectional RNN
} RNN decoder
[Marcheggiani et al., 2018]
68
Our model
[Figure: "John gave his wonderful wife a nice present" is encoded by a BiRNN/CNN followed by two semantic GCN layers (self-loops W_self; role edges W_A0, W_A1, W_A2 and their inverses W_A0', W_A1', W_A2'). An attention mechanism over the encoder states feeds an RNN decoder, which reads <bos> and emits "John".]
[Marcheggiani et al., 2018]
70
Experiments
} Data
} WMT'16 English-German dataset (~4.5 million sentence pairs)
} BLEU as evaluation measure
} Model
} Hyperparameters tuned on News Commentary En-De (~226K sentence pairs)
} GRU as RNN
[Marcheggiani et al., 2018]
71
Results
[Marcheggiani et al., 2018]
Full WMT 2016 English-German BLEU
BiRNN (Bastings et al., 2017)                    23.3
BiRNN + Syntactic GCN (Bastings et al., 2017)    23.9
BiRNN + Semantic GCN                             24.5   (+1.2 BLEU over BiRNN)
BiRNN + Syntactic GCN + Semantic GCN             24.9   (+1.6 BLEU over BiRNN)
} Semantics is helpful
} Syntax and semantics are complementary
78
Analysis [Marcheggiani et al., 2018]
SOURCE: John sold the car to Mark .
Roles: Seller (John), Thing sold (the car), Buyer (Mark)
BiRNN: John verkaufte das Auto nach Mark .
SEM GCN: John verkaufte das Auto an Mark .
BiRNN mistranslates "to" as "nach" (directionality)
SOURCE: The boy walking down the dusty road is drinking a beer
Roles: Walker (The boy), AM-DIR (down the dusty road); Drinker (The boy), Liquid (a beer)
BiRNN: Der Junge zu Fuß die staubige Straße ist ein Bier trinken .
SEM GCN: Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier .
81
SOURCE: The boy sitting on a bench in the park plays chess .
Roles: Thing sitting (The boy), Location (on a bench), AM-LOC (in the park); Player (The boy), Game (chess)
BiRNN: Der Junge auf einer Bank im Park spielt Schach .
SEM GCN: Der Junge sitzt auf einer Bank im Park Schach .
Both translations are wrong, but the BiRNN's one is grammatically correct
84
Conclusion
} GCNs for encoding linguistic structures into NN
} Semantics, coreference, discourse
} Fast
} Cheap
} State-of-the-art model for dependency-based SRL
} First to exploit semantics in NMT
85
Roadmap
90
Including structured bias into neural NLP models
Low-resource setting
Long-range dependencies
Document level
Cross-document level
Integrating external knowledge
i.e., knowledge graphs
Thanks for your attention!
All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...
All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...
GreeceJS
 
IA3_presentation.pptx
IA3_presentation.pptxIA3_presentation.pptx
IA3_presentation.pptx
KtonNguyn2
 
Bootstrapping Entity Alignment with Knowledge Graph Embedding
Bootstrapping Entity Alignment with Knowledge Graph EmbeddingBootstrapping Entity Alignment with Knowledge Graph Embedding
Bootstrapping Entity Alignment with Knowledge Graph Embedding
Nanjing University
 
PGQL: A Language for Graphs
PGQL: A Language for GraphsPGQL: A Language for Graphs
PGQL: A Language for Graphs
Jean Ihm
 
Co-Learning: Consensus-based Learning for Multi-Agent Systems
 Co-Learning: Consensus-based Learning for Multi-Agent Systems Co-Learning: Consensus-based Learning for Multi-Agent Systems
Co-Learning: Consensus-based Learning for Multi-Agent Systems
Miguel Rebollo
 
Weaviate Air #3 - New in AI segment.pdf
Weaviate Air #3 - New in AI segment.pdfWeaviate Air #3 - New in AI segment.pdf
Weaviate Air #3 - New in AI segment.pdf
ConnorShorten2
 
Introduction of IPv6NET in Tridentcom 2014
Introduction of IPv6NET in Tridentcom 2014Introduction of IPv6NET in Tridentcom 2014
Introduction of IPv6NET in Tridentcom 2014
Marius Georgescu
 
High-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingHigh-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and Modeling
Nesreen K. Ahmed
 
On the value of Sampling and Pruning for SBSE
On the value of Sampling and Pruning for SBSEOn the value of Sampling and Pruning for SBSE
On the value of Sampling and Pruning for SBSE
Jianfeng Chen
 
_b65e7611894ba175de27bd14793f894a_15UnionFind.pdf
_b65e7611894ba175de27bd14793f894a_15UnionFind.pdf_b65e7611894ba175de27bd14793f894a_15UnionFind.pdf
_b65e7611894ba175de27bd14793f894a_15UnionFind.pdf
sulimanalwageh
 
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
cvpaper. challenge
 
JOSA TechTalks - Machine Learning on Graph-Structured Data
JOSA TechTalks - Machine Learning on Graph-Structured DataJOSA TechTalks - Machine Learning on Graph-Structured Data
JOSA TechTalks - Machine Learning on Graph-Structured Data
Jordan Open Source Association
 
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
Taeksoo Kim
 
A Study of the Similarities of Entity Embeddings Learned from Different Aspec...
A Study of the Similarities of Entity Embeddings Learned from Different Aspec...A Study of the Similarities of Entity Embeddings Learned from Different Aspec...
A Study of the Similarities of Entity Embeddings Learned from Different Aspec...
GUANGYUAN PIAO
 
Syntactic Mediation in Grid and Web Service Architectures
Syntactic Mediation in Grid and Web Service ArchitecturesSyntactic Mediation in Grid and Web Service Architectures
Syntactic Mediation in Grid and Web Service Architectures
Martin Szomszor
 
Chapter 4 Mathematical Functions Character and string
Chapter 4 Mathematical Functions Character and stringChapter 4 Mathematical Functions Character and string
Chapter 4 Mathematical Functions Character and string
AhsirYu
 
Chaos Testing with F# and Azure by Rachel Reese at Codemotion Dubai
Chaos Testing with F# and Azure by Rachel Reese at Codemotion DubaiChaos Testing with F# and Azure by Rachel Reese at Codemotion Dubai
Chaos Testing with F# and Azure by Rachel Reese at Codemotion Dubai
Codemotion Dubai
 
All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...
All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...
All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...
GreeceJS
 
IA3_presentation.pptx
IA3_presentation.pptxIA3_presentation.pptx
IA3_presentation.pptx
KtonNguyn2
 
Bootstrapping Entity Alignment with Knowledge Graph Embedding
Bootstrapping Entity Alignment with Knowledge Graph EmbeddingBootstrapping Entity Alignment with Knowledge Graph Embedding
Bootstrapping Entity Alignment with Knowledge Graph Embedding
Nanjing University
 
PGQL: A Language for Graphs
PGQL: A Language for GraphsPGQL: A Language for Graphs
PGQL: A Language for Graphs
Jean Ihm
 
Co-Learning: Consensus-based Learning for Multi-Agent Systems
 Co-Learning: Consensus-based Learning for Multi-Agent Systems Co-Learning: Consensus-based Learning for Multi-Agent Systems
Co-Learning: Consensus-based Learning for Multi-Agent Systems
Miguel Rebollo
 
Weaviate Air #3 - New in AI segment.pdf
Weaviate Air #3 - New in AI segment.pdfWeaviate Air #3 - New in AI segment.pdf
Weaviate Air #3 - New in AI segment.pdf
ConnorShorten2
 
Introduction of IPv6NET in Tridentcom 2014
Introduction of IPv6NET in Tridentcom 2014Introduction of IPv6NET in Tridentcom 2014
Introduction of IPv6NET in Tridentcom 2014
Marius Georgescu
 
Ad

Recently uploaded (20)

Google DeepMind’s New AI Coding Agent AlphaEvolve.pdf
Google DeepMind’s New AI Coding Agent AlphaEvolve.pdfGoogle DeepMind’s New AI Coding Agent AlphaEvolve.pdf
Google DeepMind’s New AI Coding Agent AlphaEvolve.pdf
derrickjswork
 
Sustainable_Development_Goals_INDIANWraa
Sustainable_Development_Goals_INDIANWraaSustainable_Development_Goals_INDIANWraa
Sustainable_Development_Goals_INDIANWraa
03ANMOLCHAURASIYA
 
How Top Companies Benefit from Outsourcing
How Top Companies Benefit from OutsourcingHow Top Companies Benefit from Outsourcing
How Top Companies Benefit from Outsourcing
Nascenture
 
May Patch Tuesday
May Patch TuesdayMay Patch Tuesday
May Patch Tuesday
Ivanti
 
Config 2025 presentation recap covering both days
Config 2025 presentation recap covering both daysConfig 2025 presentation recap covering both days
Config 2025 presentation recap covering both days
TrishAntoni1
 
Dark Dynamism: drones, dark factories and deurbanization
Dark Dynamism: drones, dark factories and deurbanizationDark Dynamism: drones, dark factories and deurbanization
Dark Dynamism: drones, dark factories and deurbanization
Jakub Šimek
 
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdfKit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Wonjun Hwang
 
MULTI-STAKEHOLDER CONSULTATION PROGRAM On Implementation of DNF 2.0 and Way F...
MULTI-STAKEHOLDER CONSULTATION PROGRAM On Implementation of DNF 2.0 and Way F...MULTI-STAKEHOLDER CONSULTATION PROGRAM On Implementation of DNF 2.0 and Way F...
MULTI-STAKEHOLDER CONSULTATION PROGRAM On Implementation of DNF 2.0 and Way F...
ICT Frame Magazine Pvt. Ltd.
 
Limecraft Webinar - 2025.3 release, featuring Content Delivery, Graphic Conte...
Limecraft Webinar - 2025.3 release, featuring Content Delivery, Graphic Conte...Limecraft Webinar - 2025.3 release, featuring Content Delivery, Graphic Conte...
Limecraft Webinar - 2025.3 release, featuring Content Delivery, Graphic Conte...
Maarten Verwaest
 
Who's choice? Making decisions with and about Artificial Intelligence, Keele ...
Who's choice? Making decisions with and about Artificial Intelligence, Keele ...Who's choice? Making decisions with and about Artificial Intelligence, Keele ...
Who's choice? Making decisions with and about Artificial Intelligence, Keele ...
Alan Dix
 
Secondary Storage for a microcontroller system
Secondary Storage for a microcontroller systemSecondary Storage for a microcontroller system
Secondary Storage for a microcontroller system
fizarcse
 
accessibility Considerations during Design by Rick Blair, Schneider Electric
accessibility Considerations during Design by Rick Blair, Schneider Electricaccessibility Considerations during Design by Rick Blair, Schneider Electric
accessibility Considerations during Design by Rick Blair, Schneider Electric
UXPA Boston
 
Understanding SEO in the Age of AI.pdf
Understanding SEO in the Age of AI.pdfUnderstanding SEO in the Age of AI.pdf
Understanding SEO in the Age of AI.pdf
Fulcrum Concepts, LLC
 
DevOpsDays SLC - Platform Engineers are Product Managers.pptx
DevOpsDays SLC - Platform Engineers are Product Managers.pptxDevOpsDays SLC - Platform Engineers are Product Managers.pptx
DevOpsDays SLC - Platform Engineers are Product Managers.pptx
Justin Reock
 
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Wonjun Hwang
 
Distributionally Robust Statistical Verification with Imprecise Neural Networks
Distributionally Robust Statistical Verification with Imprecise Neural NetworksDistributionally Robust Statistical Verification with Imprecise Neural Networks
Distributionally Robust Statistical Verification with Imprecise Neural Networks
Ivan Ruchkin
 
Digital Technologies for Culture, Arts and Heritage: Insights from Interdisci...
Digital Technologies for Culture, Arts and Heritage: Insights from Interdisci...Digital Technologies for Culture, Arts and Heritage: Insights from Interdisci...
Digital Technologies for Culture, Arts and Heritage: Insights from Interdisci...
Vasileios Komianos
 
Refactoring meta-rauc-community: Cleaner Code, Better Maintenance, More Machines
Refactoring meta-rauc-community: Cleaner Code, Better Maintenance, More MachinesRefactoring meta-rauc-community: Cleaner Code, Better Maintenance, More Machines
Refactoring meta-rauc-community: Cleaner Code, Better Maintenance, More Machines
Leon Anavi
 
Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?
Eric Torreborre
 
machines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdfmachines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdf
AmirStern2
 
Google DeepMind’s New AI Coding Agent AlphaEvolve.pdf
Google DeepMind’s New AI Coding Agent AlphaEvolve.pdfGoogle DeepMind’s New AI Coding Agent AlphaEvolve.pdf
Google DeepMind’s New AI Coding Agent AlphaEvolve.pdf
derrickjswork
 
Sustainable_Development_Goals_INDIANWraa
Sustainable_Development_Goals_INDIANWraaSustainable_Development_Goals_INDIANWraa
Sustainable_Development_Goals_INDIANWraa
03ANMOLCHAURASIYA
 
How Top Companies Benefit from Outsourcing
How Top Companies Benefit from OutsourcingHow Top Companies Benefit from Outsourcing
How Top Companies Benefit from Outsourcing
Nascenture
 
May Patch Tuesday
May Patch TuesdayMay Patch Tuesday
May Patch Tuesday
Ivanti
 
Config 2025 presentation recap covering both days
Config 2025 presentation recap covering both daysConfig 2025 presentation recap covering both days
Config 2025 presentation recap covering both days
TrishAntoni1
 
Dark Dynamism: drones, dark factories and deurbanization
Dark Dynamism: drones, dark factories and deurbanizationDark Dynamism: drones, dark factories and deurbanization
Dark Dynamism: drones, dark factories and deurbanization
Jakub Šimek
 
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdfKit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Wonjun Hwang
 
MULTI-STAKEHOLDER CONSULTATION PROGRAM On Implementation of DNF 2.0 and Way F...
MULTI-STAKEHOLDER CONSULTATION PROGRAM On Implementation of DNF 2.0 and Way F...MULTI-STAKEHOLDER CONSULTATION PROGRAM On Implementation of DNF 2.0 and Way F...
MULTI-STAKEHOLDER CONSULTATION PROGRAM On Implementation of DNF 2.0 and Way F...
ICT Frame Magazine Pvt. Ltd.
 
Limecraft Webinar - 2025.3 release, featuring Content Delivery, Graphic Conte...
Limecraft Webinar - 2025.3 release, featuring Content Delivery, Graphic Conte...Limecraft Webinar - 2025.3 release, featuring Content Delivery, Graphic Conte...
Limecraft Webinar - 2025.3 release, featuring Content Delivery, Graphic Conte...
Maarten Verwaest
 
Who's choice? Making decisions with and about Artificial Intelligence, Keele ...
Who's choice? Making decisions with and about Artificial Intelligence, Keele ...Who's choice? Making decisions with and about Artificial Intelligence, Keele ...
Who's choice? Making decisions with and about Artificial Intelligence, Keele ...
Alan Dix
 
Secondary Storage for a microcontroller system
Secondary Storage for a microcontroller systemSecondary Storage for a microcontroller system
Secondary Storage for a microcontroller system
fizarcse
 
accessibility Considerations during Design by Rick Blair, Schneider Electric
accessibility Considerations during Design by Rick Blair, Schneider Electricaccessibility Considerations during Design by Rick Blair, Schneider Electric
accessibility Considerations during Design by Rick Blair, Schneider Electric
UXPA Boston
 
Understanding SEO in the Age of AI.pdf
Understanding SEO in the Age of AI.pdfUnderstanding SEO in the Age of AI.pdf
Understanding SEO in the Age of AI.pdf
Fulcrum Concepts, LLC
 
DevOpsDays SLC - Platform Engineers are Product Managers.pptx
DevOpsDays SLC - Platform Engineers are Product Managers.pptxDevOpsDays SLC - Platform Engineers are Product Managers.pptx
DevOpsDays SLC - Platform Engineers are Product Managers.pptx
Justin Reock
 
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Wonjun Hwang
 
Distributionally Robust Statistical Verification with Imprecise Neural Networks
Distributionally Robust Statistical Verification with Imprecise Neural NetworksDistributionally Robust Statistical Verification with Imprecise Neural Networks
Distributionally Robust Statistical Verification with Imprecise Neural Networks
Ivan Ruchkin
 
Digital Technologies for Culture, Arts and Heritage: Insights from Interdisci...
Digital Technologies for Culture, Arts and Heritage: Insights from Interdisci...Digital Technologies for Culture, Arts and Heritage: Insights from Interdisci...
Digital Technologies for Culture, Arts and Heritage: Insights from Interdisci...
Vasileios Komianos
 
Refactoring meta-rauc-community: Cleaner Code, Better Maintenance, More Machines
Refactoring meta-rauc-community: Cleaner Code, Better Maintenance, More MachinesRefactoring meta-rauc-community: Cleaner Code, Better Maintenance, More Machines
Refactoring meta-rauc-community: Cleaner Code, Better Maintenance, More Machines
Leon Anavi
 
Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?
Eric Torreborre
 
machines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdfmachines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdf
AmirStern2
 
Ad

Encoding Linguistic Structures with Graph Convolutional Networks

  • 1. Encoding Linguistic Structures with Graph Convolutional Networks. Diego Marcheggiani, joint work with Ivan Titov and Joost Bastings. University of Amsterdam, University of Edinburgh. @South England NLP Meetup
  • 2. Structured (Linguistic) Priors. [Figure: "Sequa makes and repairs jet engines." annotated with syntactic dependency arcs (SBJ, COORD, CONJ, OBJ, NMOD, ROOT) and semantic role arcs (creator, creation, repairer, entity repaired).] Second example sentence: “I voted for Palpatine because he was most aligned with my values,” she said.
  • 3. Sequence to Sequence [Sutskever et al., 2014]. [Figure: encoder-decoder that reads "the black cat" and emits "le chat noire <s>", with "<s> le chat noire" fed to the decoder.]
  • 4. Sequence to Sequence [Sutskever et al., 2014]: language is not (only) a sequence of words; we have linguistic knowledge.
  • 5. Sequence to Sequence: language is not (only) a sequence of words; we have linguistic knowledge. Encode structured linguistic knowledge into neural networks using Graph Convolutional Networks.
  • 6. Outline: Semantic Role Labeling; Graph Convolutional Networks (GCN); Syntactic GCN for Semantic Role Labeling (SRL); SRL Model; Exploiting Semantics in Neural Machine Translation with GCNs. Papers: "Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling", Diego Marcheggiani, Ivan Titov, in Proceedings of EMNLP, 2017. "Exploiting Semantics in Neural Machine Translation with Graph Convolutional Networks", Diego Marcheggiani, Joost Bastings, Ivan Titov, in Proceedings of NAACL-HLT, 2018.
  • 7. Semantic Role Labeling: predicting the predicate-argument structure of a sentence. Example: "Sequa makes and repairs jet engines."
  • 8. Semantic Role Labeling: predicting the predicate-argument structure of a sentence; discover and disambiguate predicates. Example: "Sequa makes and repairs jet engines." with predicates make.01 and repair.01.
  • 9. Semantic Role Labeling: predicting the predicate-argument structure of a sentence; discover and disambiguate predicates; identify arguments and label them with their semantic roles. Example: "Sequa makes and repairs jet engines." with predicates make.01 and repair.01 and role Creator.
  • 10. Semantic Role Labeling: predicting the predicate-argument structure of a sentence; discover and disambiguate predicates; identify arguments and label them with their semantic roles. Example: "Sequa makes and repairs jet engines." with predicates make.01 and repair.01 and roles Creator, Creation.
  • 11. Semantic Role Labeling: predicting the predicate-argument structure of a sentence; discover and disambiguate predicates; identify arguments and label them with their semantic roles. Example: "Sequa makes and repairs jet engines." with predicates make.01 and repair.01 and roles Creator, Creation, Entity repaired, Repairer.
  • 12. Semantic Role Labeling: only the head of an argument is labeled; sequence labeling task for each predicate; focus on argument identification and labeling. Example: "Sequa makes and repairs jet engines." with predicates make.01 and repair.01 and roles Creator, Creation, Entity repaired, Repairer.
  • 13. Semantic Role Labeling. Question answering: Narayanan and Harabagiu 2004; Shen and Lapata 2007; Khashabi et al. 2018. Machine translation: Wu and Fung 2009; Aziz et al. 2011. Information extraction: Surdeanu et al. 2003; Christensen et al. 2010.
  • 14. Related work: Tutorial on Semantic Role Labeling at EMNLP 2017.
  • 15. Related work: SRL systems that use syntax with simple NN architectures [FitzGerald et al., 2015; Roth and Lapata, 2016]; recent models ignore linguistic bias [Zhou and Xu, 2014; He et al., 2017; Marcheggiani et al., 2017]. Tutorial on Semantic Role Labeling at EMNLP 2017.
  • 16. Motivations: some semantic dependencies are mirrored in the syntactic graph. [Figure: "Sequa makes and repairs jet engines." with syntactic arcs (SBJ, COORD, CONJ, OBJ, NMOD, ROOT) and the semantic roles creator and creation.]
  • 17. Motivations: some semantic dependencies are mirrored in the syntactic graph, but not all of them; the syntax-semantics interface is not trivial. [Figure: the same sentence with syntactic arcs and the semantic roles creator, creation, repairer, entity repaired.]
  • 18. Outline: Semantic Role Labeling; Graph Convolutional Networks (GCN); Syntactic GCN for Semantic Role Labeling (SRL); SRL Model; Exploiting Semantics in Neural Machine Translation with GCNs.
  • 19. Graph Convolutional Networks (message passing). [Figure: undirected graph.] [Gori et al., 2005; Scarselli et al., 2009; Kipf and Welling, 2016]
  • 20. Graph Convolutional Networks (message passing). [Figure: undirected graph, update of the blue node.] [Gori et al., 2005; Scarselli et al., 2009; Kipf and Welling, 2016]
  • 21. Graph Convolutional Networks (message passing). [Figure: undirected graph, update of the blue node.] [Kipf and Welling, 2016] $h_i = \mathrm{ReLU}\left(W_0 h_i + \sum_{j \in \mathcal{N}(i)} W_1 h_j\right)$, where $W_0 h_i$ is the self-loop term and the sum runs over the neighborhood of node $i$.
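The update rule on this slide can be illustrated with a minimal sketch in plain Python. The helper names (`matvec`, `relu`, `gcn_layer`), the toy path graph, and the identity weight matrices are assumptions made for illustration and do not come from the slides.

```python
def matvec(W, h):
    """Multiply matrix W (a list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def relu(v):
    """Element-wise ReLU."""
    return [max(0.0, x) for x in v]


def gcn_layer(H, neighbors, W0, W1):
    """One message-passing step: h_i <- ReLU(W0 h_i + sum_{j in N(i)} W1 h_j)."""
    out = []
    for i, h_i in enumerate(H):
        total = matvec(W0, h_i)                 # self-loop term
        for j in neighbors[i]:                  # neighborhood term
            msg = matvec(W1, H[j])
            total = [a + b for a, b in zip(total, msg)]
        out.append(relu(total))
    return out


# Toy undirected path graph 0 - 1 - 2 with 2-dimensional node features.
H0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [1], 1: [0, 2], 2: [1]}
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical weights, chosen for readability
H1 = gcn_layer(H0, neighbors, IDENTITY, IDENTITY)
```

With identity weights each node simply sums its own features and those of its neighbors, which makes the effect of the neighborhood term easy to check by hand.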
  • 22. GCNs Pipeline [Kipf and Welling, 2016]: input $X = H^{(0)}$, hidden layers $H^{(1)}, H^{(2)}, \dots$, output $Z = H^{(n)}$. The input is the initial feature representation of the nodes; the output is a representation informed by each node's neighborhood.
  • 23. GCNs Pipeline [Kipf and Welling, 2016]: input $X = H^{(0)}$, hidden layers $H^{(1)}, H^{(2)}, \dots$, output $Z = H^{(n)}$. The input is the initial feature representation of the nodes; the output is a representation informed by each node's neighborhood. Extend GCNs for syntactic dependency trees.
  • 24. Outline: Semantic Role Labeling; Graph Convolutional Networks (GCN); Syntactic GCN for Semantic Role Labeling (SRL); SRL Model; Exploiting Semantics in Neural Machine Translation with GCNs.
  • 25. Example: "Lane disputed those estimates" with dependency arcs SBJ, OBJ, NMOD. [Marcheggiani and Titov, 2017]
  • 26. Example: "Lane disputed those estimates" (SBJ, OBJ, NMOD). Self-loops: each word representation is multiplied by $W^{(1)}_{\mathrm{self}}$, and each node applies $\mathrm{ReLU}(\Sigma\,\cdot)$ to its incoming messages. [Marcheggiani and Titov, 2017]
  • 27. Example: "Lane disputed those estimates" (SBJ, OBJ, NMOD). Self-loops use $W^{(1)}_{\mathrm{self}}$; messages along the arcs use $W^{(1)}_{\mathrm{subj}}$, $W^{(1)}_{\mathrm{obj}}$, $W^{(1)}_{\mathrm{nmod}}$; each node applies $\mathrm{ReLU}(\Sigma\,\cdot)$. [Marcheggiani and Titov, 2017]
  • 28. Example: "Lane disputed those estimates" (SBJ, OBJ, NMOD). Self-loops use $W^{(1)}_{\mathrm{self}}$; messages along the arcs use $W^{(1)}_{\mathrm{subj}}$, $W^{(1)}_{\mathrm{obj}}$, $W^{(1)}_{\mathrm{nmod}}$; messages in the opposite direction use $W^{(1)}_{\mathrm{subj}'}$, $W^{(1)}_{\mathrm{obj}'}$, $W^{(1)}_{\mathrm{nmod}'}$; each node applies $\mathrm{ReLU}(\Sigma\,\cdot)$. [Marcheggiani and Titov, 2017]
  • 29. Example: "Lane disputed those estimates" (SBJ, OBJ, NMOD). Self-loops use $W^{(1)}_{\mathrm{self}}$; messages along the arcs use $W^{(1)}_{\mathrm{subj}}$, $W^{(1)}_{\mathrm{obj}}$, $W^{(1)}_{\mathrm{nmod}}$; messages in the opposite direction use $W^{(1)}_{\mathrm{subj}'}$, $W^{(1)}_{\mathrm{obj}'}$, $W^{(1)}_{\mathrm{nmod}'}$; each node applies $\mathrm{ReLU}(\Sigma\,\cdot)$. [Marcheggiani and Titov, 2017]
  • 30. Example: "Lane disputed those estimates" (SBJ, OBJ, NMOD). Self-loops use $W^{(1)}_{\mathrm{self}}$; messages along the arcs use $W^{(1)}_{\mathrm{subj}}$, $W^{(1)}_{\mathrm{obj}}$, $W^{(1)}_{\mathrm{nmod}}$; messages in the opposite direction use $W^{(1)}_{\mathrm{subj}'}$, $W^{(1)}_{\mathrm{obj}'}$, $W^{(1)}_{\mathrm{nmod}'}$; each node applies $\mathrm{ReLU}(\Sigma\,\cdot)$. [Marcheggiani and Titov, 2017]
  • 31. Example: "Lane disputed those estimates" (SBJ, OBJ, NMOD). A second GCN layer is stacked on the first, with its own parameters $W^{(2)}_{\mathrm{self}}$, $W^{(2)}_{\mathrm{subj}}$, $W^{(2)}_{\mathrm{obj}}$, $W^{(2)}_{\mathrm{nmod}}$, $W^{(2)}_{\mathrm{subj}'}$, $W^{(2)}_{\mathrm{obj}'}$, $W^{(2)}_{\mathrm{nmod}'}$ and its own $\mathrm{ReLU}(\Sigma\,\cdot)$ at each node. Stacking GCNs widens the syntactic neighborhood. [Marcheggiani and Titov, 2017]
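The claim that stacking GCN layers widens the syntactic neighborhood can be checked with a small sketch that tracks only which words can influence a node after a given number of layers. The function name `receptive_field` and the word indices are hypothetical; the arcs (SBJ: disputed to Lane, OBJ: disputed to estimates, NMOD: estimates to those) are a reading of the slide figure, with messages assumed to flow in both directions.

```python
def receptive_field(neighbors, node, num_layers):
    """Set of nodes whose input features can reach `node` after num_layers GCN layers."""
    reached = {node}  # the self-loop keeps the node itself in the set
    for _ in range(num_layers):
        reached = reached | {u for v in reached for u in neighbors[v]}
    return reached


# 0 = Lane, 1 = disputed, 2 = those, 3 = estimates
neighbors = {0: [1], 1: [0, 3], 2: [3], 3: [1, 2]}

one_layer = receptive_field(neighbors, 0, 1)     # Lane sees itself and "disputed"
two_layers = receptive_field(neighbors, 0, 2)    # ... plus "estimates"
three_layers = receptive_field(neighbors, 0, 3)  # ... plus "those"
```

Each additional layer extends the reach by one hop in the dependency tree, so K layers cover syntactic paths of length up to K.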
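  • 32. Syntactic GCNs: $h_v^{(k+1)} = \mathrm{ReLU}\left(\sum_{u \in \mathcal{N}(v)} W^{(k)}_{L(u,v)} h_u^{(k)} + b^{(k)}_{L(u,v)}\right)$ [Marcheggiani and Titov, 2017]
  • 33. Syntactic GCNs: $h_v^{(k+1)} = \mathrm{ReLU}\left(\sum_{u \in \mathcal{N}(v)} W^{(k)}_{L(u,v)} h_u^{(k)} + b^{(k)}_{L(u,v)}\right)$, where $\mathcal{N}(v)$ is the syntactic neighborhood of $v$. [Marcheggiani and Titov, 2017]
  • 34. Syntactic GCNs: $h_v^{(k+1)} = \mathrm{ReLU}\left(\sum_{u \in \mathcal{N}(v)} W^{(k)}_{L(u,v)} h_u^{(k)} + b^{(k)}_{L(u,v)}\right)$, where $\mathcal{N}(v)$ is the syntactic neighborhood of $v$ and $W^{(k)}_{L(u,v)} h_u^{(k)} + b^{(k)}_{L(u,v)}$ is the message from $u$ to $v$. [Marcheggiani and Titov, 2017]
  • 35. Syntactic GCNs: $h_v^{(k+1)} = \mathrm{ReLU}\left(\sum_{u \in \mathcal{N}(v)} W^{(k)}_{L(u,v)} h_u^{(k)} + b^{(k)}_{L(u,v)}\right)$, where $\mathcal{N}(v)$ is the syntactic neighborhood of $v$ and $W^{(k)}_{L(u,v)} h_u^{(k)} + b^{(k)}_{L(u,v)}$ is the message from $u$ to $v$. The self-loop is included in $\mathcal{N}$. Messages are direction and label specific. [Marcheggiani and Titov, 2017]
  • 36. Syntactic GCNs: $h_v^{(k+1)} = \mathrm{ReLU}\left(\sum_{u \in \mathcal{N}(v)} W^{(k)}_{L(u,v)} h_u^{(k)} + b^{(k)}_{L(u,v)}\right)$, where $\mathcal{N}(v)$ is the syntactic neighborhood of $v$ and $W^{(k)}_{L(u,v)} h_u^{(k)} + b^{(k)}_{L(u,v)}$ is the message from $u$ to $v$. The self-loop is included in $\mathcal{N}$. Messages are direction and label specific. Overparametrized: one matrix for each label-direction pair, so the matrices are tied by direction only: $W^{(k)}_{L(u,v)} = V^{(k)}_{\mathrm{dir}(u,v)}$. [Marcheggiani and Titov, 2017]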
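A minimal sketch of one syntactic GCN layer under the direction-only tying W_{L(u,v)} = V_{dir(u,v)}, with biases kept label specific. The direction keys ("self", "along", "opposite"), the primed label convention for reversed arcs, the zero default for missing biases, and the toy features are all assumptions made for illustration.

```python
def matvec(W, h):
    """Multiply matrix W (a list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def syntactic_gcn_layer(H, arcs, V, b):
    """h_v <- ReLU(sum_{u in N(v)} V[dir(u,v)] h_u + b[L(u,v)]).

    H:    list of node vectors
    arcs: list of (head, dependent, label) dependency arcs
    V:    dict direction -> weight matrix (direction-specific weights)
    b:    dict label -> bias vector (label-specific; missing labels default to zero)
    """
    d_out = len(V["self"])
    sums = [[0.0] * d_out for _ in H]
    # Every node sends a message to itself (the self-loop is part of N(v)).
    messages = [(v, v, "self", "self") for v in range(len(H))]
    for head, dep, label in arcs:
        messages.append((head, dep, "along", label))           # head -> dependent
        messages.append((dep, head, "opposite", label + "'"))  # dependent -> head
    for u, v, direction, label in messages:
        msg = matvec(V[direction], H[u])
        bias = b.get(label, [0.0] * d_out)
        sums[v] = [s + m + c for s, m, c in zip(sums[v], msg, bias)]
    return [[max(0.0, x) for x in row] for row in sums]


# "Lane disputed those estimates": 0 = Lane, 1 = disputed, 2 = those, 3 = estimates
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
arcs = [(1, 0, "sbj"), (1, 3, "obj"), (3, 2, "nmod")]
I2 = [[1.0, 0.0], [0.0, 1.0]]
Z2 = [[0.0, 0.0], [0.0, 0.0]]

both_directions = syntactic_gcn_layer(H, arcs, {"self": I2, "along": I2, "opposite": I2}, {})
head_to_dep_only = syntactic_gcn_layer(H, arcs, {"self": I2, "along": I2, "opposite": Z2}, {})
```

Zeroing the "opposite" matrix shows the direction specificity: heads stop receiving information from their dependents, while dependents still hear from their heads.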
  • 37. Edge-wise Gates: not all edges are equally important for the final task. [Marcheggiani and Titov, 2017]
  • 38. Edge-wise Gates: not all edges are equally important for the final task; we should not blindly rely on predicted syntax. [Marcheggiani and Titov, 2017]
  • 39. Edge-wise Gates: not all edges are equally important for the final task; we should not blindly rely on predicted syntax; gates decide the “importance” of each message. [Figure: "Lane disputed those estimates" (SBJ, OBJ, NMOD) with a gate g on every message entering each $\mathrm{ReLU}(\Sigma\,\cdot)$ node.] [Marcheggiani and Titov, 2017]
  • 40. Edge-wise Gates: not all edges are equally important for the final task; we should not blindly rely on predicted syntax; gates decide the “importance” of each message. Gates depend on nodes and edges. [Figure: "Lane disputed those estimates" (SBJ, OBJ, NMOD) with a gate g on every message entering each $\mathrm{ReLU}(\Sigma\,\cdot)$ node.] [Marcheggiani and Titov, 2017]
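The slides say only that gates depend on nodes and edges and weigh each message. One possible form, assumed here rather than taken from the slides, is a scalar sigmoid gate computed from the sending node's vector plus edge-specific parameters; the names `edge_gate`, `gated_message`, `gate_vec`, and `gate_bias` are hypothetical.

```python
import math


def matvec(W, h):
    """Multiply matrix W (a list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def edge_gate(h_u, gate_vec, gate_bias):
    """Scalar gate in (0, 1): depends on the sending node (h_u) and on
    edge-specific parameters (gate_vec, gate_bias). Assumed form."""
    return sigmoid(sum(a * b for a, b in zip(h_u, gate_vec)) + gate_bias)


def gated_message(h_u, W, bias, gate_vec, gate_bias):
    """Message W h_u + bias, scaled by its edge gate before summation at the receiver."""
    g = edge_gate(h_u, gate_vec, gate_bias)
    msg = [m + c for m, c in zip(matvec(W, h_u), bias)]
    return [g * m for m in msg]


I2 = [[1.0, 0.0], [0.0, 1.0]]
half_open = gated_message([1.0, 2.0], I2, [0.0, 0.0], [0.0, 0.0], 0.0)
nearly_closed = gated_message([1.0, 2.0], I2, [0.0, 0.0], [0.0, 0.0], -20.0)
```

A gate near zero effectively removes an edge, which is one way a model could learn to discount unreliable arcs from a predicted parse.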
  • 41. Outline: Semantic Role Labeling; Graph Convolutional Networks (GCN); Syntactic GCN for Semantic Role Labeling (SRL); SRL Model; Exploiting Semantics in Neural Machine Translation with GCNs.
  • 42. Our Model: word representation; bidirectional LSTM encoder; GCN encoder; local role classifier. [Marcheggiani and Titov, 2017]
  • 43. Word Representation: pretrained word embeddings; word embeddings; POS tag embeddings; predicate lemma embeddings. [Figure: "Lane disputed those estimates" with the word representation layer.] [Marcheggiani and Titov, 2017]
  • 44. BiLSTM Encoder: encode each word with its left and right context; stacked BiLSTM. [Figure: "Lane disputed those estimates", word representation, J layers of BiLSTM.] [Marcheggiani and Titov, 2017]
  • 45. GCNs Encoder: syntactic GCNs after the BiLSTM encoder; add syntactic information; skip connections; longer dependencies are captured. [Figure: "Lane disputed those estimates", word representation, J layers of BiLSTM, K layers of GCN over the arcs nsubj, dobj, nmod.] [Marcheggiani and Titov, 2017]
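  • 46. Semantic Role Classifier: local log-linear classifier over the predicate representation $t_p$ and the candidate argument representation $t_i$, given the predicate lemma $l$: $p(r \mid t_i, t_p, l) \propto \exp\left(W_{l,r}(t_i \circ t_p)\right)$. [Figure: "Lane disputed those estimates", word representation, J layers of BiLSTM, K layers of GCN over the arcs nsubj, dobj, nmod, classifier output A1.]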
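The local log-linear classifier on this slide can be sketched as a softmax over role scores. The operator between t_i and t_p is lost in the transcript; this sketch assumes it is vector concatenation. How the weights W_{l,r} are produced is not shown on the slide, so `W_l` is simply a hypothetical dict from role to weight vector for one predicate lemma.

```python
import math


def role_distribution(t_i, t_p, W_l):
    """p(r | t_i, t_p, l) proportional to exp(W_{l,r} . (t_i concatenated with t_p)).

    t_i: candidate argument representation (list of floats)
    t_p: predicate representation (list of floats)
    W_l: dict role -> weight vector for predicate lemma l (hypothetical parameters)
    """
    x = t_i + t_p  # concatenation, an assumption about the lost operator
    scores = {r: sum(w * v for w, v in zip(w_r, x)) for r, w_r in W_l.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {r: math.exp(s - top) for r, s in scores.items()}
    total = sum(exps.values())
    return {r: e / total for r, e in exps.items()}


t_i = [1.0, 0.0]  # toy argument vector
t_p = [0.0, 1.0]  # toy predicate vector
W_l = {"A0": [2.0, 0.0, 0.0, 0.0], "A1": [0.0, 0.0, 0.0, 0.0]}
probs = role_distribution(t_i, t_p, W_l)
best_role = max(probs, key=probs.get)
```

The classifier is local in the sense that each candidate word is scored independently for a given predicate, with no global constraint across arguments.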
  • 47. Experiments. Data: CoNLL-2009 dataset, English and Chinese; F1 evaluation measure. Model: hyperparameters tuned on the English development set; state-of-the-art predicate disambiguation models. [Marcheggiani and Titov, 2017]
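For reference, F1 is the harmonic mean of precision and recall. A minimal sketch follows; the function name and the counts are made up for illustration and are not taken from the experiments.

```python
def f1_score(num_correct, num_predicted, num_gold):
    """F1 = 2PR / (P + R), with P = correct / predicted and R = correct / gold."""
    if num_correct == 0:
        return 0.0  # also avoids division by zero when nothing is correct
    precision = num_correct / num_predicted
    recall = num_correct / num_gold
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 correct labels out of 10 predicted, 16 in the gold annotation.
example_f1 = f1_score(8, 10, 16)
```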
  • 48. Ablation Experiments (Dev set), SRL w/o predicate disambiguation, F1. English: BiLSTM 82.7, GCN 83.3. Chinese: BiLSTM 75.2, GCN 77.1. [Marcheggiani and Titov, 2017]
  • 49. English Test Set, SRL with predicate disambiguation, F1. FitzGerald et al. (2015) (global): 87.3. Roth and Lapata (2016) (global): 87.7. Marcheggiani et al. (2017, CoNLL) (local): 87.7. Ours (Bi-LSTM + GCN) (local): 88.0. [Marcheggiani and Titov, 2017]
  • 50. English Out of Domain, SRL with predicate disambiguation, F1. FitzGerald et al. (2015) (global): 75.2. Roth and Lapata (2016) (global): 76.1. Marcheggiani et al. (2017, CoNLL) (local): 77.7. Ours (Bi-LSTM + GCN) (local): 77.2. [Marcheggiani and Titov, 2017]
  • 51. English Test Set (Ensemble), SRL with predicate disambiguation, F1. FitzGerald et al. (2015) (ensemble): 87.7. Roth and Lapata (2016) (ensemble): 87.9. Ours (Bi-LSTM + GCN) (ensemble): 89.1. [Marcheggiani and Titov, 2017]
  • 52. Chinese Test Set, SRL with predicate disambiguation, F1. Zhao et al. (2009) (global): 77.7. Björkelund et al. (2009) (global): 78.6. Roth and Lapata (2016) (global): 79.4. Ours (Bi-LSTM + GCN) (local): 82.5. [Marcheggiani and Titov, 2017]
  • 53. Syntactic Graph Convolutional Networks 53 } Fast and simple } Can be seamlessly applied to other tasks
  • 54. Syntactic Graph Convolutional Networks 54 } Fast and simple } Can be seamlessly applied to other tasks Graph Convolutional Encoders for Syntax-aware Machine Translation Joost Bastings,IvanTitov,Wilker Aziz,Diego Marcheggiani,Khalil Sima'an. In Proceedings of EMNLP, 2017.
  • 55. Syntactic Graph Convolutional Networks. Fast and simple; can be seamlessly applied to other tasks. Graph Convolutional Encoders for Syntax-aware Machine Translation. Joost Bastings, Ivan Titov, Wilker Aziz, Diego Marcheggiani, Khalil Sima'an. In Proceedings of EMNLP, 2017. Improvements on English-to-German and English-to-Czech translation.
  • 56. Multi-document Question Answering [De Cao et al., 2018]. Nodes are entities and edges are co-reference links; inference runs on a graph representing the document collection.
  • 61. Outline: Semantic Role Labeling; Graph Convolutional Networks (GCN); Syntactic GCN for Semantic Role Labeling (SRL); SRL Model; Exploiting Semantics in Neural Machine Translation with GCNs.
  • 63. Motivations. SRL helps to generalize over different surface realizations of the same underlying "meaning". Examples: "John [Giver] gave his wonderful wife [Entity given to] a nice present [Thing given]." and "John [Giver] gave a nice present [Thing given] to his wonderful wife [Entity given to]." [Marcheggiani et al., 2018]
  • 66. Related work. Semantics in statistical MT: [Wu and Fung, 2009], [Liu and Gildea, 2010], [Aziz et al., 2011], ... Syntax in neural MT: [Sennrich and Haddow, 2016], [Aharoni and Goldberg, 2017], [Bastings et al., 2017], ... Semantics in neural MT: ??? [Marcheggiani et al., 2018]
  • 67. Predicate-argument encoding. [Figure: two stacked Semantic GCN layers over "John gave his wonderful wife a nice present". Every word has a self-loop with weight W_self; predicate-argument edges use the role-specific weights W_A0, W_A1, W_A2 and the inverse-direction weights W_A0', W_A1', W_A2'. Role labels: Giver, Thing given, Entity given to.]
  • 68. Our Model. Standard sequence-to-sequence with attention; Semantic GCN encoder on top of a bidirectional RNN; RNN decoder. [Marcheggiani et al., 2018]
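To make the encoder idea concrete, here is a minimal sketch of one semantic GCN layer over predicate-argument edges, using the weight names from the slides (W_self, W_A0, W_A1, W_A2 and their primed inverses). It is a simplification under stated assumptions: plain per-label matrices, a single shared bias, ReLU activation, and random vectors standing in for BiRNN states. The direction convention (unprimed weights for predicate-to-argument messages) and all function and variable names are hypothetical.

```python
import numpy as np


def semantic_gcn_layer(H, edges, W, b):
    """One simplified GCN layer over labelled predicate-argument edges.

    H: word states (e.g. from a BiRNN), shape (n, d)
    edges: list of (predicate_index, argument_index, role_label)
    W: dict mapping "self", role labels, and primed inverse labels to (d, d)
    b: shared bias vector, shape (d,)
    """
    out = H @ W["self"].T                          # self-loop for every word
    for pred, arg, role in edges:
        out[arg] += W[role] @ H[pred]              # predicate -> argument
        out[pred] += W[role + "'"] @ H[arg]        # argument -> predicate
    return np.maximum(out + b, 0.0)                # ReLU


tokens = "John gave his wonderful wife a nice present".split()
# Assumed role edges for "gave": A0 = John, A1 = present, A2 = wife.
edges = [(1, 0, "A0"), (1, 7, "A1"), (1, 4, "A2")]

rng = np.random.default_rng(0)
d = 5                                              # toy hidden size
H = rng.normal(size=(len(tokens), d))              # stand-in for BiRNN states
labels = ["self", "A0", "A1", "A2", "A0'", "A1'", "A2'"]
W = {label: 0.1 * rng.normal(size=(d, d)) for label in labels}
b = np.zeros(d)

H_out = semantic_gcn_layer(H, edges, W, b)
```

Words that take part in no predicate-argument edge (for example "his") are updated only through W_self, which is why such a layer is placed on top of a sequence encoder rather than used alone. Stacking two calls to `semantic_gcn_layer` would correspond to the two GCN layers drawn in the figure.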
  • 69. Our model. [Figure: "John gave his wonderful wife a nice present" is encoded by a BiRNN/CNN followed by two Semantic GCN layers (weights W_self, W_A0, W_A1, W_A2 and inverses W_A0', W_A1', W_A2'); an RNN decoder with an attention mechanism over the encoder states generates the translation, starting from <bos>.] [Marcheggiani et al., 2018]
  • 71. Experiments. Data: WMT'16 English-German dataset (~4.5 million sentence pairs); BLEU as evaluation measure. Model: hyperparameters tuned on News Commentary En-De (~226K sentence pairs); GRU as RNN. [Marcheggiani et al., 2018]
  • 72-78. Results: full WMT 2016 English-German, BLEU. BiRNN (Bastings et al., 2017): 23.3; BiRNN + Syntactic GCN (Bastings et al., 2017): 23.9; BiRNN + Semantic GCN: 24.5 (+1.2 BLEU over the BiRNN: semantics is helpful); BiRNN + Syntactic GCN + Semantic GCN: 24.9 (+1.6 BLEU over the BiRNN: syntax and semantics are complementary). [Marcheggiani et al., 2018]
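The "+1.2 BLEU" and "+1.6 BLEU" annotations are consistent with differences measured against the plain BiRNN baseline. The short check below recomputes them from the four scores shown on the chart; the dictionary keys are just labels chosen for this sketch.

```python
# BLEU scores as read from the results chart (full WMT 2016 English-German).
bleu = {
    "BiRNN": 23.3,
    "BiRNN + Syntactic GCN": 23.9,
    "BiRNN + Semantic GCN": 24.5,
    "BiRNN + Syntactic GCN + Semantic GCN": 24.9,
}

baseline = bleu["BiRNN"]
# Gain of each system over the baseline, rounded to one decimal like the slides.
gains = {name: round(score - baseline, 1) for name, score in bleu.items()}

for name, gain in gains.items():
    print(f"{name}: {gain:+.1f} BLEU")
```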
  • 79. Analysis. Source: "John [Seller] sold the car [Thing sold] to Mark [Buyer] ." BiRNN: "John verkaufte das Auto nach Mark ." Sem GCN: "John verkaufte das Auto an Mark ." The BiRNN mistranslates "to" as "nach" (directionality). Source: "The boy walking down the dusty road is drinking a beer" (role labels: Walker, AM-DIR, Drinker, Liquid). BiRNN: "Der Junge zu Fuß die staubige Straße ist ein Bier trinken ." Sem GCN: "Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier ." [Marcheggiani et al., 2018]
  • 82. Analysis. Source: "The boy sitting on a bench in the park plays chess ." (role labels: Thing sitting, Location, Player, Game, AM-LOC). BiRNN: "Der Junge auf einer Bank im Park spielt Schach ." Sem GCN: "Der Junge sitzt auf einer Bank im Park Schach ." Both translations are wrong, but the BiRNN's is grammatically correct. [Marcheggiani et al., 2018]
  • 85. Conclusion. GCNs for encoding linguistic structures into neural networks (semantics, coreference, discourse); fast; cheap. State-of-the-art model for dependency-based SRL. First to exploit semantics in NMT.
  • 90. Roadmap: including structured bias into neural NLP models. Low-resource setting. Long-range dependencies (document level, cross-document level). Integrating external knowledge, i.e., knowledge graphs. Thanks for your attention!