Model performance
Model training
We trained four models: three probabilistic models based on (1) verb lemmas, (2) syntactic frames, and (3) verb lemmas plus syntactic frames, and (4) a transformer model based on RoBERTa embeddings. The fourth model used RoBERTa embeddings to predict whether a word represented the head of a particular ASC. Training used the transformer named entity recognition model in spaCy (version 3.4). Models were developed on the training set, fine-tuned on the development set, and finally evaluated on the test set. Please refer to the paper for details on the other models (1-3).
Transformer model
| ASC | P | R | F1 |
|---|---|---|---|
| TRAN_S | 0.927 | 0.949 | 0.938 |
| ATTR | 0.989 | 0.975 | 0.982 |
| INTRAN_S | 0.884 | 0.837 | 0.859 |
| PASSIVE | 0.878 | 0.847 | 0.862 |
| INTRAN_MOT | 0.750 | 0.789 | 0.769 |
| TRAN_RES | 0.802 | 0.793 | 0.798 |
| CAUS_MOT | 0.731 | 0.754 | 0.742 |
| DITRAN | 0.878 | 0.935 | 0.905 |
| INTRAN_RES | 0.846 | 0.688 | 0.759 |
| Weighted average | 0.917 | 0.920 | 0.918 |
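For readers who want to check the F1 column, the sketch below recomputes F1 as the harmonic mean of precision (P) and recall (R) for each ASC in the transformer table. The function and variable names are illustrative and not part of the released code. Because P and R are reported to three decimals, a recomputed value may differ from the reported F1 in the third decimal place.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) copied from the transformer model table.
transformer_scores = {
    "TRAN_S": (0.927, 0.949, 0.938),
    "ATTR": (0.989, 0.975, 0.982),
    "INTRAN_S": (0.884, 0.837, 0.859),
    "PASSIVE": (0.878, 0.847, 0.862),
    "INTRAN_MOT": (0.750, 0.789, 0.769),
    "TRAN_RES": (0.802, 0.793, 0.798),
    "CAUS_MOT": (0.731, 0.754, 0.742),
    "DITRAN": (0.878, 0.935, 0.905),
    "INTRAN_RES": (0.846, 0.688, 0.759),
}

# Recompute F1 from the rounded P and R values.
recomputed = {asc: f1_score(p, r) for asc, (p, r, _) in transformer_scores.items()}

# Largest absolute gap between recomputed and reported F1 across all ASCs.
max_gap = max(
    abs(recomputed[asc] - reported)
    for asc, (_, _, reported) in transformer_scores.items()
)

for asc, value in recomputed.items():
    print(f"{asc:<12} recomputed F1 = {value:.3f}")
print(f"largest gap to reported F1 = {max_gap:.4f}")
```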
F1 Scores for all models
| ASC | Freq | lemma | syntactic frame | lemma+syntactic frame | transformer |
|---|---|---|---|---|---|
| TRAN_S | 1,253 | 0.821 | 0.824 | 0.897 | 0.938 |
| ATTR | 633 | 0.982 | 0.884 | 0.972 | 0.982 |
| INTRAN_S | 265 | 0.373 | 0.617 | 0.713 | 0.859 |
| PASSIVE | 170 | 0.283 | 0.799 | 0.809 | 0.862 |
| INTRAN_MOT | 95 | 0.522 | 0.258 | 0.540 | 0.769 |
| TRAN_RES | 92 | 0.397 | 0.723 | 0.756 | 0.798 |
| CAUS_MOT | 65 | 0.301 | 0.524 | 0.557 | 0.742 |
| DITRAN | 46 | 0.536 | 0.747 | 0.825 | 0.905 |
| INTRAN_RES | 16 | 0.519 | 0.105 | 0.640 | 0.759 |
| Weighted average | | 0.735 | 0.779 | 0.862 | 0.918 |
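The "Weighted average" row can be approximated with a short sketch. It assumes that each ASC's F1 score is weighted by its value in the Freq column. That weighting scheme is an assumption made for illustration and is not stated explicitly above. The helper name `weighted_average` is likewise hypothetical.

```python
def weighted_average(scores, weights):
    """Return the average of `scores`, weighting each score by its entry in `weights`."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(s * w for s, w in zip(scores, weights)) / total


# Freq column, in table order (TRAN_S ... INTRAN_RES).
# Assumption: these frequencies are the weights behind the reported averages.
freq = [1253, 633, 265, 170, 95, 92, 65, 46, 16]

# F1 columns for two of the four models, in the same row order.
f1_lemma = [0.821, 0.982, 0.373, 0.283, 0.522, 0.397, 0.301, 0.536, 0.519]
f1_transformer = [0.938, 0.982, 0.859, 0.862, 0.769, 0.798, 0.742, 0.905, 0.759]

avg_lemma = weighted_average(f1_lemma, freq)
avg_transformer = weighted_average(f1_transformer, freq)

print(f"lemma weighted F1:       {avg_lemma:.3f}")
print(f"transformer weighted F1: {avg_transformer:.3f}")
```

Under this assumption, both recomputed averages land within rounding distance of the reported values. Frequency weighting lets the high-frequency ASCs (TRAN_S and ATTR) dominate the summary, so the per-ASC rows remain the better guide to performance on rare constructions such as INTRAN_RES.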