Model performance

Model training

We trained three probabilistic models, based on (1) verb lemmas, (2) syntactic frames, and (3) verb lemmas plus syntactic frames, and (4) a transformer model based on RoBERTa embeddings. The transformer model used RoBERTa embeddings to predict whether a word represented the head of a particular argument structure construction (ASC). Training used the transformer-based named entity recognition pipeline in spaCy (version 3.4). Models were developed on the training set, tuned on the development set, and finally evaluated on the test set. For details on the probabilistic models (1-3), please refer to the paper.

Transformer model

ASC	Precision	Recall	F1
TRAN_S	0.927	0.949	0.938
ATTR	0.989	0.975	0.982
INTRAN_S	0.884	0.837	0.859
PASSIVE	0.878	0.847	0.862
INTRAN_MOT	0.750	0.789	0.769
TRAN_RES	0.802	0.793	0.798
CAUS_MOT	0.731	0.754	0.742
DITRAN	0.878	0.935	0.905
INTRAN_RES	0.846	0.688	0.759
Weighted average	0.917	0.920	0.918
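
The F1 column can be sanity-checked against the precision and recall columns, since F1 is their harmonic mean. The following is a minimal Python sketch with values copied from the table above; the names `scores`, `f1`, and `max_gap` are illustrative and not part of the released code. Differences in the third decimal place are expected, because the published precision and recall values are themselves rounded.

```python
# Precision, recall, and reported F1 per ASC, copied from the table above.
scores = {
    "TRAN_S": (0.927, 0.949, 0.938),
    "ATTR": (0.989, 0.975, 0.982),
    "INTRAN_S": (0.884, 0.837, 0.859),
    "PASSIVE": (0.878, 0.847, 0.862),
    "INTRAN_MOT": (0.750, 0.789, 0.769),
    "TRAN_RES": (0.802, 0.793, 0.798),
    "CAUS_MOT": (0.731, 0.754, 0.742),
    "DITRAN": (0.878, 0.935, 0.905),
    "INTRAN_RES": (0.846, 0.688, 0.759),
}


def f1(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Recompute F1 from the rounded precision and recall values.
recomputed = {asc: f1(p, r) for asc, (p, r, _) in scores.items()}

# Largest absolute difference between reported and recomputed F1.
max_gap = max(
    abs(recomputed[asc] - reported) for asc, (_, _, reported) in scores.items()
)

for asc, (_, _, reported) in scores.items():
    print(f"{asc:<11} reported={reported:.3f} recomputed={recomputed[asc]:.3f}")
print(f"largest gap: {max_gap:.4f}")
```

A gap well below 0.002 for every row is consistent with the F1 values having been computed from unrounded precision and recall.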

F1 Scores for all models

ASC	Freq	lemma	syntactic frame	lemma+syntactic frame	transformer
TRAN_S	1,253	0.821	0.824	0.897	0.938
ATTR	633	0.982	0.884	0.972	0.982
INTRAN_S	265	0.373	0.617	0.713	0.859
PASSIVE	170	0.283	0.799	0.809	0.862
INTRAN_MOT	95	0.522	0.258	0.540	0.769
TRAN_RES	92	0.397	0.723	0.756	0.798
CAUS_MOT	65	0.301	0.524	0.557	0.742
DITRAN	46	0.536	0.747	0.825	0.905
INTRAN_RES	16	0.519	0.105	0.640	0.759
Weighted average		0.735	0.779	0.862	0.918
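
The weighted-average row can be reproduced from the per-ASC rows. The sketch below assumes that the Freq column is the per-ASC count used as the weight (the table does not say so explicitly); the names `ROWS`, `MODELS`, and `weighted_f1` are illustrative.

```python
# Frequency and F1 per ASC, copied from the table above.
# F1 column order: lemma, syntactic frame, lemma+syntactic frame, transformer.
MODELS = ("lemma", "syntactic frame", "lemma+syntactic frame", "transformer")
ROWS = {
    "TRAN_S": (1253, (0.821, 0.824, 0.897, 0.938)),
    "ATTR": (633, (0.982, 0.884, 0.972, 0.982)),
    "INTRAN_S": (265, (0.373, 0.617, 0.713, 0.859)),
    "PASSIVE": (170, (0.283, 0.799, 0.809, 0.862)),
    "INTRAN_MOT": (95, (0.522, 0.258, 0.540, 0.769)),
    "TRAN_RES": (92, (0.397, 0.723, 0.756, 0.798)),
    "CAUS_MOT": (65, (0.301, 0.524, 0.557, 0.742)),
    "DITRAN": (46, (0.536, 0.747, 0.825, 0.905)),
    "INTRAN_RES": (16, (0.519, 0.105, 0.640, 0.759)),
}


def weighted_f1(rows: dict, column: int) -> float:
    """Average one F1 column, weighting each ASC by its frequency."""
    total = sum(freq for freq, _ in rows.values())
    return sum(freq * f1s[column] for freq, f1s in rows.values()) / total


# Assumption: Freq is the weight behind the "Weighted average" row.
weighted = {name: weighted_f1(ROWS, i) for i, name in enumerate(MODELS)}

for name, value in weighted.items():
    print(f"{name:<22} {value:.3f}")
```

Under that assumption, the four results round to the values in the weighted-average row. Frequency weighting also explains why the averages stay high despite weaker scores on rare constructions such as INTRAN_RES: TRAN_S and ATTR together account for roughly 70% of the instances.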