Model performance

Model training

We trained four models: three probabilistic models based on (1) verb lemmas, (2) syntactic frames, and (3) verb lemmas plus syntactic frames, and (4) a transformer model based on RoBERTa embeddings. The transformer model used RoBERTa embeddings to predict whether a word was the head of a particular argument structure construction (ASC). Models were trained using the transformer named entity recognition model in spaCy (version 3.4). Models were developed on the training set, tuned on the development set, and finally evaluated on the test set. Please refer to the paper for details on models 1-3.

Transformer model

| ASC | P | R | F1 |
|---|---|---|---|
| TRAN_S | 0.927 | 0.949 | 0.938 |
| ATTR | 0.989 | 0.975 | 0.982 |
| INTRAN_S | 0.884 | 0.837 | 0.859 |
| PASSIVE | 0.878 | 0.847 | 0.862 |
| INTRAN_MOT | 0.750 | 0.789 | 0.769 |
| TRAN_RES | 0.802 | 0.793 | 0.798 |
| CAUS_MOT | 0.731 | 0.754 | 0.742 |
| DITRAN | 0.878 | 0.935 | 0.905 |
| INTRAN_RES | 0.846 | 0.688 | 0.759 |
| Weighted average | 0.917 | 0.920 | 0.918 |
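
The F1 values above can be related to the precision (P) and recall (R) values with the standard harmonic-mean formula. The sketch below is illustrative only: the function and variable names are hypothetical, and it recomputes F1 from the rounded P and R values reported above, so a recomputed score may differ from the reported one in the third decimal place.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the transformer results above.
reported = {
    "TRAN_S": (0.927, 0.949, 0.938),
    "ATTR": (0.989, 0.975, 0.982),
    "INTRAN_S": (0.884, 0.837, 0.859),
    "PASSIVE": (0.878, 0.847, 0.862),
    "INTRAN_MOT": (0.750, 0.789, 0.769),
    "TRAN_RES": (0.802, 0.793, 0.798),
    "CAUS_MOT": (0.731, 0.754, 0.742),
    "DITRAN": (0.878, 0.935, 0.905),
    "INTRAN_RES": (0.846, 0.688, 0.759),
}

# Recompute F1 for each ASC from its rounded precision and recall.
recomputed = {asc: f1_score(p, r) for asc, (p, r, _) in reported.items()}

for asc, (_, _, f1) in reported.items():
    print(f"{asc:<11} reported={f1:.3f} recomputed={recomputed[asc]:.3f}")
```

Small third-decimal differences (for example on INTRAN_S) are expected here, because the reported scores were presumably computed from unrounded precision and recall.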

F1 Scores for all models

| ASC | Freq | lemma | syntactic frame | lemma+syntactic frame | transformer |
|---|---|---|---|---|---|
| TRAN_S | 1,253 | 0.821 | 0.824 | 0.897 | 0.938 |
| ATTR | 633 | 0.982 | 0.884 | 0.972 | 0.982 |
| INTRAN_S | 265 | 0.373 | 0.617 | 0.713 | 0.859 |
| PASSIVE | 170 | 0.283 | 0.799 | 0.809 | 0.862 |
| INTRAN_MOT | 95 | 0.522 | 0.258 | 0.540 | 0.769 |
| TRAN_RES | 92 | 0.397 | 0.723 | 0.756 | 0.798 |
| CAUS_MOT | 65 | 0.301 | 0.524 | 0.557 | 0.742 |
| DITRAN | 46 | 0.536 | 0.747 | 0.825 | 0.905 |
| INTRAN_RES | 16 | 0.519 | 0.105 | 0.640 | 0.759 |
| Weighted average | | 0.735 | 0.779 | 0.862 | 0.918 |
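
The weighted averages appear to be frequency-weighted means of the per-ASC F1 scores. The sketch below works under that assumption (that the weights are the Freq counts); the helper name `weighted_average` is hypothetical. It recomputes the average for the lemma and transformer columns from the values reported above.

```python
# (Freq, lemma F1, transformer F1) copied from the F1 results above.
rows = {
    "TRAN_S": (1253, 0.821, 0.938),
    "ATTR": (633, 0.982, 0.982),
    "INTRAN_S": (265, 0.373, 0.859),
    "PASSIVE": (170, 0.283, 0.862),
    "INTRAN_MOT": (95, 0.522, 0.769),
    "TRAN_RES": (92, 0.397, 0.798),
    "CAUS_MOT": (65, 0.301, 0.742),
    "DITRAN": (46, 0.536, 0.905),
    "INTRAN_RES": (16, 0.519, 0.759),
}


def weighted_average(pairs):
    """Mean of the values, each weighted by its frequency.

    Assumption: the weights are the Freq counts of each ASC.
    """
    pairs = list(pairs)
    total = sum(weight for weight, _ in pairs)
    return sum(weight * value for weight, value in pairs) / total


lemma_avg = weighted_average((freq, lemma) for freq, lemma, _ in rows.values())
transformer_avg = weighted_average((freq, tr) for freq, _, tr in rows.values())

print(f"lemma:       {lemma_avg:.3f}")        # 0.735
print(f"transformer: {transformer_avg:.3f}")  # 0.918
```

Both recomputed values match the reported weighted averages, which is consistent with frequency weighting. It also explains why the average sits close to the TRAN_S and ATTR scores: those two ASCs account for 1,886 of the 2,635 instances.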