Dependency Parsing

This document provides examples of training, evaluating, and predicting with dependency parsers. All data files passed as input arguments must be in CoNLL format.
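To make the input format concrete, here is a minimal sketch of a reader for such files. It assumes the treebanks follow the 10-column, tab-separated CoNLL-U layout (suggested by the `.conllu` extension in the commands of this document); `read_conllu` and `SAMPLE` are hypothetical names used only for illustration and are not part of the repository.

```python
from typing import Dict, Iterator, List

# The ten tab-separated columns of a CoNLL-U token line.
FIELDS = ("id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc")


def read_conllu(text: str) -> Iterator[List[Dict[str, str]]]:
    """Yield each sentence as a list of token dicts (illustrative helper)."""
    sentence: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():            # a blank line closes a sentence
            if sentence:
                yield sentence
                sentence = []
        elif line.startswith("#"):      # comment / metadata line
            continue
        else:
            cols = line.split("\t")
            # Skip multiword-token ranges ("1-2") and empty nodes ("1.1").
            if not cols[0].isdigit():
                continue
            sentence.append(dict(zip(FIELDS, cols)))
    if sentence:                        # file may lack a trailing blank line
        yield sentence


# A hand-written three-token sentence, not taken from any treebank.
SAMPLE = (
    "# text = She reads books\n"
    "1\tShe\tshe\tPRON\tPRP\t_\t2\tnsubj\t_\t_\n"
    "2\treads\tread\tVERB\tVBZ\t_\t0\troot\t_\t_\n"
    "3\tbooks\tbook\tNOUN\tNNS\t_\t2\tobj\t_\t_\n"
    "\n"
)

sentences = list(read_conllu(SAMPLE))
heads = [int(tok["head"]) for tok in sentences[0]]
print(heads)
```

The `head` and `deprel` columns carry the dependency tree that the parsers in this document learn to predict.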

| Identifier | Parser | Paper | Arguments |
|:-----------|:-------|:------|:----------|
| dep-idx | Absolute and relative indexing | Strzyz et al. (2019) | rel |
| dep-pos | PoS-tag relative indexing | Strzyz et al. (2019) | gold |
| dep-bracket | Bracketing encoding ($k$-planar) | Strzyz et al. (2020) | k |
| dep-bit4 | $4$-bit projective encoding | Gómez-Rodríguez et al. (2023) | proj |
| dep-bit7 | $7$-bit $2$-planar encoding | Gómez-Rodríguez et al. (2023) | |
| dep-eager | Arc-Eager system | Nivre and Fernández-González (2002) | stack, buffer, proj |
| dep-biaffine | Biaffine dependency parser | Dozat et al. (2016) | |
| dep-hexa | Hexa-Tagging | Amini et al. (2023) | proj |
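As a rough illustration of what the absolute and relative indexing in the dep-idx row refers to, the sketch below turns a list of head positions into one label per word in both styles. It is a simplified reading of the idea, not the repository's implementation: the function names are hypothetical, labels are kept as plain integers, and the root (head 0) is treated like any other head.

```python
from typing import List


def encode_absolute(heads: List[int]) -> List[int]:
    """Absolute indexing: the label of word i is the position of its head."""
    return list(heads)


def encode_relative(heads: List[int]) -> List[int]:
    """Relative indexing: the label of word i is the offset head - i.

    Positions are 1-based, as in CoNLL files, and head 0 denotes the root.
    """
    return [head - i for i, head in enumerate(heads, start=1)]


def decode_relative(labels: List[int]) -> List[int]:
    """Invert encode_relative: recover head positions from offsets."""
    return [i + offset for i, offset in enumerate(labels, start=1)]


# "She reads books": She <- reads (root) -> books
heads = [2, 0, 2]
absolute = encode_absolute(heads)
relative = encode_relative(heads)
print(absolute, relative)
```

Both encodings assign exactly one label per word, which is what lets parsing be cast as sequence labeling; decoding the relative labels recovers the original heads.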

Training

  • Absolute and relative indexing (Strzyz et al., 2019) with the encoder configuration in configs/xlnet.ini.
python3 run.py dep-idx -p results/dep-idx-xlnet -c configs/xlnet.ini \
    train --train treebanks/english-ewt/train.conllu \
    --dev treebanks/english-ewt/dev.conllu \
    --test treebanks/english-ewt/test.conllu --num-workers 20
  • PoS-based relative indexing (Strzyz et al., 2019) with BiLSTMs as encoder. Add the --gold argument as <specific-args> to use the gold PoS annotations instead of predicting the PoS tags.
python3 run.py dep-pos -p results/dep-pos-bilstm -c configs/bilstm.ini \
    train --train treebanks/english-ewt/train.conllu \
    --dev treebanks/english-ewt/dev.conllu \
    --test treebanks/english-ewt/test.conllu --num-workers 20
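  • $2$-planar bracketing encoding (Strzyz et al., 2020), selected with -k 2, with the encoder configuration in configs/xlm.ini.
python3 run.py dep-bracket -k 2 -p results/dep-bracket-xlm -c configs/xlm.ini \
    train --train treebanks/english-ewt/train.conllu \
    --dev treebanks/english-ewt/dev.conllu \
    --test treebanks/english-ewt/test.conllu --num-workers 20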
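  • Arc-Eager system (Nivre and Fernández-González, 2002) with --stack 1 and --buffer 2, with the encoder configuration in configs/xlm.ini.
python3 run.py dep-eager --stack 1 --buffer 2 -p results/dep-eager-xlm -c configs/xlm.ini \
    train --train treebanks/english-ewt/train.conllu \
    --dev treebanks/english-ewt/dev.conllu \
    --test treebanks/english-ewt/test.conllu --num-workers 20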
  • Hexa-Tagging (Amini et al., 2023) with --proj head, with the encoder configuration in configs/xlnet.ini.
python3 run.py dep-hexa -p results/dep-hexa-xlnet -c configs/xlnet.ini --proj head \
    train --train treebanks/english-ewt/train.conllu \
    --dev treebanks/english-ewt/dev.conllu \
    --test treebanks/english-ewt/test.conllu --num-workers 20
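The training commands share one shape: parser identifier, parser-specific arguments, output path, configuration file, then the train subcommand with the three data splits. The sketch below assembles that argument list in Python so several parsers can be launched from one script. `train_command` is a hypothetical helper, not part of the repository; it only reuses flags that appear in the commands of this section and places the parser-specific arguments right after the identifier, as in the dep-bracket and dep-eager examples.

```python
import shlex
from typing import List, Sequence


def train_command(
    parser: str,
    path: str,
    conf: str,
    treebank: str,
    specific_args: Sequence[str] = (),
    num_workers: int = 20,
) -> List[str]:
    """Build the argv list for one training run (illustrative helper)."""
    return [
        "python3", "run.py", parser, *specific_args,
        "-p", path, "-c", conf,
        "train",
        "--train", f"{treebank}/train.conllu",
        "--dev", f"{treebank}/dev.conllu",
        "--test", f"{treebank}/test.conllu",
        "--num-workers", str(num_workers),
    ]


cmd = train_command(
    "dep-bracket",
    "results/dep-bracket-xlm",
    "configs/xlm.ini",
    "treebanks/english-ewt",
    specific_args=["-k", "2"],
)
# Print the command as a single shell-quoted line for inspection.
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`; building a list rather than a string avoids shell-quoting problems with paths that contain spaces.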

Evaluation

Now evaluate the trained parser at results/dep-bracket-xlm/parser.pt on the same test file:

python3 run.py dep-bracket -p results/dep-bracket-xlm/parser.pt eval treebanks/english-ewt/test.conllu --batch-size 50
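This document does not state which metrics the eval command reports. Dependency parsers are usually scored with the unlabeled and labeled attachment scores (UAS and LAS), so the sketch below shows how those two numbers are computed, under that assumption. `attachment_scores` is a hypothetical helper, and the gold and predicted arcs are made-up toy values.

```python
from typing import List, Tuple

Arc = Tuple[int, str]  # (head position, dependency relation) for one word


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) over aligned gold and predicted arcs."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of words")
    if not gold:
        raise ValueError("cannot score an empty sequence")
    # UAS counts words whose head is correct; LAS also requires the relation.
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    las_hits = sum(g == p for g, p in zip(gold, pred))
    return uas_hits / len(gold), las_hits / len(gold)


# Toy example: all heads are right, one relation label is wrong.
gold = [(2, "nsubj"), (0, "root"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod")]
uas, las = attachment_scores(gold, pred)
print(f"UAS={uas:.3f} LAS={las:.3f}")
```

LAS can never exceed UAS, since every word counted for LAS also has the correct head.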

Prediction

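Predict with the trained parser at results/dep-hexa-xlnet/parser.pt, taking treebanks/english-ewt/test.conllu as input and results/dep-hexa-xlnet/pred.conllu as the output file: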

python3 run.py dep-hexa -p results/dep-hexa-xlnet/parser.pt predict \
     treebanks/english-ewt/test.conllu results/dep-hexa-xlnet/pred.conllu --batch-size 50