Supporting Universal Dependencies in Tree Editor TrEd

E. Choroba

Charles University in Prague

Faculty of Mathematics and Physics

Institute of Formal and Applied Linguistics

choroba@matfyz.cz

CHOROBA at MetaCPAN

Outline

  1. Natural Language Processing (NLP)
  2. What are Universal Dependencies?
  3. What is TrEd?
  4. How do they play together?

Natural Language Processing

The Difference between a Corpus and a Treebank

Corpus
  • Collection of digital(-ised) texts
  • Often annotated: e.g. PoS tagging

Treebank
  • Syntactically annotated corpus
  • Phrase or dependency structures

Phrase versus Dependency Trees

Phrase

Dependency
  • Used in UD.

Universal Dependencies

UD—The Community

UD—Morphology

UD—Syntax

UD—Syntax (Example)

The cat could have chased all the dogs down the street.

UD—Enhanced Dependencies

UD—Enhanced Dependencies (2)

UD—Enhanced Dependencies (Example 1)

Is that Microwave that you gave Dan really expensive?

UD—Enhanced Dependencies (Example 2)

At least you get to go to Florida in JANUARY.

UD—Data Format

Tab separated values

# sent_id = Example-2
# text = Is that Microwave that you gave Dan really expensive?
# ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
1 Is be AUX VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 9 cop 9:cop _
2 that that DET DT Number=Sing|PronType=Dem 3 det 3:det _
3 Microwave microwave NOUN NN Number=Sing 9 nsubj 6:obj|9:nsubj _
4 that that PRON WDT PronType=Rel 6 obj 3:ref|6:obj _
5 you you PRON PRP Case=Nom|Person=2|PronType=Prs 6 nsubj 6:nsubj _
6 gave give VERB VBD Mood=Ind|Number=Sing|Person=2|Tense=Past|VerbForm=Fin 3 acl:relcl 3:acl:relcl Cxn=rc-that-obj
7 Dan Dan PROPN NNP Number=Sing 6 iobj 6:iobj _
8 really really ADV RB _ 9 advmod 9:advmod _
9 expensive expensive ADJ JJ Degree=Pos 0 root 0:root SpaceAfter=No
10 ? ? PUNCT . _ 9 punct 9:punct _
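
A minimal Python sketch of reading this format, assuming the ten columns are separated by tab characters as the slide title says. The names FIELDS, parse_conllu, parse_deps and SAMPLE are illustrative only and do not come from any UD or TrEd tool; the sketch ignores multiword tokens and other details of the full format.

```python
from typing import Dict, List, Tuple

# Column names in the order given by the header comment on the slide.
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]


def parse_conllu(text: str) -> List[Dict[str, str]]:
    """Parse the token lines of one sentence into dicts (sketch only)."""
    tokens = []
    for line in text.splitlines():
        # Skip blank lines and comment lines such as "# sent_id = ...".
        if not line.strip() or line.startswith("#"):
            continue
        values = line.split("\t")
        if len(values) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} columns: {line!r}")
        tokens.append(dict(zip(FIELDS, values)))
    return tokens


def parse_deps(deps: str) -> List[Tuple[str, str]]:
    """Split an enhanced DEPS value like '6:obj|9:nsubj' into pairs."""
    if deps == "_":
        return []
    # Split on the first colon only, so 'acl:relcl' stays intact.
    return [tuple(item.split(":", 1)) for item in deps.split("|")]


# Two rows of the example above, rebuilt with explicit tabs.
SAMPLE = "\n".join([
    "# sent_id = Example-2",
    "\t".join(["3", "Microwave", "microwave", "NOUN", "NN",
               "Number=Sing", "9", "nsubj", "6:obj|9:nsubj", "_"]),
    "\t".join(["4", "that", "that", "PRON", "WDT",
               "PronType=Rel", "6", "obj", "3:ref|6:obj", "_"]),
])

tokens = parse_conllu(SAMPLE)
for tok in tokens:
    print(tok["form"], tok["deprel"], parse_deps(tok["deps"]))
```

The DEPS column is what carries the enhanced dependencies: "Microwave" has one basic head (9, nsubj) but two enhanced relations, one of them to the verb of the relative clause.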

Tree Editor TrEd

TrEd—Screenshot

TrEd—Batch Version

TrEd—Data Format

Prague Markup Language

Searching

PML Tree Query

Searching (2)

PML-TQ: The Query Language

Searching (3)

PML-TQ: The Client Interfaces

  1. GUI client with an interactive query builder in TrEd
  2. CLI
  3. Web interface with some assistance for query building

Searching (4)

PML-TQ: The Engines

  1. Perl
  2. SQL

Searching (5)

PML-TQ: An example

ud.node [
    deprel = "nsubj",
    parent ud.node $p := [
        upostag != "VERB"
    ]
];
>> give $p.upostag
>> for $1
   give $1, count($1)
   sort by $2

PUNCT 2
X 9
PART 11
INTJ 26
DET 54
ADP 130
SYM 131
NUM 299
PROPN 605
ADV 735
AUX 929
PRON 1264
NOUN 5622
ADJ 7169
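
For readers unfamiliar with PML-TQ, the same question (which parts of speech other than VERB govern an nsubj?) can be approximated in plain Python. This is a sketch only and not how either PML-TQ engine evaluates queries. TOKENS holds the ID, UPOS, HEAD and DEPREL columns of the ten-token example sentence from the data format slide; the function name is illustrative.

```python
from collections import Counter

# (id, upos, head, deprel) for "Is that Microwave that you gave Dan
# really expensive?", copied from the data format example.
TOKENS = [
    (1, "AUX", 9, "cop"),
    (2, "DET", 3, "det"),
    (3, "NOUN", 9, "nsubj"),
    (4, "PRON", 6, "obj"),
    (5, "PRON", 6, "nsubj"),
    (6, "VERB", 3, "acl:relcl"),
    (7, "PROPN", 6, "iobj"),
    (8, "ADV", 9, "advmod"),
    (9, "ADJ", 0, "root"),
    (10, "PUNCT", 9, "punct"),
]


def nonverbal_subject_parents(tokens):
    """Count the UPOS of non-VERB parents of nsubj nodes, ascending."""
    upos_by_id = {tok_id: upos for tok_id, upos, _, _ in tokens}
    counts = Counter()
    for _, _, head, deprel in tokens:
        # Only subjects with a real parent (head 0 is the artificial root).
        if deprel != "nsubj" or head == 0:
            continue
        parent_upos = upos_by_id[head]
        if parent_upos != "VERB":
            counts[parent_upos] += 1
    # Mirrors "sort by $2": order rows by the count column.
    return sorted(counts.items(), key=lambda row: row[1])


result = nonverbal_subject_parents(TOKENS)
for upos, count in result:
    print(upos, count)
```

On this single sentence the only match is the subject of the adjective "expensive", so the result is one row: ADJ with count 1. The subject "you" is skipped because its parent "gave" is a VERB.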

TrEd Extensions

TrEd plugins

UD in TrEd

UD in TrEd: Stylesheet Example

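Вперше за останні півроку працюю з 11-ї і встаю, коли на вулиці вже сонце.
(English: For the first time in the last six months, I work from 11 o'clock and get up when the sun is already out.)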

Conclusion

UD in TrEd

Questions

Thank you

https://e-choroba.eu/24-tred