Towards a historical treebank of Middle and Modern Welsh
Keywords: Middle Welsh language, Historical corpora, Historical syntax
Abstract: This article examines various issues involved in constructing a parsed Penn-style representative historical corpus of Middle and Modern Welsh. Specifically, it focuses on what structures to adopt for constituency-based structural descriptions in three case studies: (i) whether to adopt relatively more or less hierarchical structures at the phrasal level and above; (ii) how to deal with complex prepositional phrases, typically containing a grammaticalizing or grammaticalized noun as one of their elements; and (iii) how to deal with coordination of main clauses and omission of elements shared between clauses. In each case, we see how conventions need to be adopted that facilitate maximal ease of searching for potential users of the corpus; that are robust across many centuries of language change; and that permit efficient and consistent parsing by a team of annotators.
Annotating Historical Corpora special issue
Articles appearing in Journal of Historical Syntax are published under a Creative Commons Attribution License. Authors retain copyright.