Building the Gold Standard for the surface syntax of Basque

Authors
Aduriz, I.; Aranzabe, M. J.; Arriola, J. M.; Díaz de Ilarraza, A.; Gonzalez-Dios, I.; Urizar, R.
Year
2017
Published in
Procesamiento del lenguaje natural. Sociedad Española para el Procesamiento del Lenguaje Natural (SEPLN). 58, pp.125-132
ISSN
1135-5948

In this paper, we present the construction of SF-EPEC, a syntactically annotated 300,000-word corpus that aims to be a Gold Standard for the surface syntactic processing of Basque. First, we describe the tagset designed for this purpose; because Basque is an agglutinative language, complex syntactic tags were sometimes needed. We then describe the different phases in the construction of SF-EPEC.