AMAD

"Archivum Medii Aevi Digitale - Interdisciplinary Open Access Subject Repository and Scholarly Blog for Medieval Studies"
Full metadata record
DC Element | Value | Language
Contributor | The Pennsylvania State University CiteSeerX Archives | -
Author | Marco Passarotti | -
Date | 2010 | -
Source | http://www.lrec-conf.org/proceedings/lrec2010/pdf/178_Paper.pdf | -
Source | http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.682.1382 | -
URI | https://www.amad.org/jspui/handle/123456789/68615 | -
Description | The creation of language resources for less-resourced languages like the historical ones benefits from the exploitation of language-independent tools and methods developed over the years by many projects for modern languages. Along these lines, a number of treebanks for historical languages started recently to arise, including treebanks for Latin. Among the Latin treebanks, the Index Thomisticus Treebank is a 68,000 token dependency treebank based on the Index Thomisticus by Roberto Busa SJ, which contains the opera omnia of Thomas Aquinas (118 texts) as well as 61 texts by other authors related to Thomas, for a total of approximately 11 million tokens. In this paper, we describe a number of modifications that we applied to the dependency parser DeSR, in order to improve the parsing accuracy rates on the Index Thomisticus Treebank. First, we adapted the parser to the specific processing of Medieval Latin, defining an ad-hoc configuration of its features. Then, in order to improve the accuracy rates provided by DeSR, we applied a revision parsing method and we combined the outputs produced by different algorithms. This allowed us to improve accuracy rates substantially, reaching results that are well beyond the state of the art of parsing for Latin. | -
Format | application/pdf | -
Language | eng | -
Rights | Metadata may be used without restrictions as long as the oai identifier remains attached to it. | -
Dewey Decimal Classification | 940 | -
Title | Improvements in Parsing the Index Thomisticus Treebank: Revision, Combination and a Feature Model for Medieval Latin | -
Type | text | -
AMAD ID | 568425 | -
Year | 2010 | -
Open Access | 1 | -
Appears in collections:
BASE (Bielefeld Academic Search Engine)
General history of Europe


Files in this resource:
There are no files associated with this resource.


All resources in this repository are protected by copyright unless otherwise indicated.