The Szeged Treebank

Dóra Csendes, J. Csirik, T. Gyimóthy, András Kocsor

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

27 Citations (Scopus)

Abstract

The major aim of the Szeged Treebank project was to create a high-quality database of syntactic structures for Hungarian that can serve as a gold standard to further research in linguistics and computational language processing. The treebank currently contains full syntactic parsing of about 82,000 sentences, which is the result of accurate manual annotation. The current paper describes the linguistic theory as well as the actual method used in the annotation process. In addition, the application of the treebank for the training of automated syntactic parsers is presented.

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 123-131
Number of pages: 9
Volume: 3658 LNAI
Publication status: Published - 2005
Event: 8th International Conference on Text, Speech and Dialogue, TSD 2005 - Karlovy Vary, Czech Republic
Duration: Sep 12 2005 → Sep 15 2005

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3658 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 8th International Conference on Text, Speech and Dialogue, TSD 2005
Country: Czech Republic
City: Karlovy Vary
Period: 9/12/05 → 9/15/05

Fingerprint

Syntactics
Linguistics
Annotation
Language
Parsing
Databases
Research
Processing
Syntax

ASJC Scopus subject areas

  • Computer Science (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Theoretical Computer Science

Cite this

Csendes, D., Csirik, J., Gyimóthy, T., & Kocsor, A. (2005). The Szeged Treebank. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3658 LNAI, pp. 123-131). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 3658 LNAI).

@inproceedings{8b50827f05704b94a3caa61f78ce77a6,
title = "The Szeged Treebank",
abstract = "The major aim of the Szeged Treebank project was to create a high-quality database of syntactic structures for Hungarian that can serve as a gold standard to further research in linguistics and computational language processing. The treebank currently contains full syntactic parsing of about 82,000 sentences, which is the result of accurate manual annotation. The current paper describes the linguistic theory as well as the actual method used in the annotation process. In addition, the application of the treebank for the training of automated syntactic parsers is presented.",
author = "D{\'o}ra Csendes and J. Csirik and T. Gyim{\'o}thy and Andr{\'a}s Kocsor",
year = "2005",
language = "English",
isbn = "3540287892",
volume = "3658 LNAI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "123--131",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
}

TY  - GEN
T1  - The Szeged Treebank
AU  - Csendes, Dóra
AU  - Csirik, J.
AU  - Gyimóthy, T.
AU  - Kocsor, András
PY  - 2005
Y1  - 2005
N2  - The major aim of the Szeged Treebank project was to create a high-quality database of syntactic structures for Hungarian that can serve as a gold standard to further research in linguistics and computational language processing. The treebank currently contains full syntactic parsing of about 82,000 sentences, which is the result of accurate manual annotation. The current paper describes the linguistic theory as well as the actual method used in the annotation process. In addition, the application of the treebank for the training of automated syntactic parsers is presented.
AB  - The major aim of the Szeged Treebank project was to create a high-quality database of syntactic structures for Hungarian that can serve as a gold standard to further research in linguistics and computational language processing. The treebank currently contains full syntactic parsing of about 82,000 sentences, which is the result of accurate manual annotation. The current paper describes the linguistic theory as well as the actual method used in the annotation process. In addition, the application of the treebank for the training of automated syntactic parsers is presented.
UR  - http://www.scopus.com/inward/record.url?scp=33646062289&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=33646062289&partnerID=8YFLogxK
M3  - Conference contribution
SN  - 3540287892
SN  - 9783540287896
VL  - 3658 LNAI
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 123
EP  - 131
BT  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER  -