Hierarchical text categorization using fuzzy relational thesaurus

D. Tikk, Jae Dong Yang, Sun Lee Bang

Research output: Contribution to journal › Article

10 Citations (Scopus)

Abstract

Text categorization is the task of assigning a text document to an appropriate category from a predefined set of categories. We present a new approach to text categorization by means of a Fuzzy Relational Thesaurus (FRT). An FRT is a multilevel category system that stores and maintains an adaptive local dictionary for each category. The goal of our approach is twofold: to develop a reliable text categorization method for a given subject domain, and to expand the initial FRT with automatically added terms, thereby obtaining an incrementally defined knowledge base of the domain. We implemented the categorization algorithm and compared it with several other hierarchical classifiers. Experimental results show that our algorithm outperforms its rivals on all document corpora investigated.
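The abstract does not specify the data structures or the scoring rule, so the following is only a minimal sketch of the general idea: a multilevel category tree in which each category keeps its own local dictionary, documents are routed top-down, and dictionaries grow as new terms are added. The names `Category`, `categorize`, and `expand`, the averaged-weight score, and the default weight of 0.1 are all illustrative assumptions, not the authors' algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """One node of a hypothetical multilevel category system."""
    name: str
    # Local dictionary: term -> relevance weight in [0, 1] (assumed representation).
    terms: dict[str, float] = field(default_factory=dict)
    children: list["Category"] = field(default_factory=list)

    def score(self, tokens: list[str]) -> float:
        """Average local-dictionary weight over the document's tokens (assumed rule)."""
        if not tokens:
            return 0.0
        return sum(self.terms.get(t, 0.0) for t in tokens) / len(tokens)


def categorize(root: Category, tokens: list[str]) -> list[str]:
    """Descend greedily from the root, following the best-scoring child."""
    path, node = [], root
    while node.children:
        best = max(node.children, key=lambda c: c.score(tokens))
        if best.score(tokens) == 0.0:
            break  # no child matches; stop at the current level
        path.append(best.name)
        node = best
    return path


def expand(category: Category, tokens: list[str], weight: float = 0.1) -> None:
    """Add a document's unseen terms to the category's local dictionary."""
    for t in tokens:
        category.terms.setdefault(t, weight)  # existing weights are left unchanged


# Toy hierarchy with hand-picked weights (illustrative data only).
physics = Category("physics", {"quantum": 0.9, "particle": 0.7})
biology = Category("biology", {"cell": 0.9, "gene": 0.8})
science = Category("science", {"experiment": 0.8, "theory": 0.6}, [physics, biology])
root = Category("root", children=[science, Category("sports", {"match": 0.9})])

doc = ["quantum", "experiment", "particle", "detector"]
path = categorize(root, doc)
expand(physics, doc)
```

In this toy run the document is routed to `science` and then `physics`, after which `detector` and `experiment` are added to the `physics` dictionary with the default weight. A real system would learn and adjust the weights from training documents rather than fix them by hand.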

Original language: English
Pages (from-to): 583-600
Number of pages: 18
Journal: Kybernetika
Volume: 39
Issue number: 5
Publication status: Published - 2003

Keywords

  • Hierarchical text categorization
  • Knowledge base management
  • Multi-level categorization
  • Text mining

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Human-Computer Interaction

Cite this

Tikk, D., Yang, J. D., & Bang, S. L. (2003). Hierarchical text categorization using fuzzy relational thesaurus. Kybernetika, 39(5), 583-600.

@article{a7b7942cb1da4c2abe33dae6fc8100a7,
title = "Hierarchical text categorization using fuzzy relational thesaurus",
abstract = "Text categorization is the task of assigning a text document to an appropriate category from a predefined set of categories. We present a new approach to text categorization by means of a Fuzzy Relational Thesaurus (FRT). An FRT is a multilevel category system that stores and maintains an adaptive local dictionary for each category. The goal of our approach is twofold: to develop a reliable text categorization method for a given subject domain, and to expand the initial FRT with automatically added terms, thereby obtaining an incrementally defined knowledge base of the domain. We implemented the categorization algorithm and compared it with several other hierarchical classifiers. Experimental results show that our algorithm outperforms its rivals on all document corpora investigated.",
keywords = "Hierarchical text categorization, Knowledge base management, Multi-level categorization, Text mining",
author = "D. Tikk and Yang, {Jae Dong} and Bang, {Sun Lee}",
year = "2003",
language = "English",
volume = "39",
pages = "583--600",
journal = "Kybernetika",
issn = "0023-5954",
publisher = "Academy of Sciences of the Czech Republic",
number = "5",

}

TY - JOUR
T1 - Hierarchical text categorization using fuzzy relational thesaurus
AU - Tikk, D.
AU - Yang, Jae Dong
AU - Bang, Sun Lee
PY - 2003
Y1 - 2003
AB - Text categorization is the task of assigning a text document to an appropriate category from a predefined set of categories. We present a new approach to text categorization by means of a Fuzzy Relational Thesaurus (FRT). An FRT is a multilevel category system that stores and maintains an adaptive local dictionary for each category. The goal of our approach is twofold: to develop a reliable text categorization method for a given subject domain, and to expand the initial FRT with automatically added terms, thereby obtaining an incrementally defined knowledge base of the domain. We implemented the categorization algorithm and compared it with several other hierarchical classifiers. Experimental results show that our algorithm outperforms its rivals on all document corpora investigated.
KW - Hierarchical text categorization
KW - Knowledge base management
KW - Multi-level categorization
KW - Text mining
UR - http://www.scopus.com/inward/record.url?scp=1642413309&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=1642413309&partnerID=8YFLogxK
M3 - Article
VL - 39
SP - 583
EP - 600
JO - Kybernetika
JF - Kybernetika
SN - 0023-5954
IS - 5
ER -