Hierarchical text categorization using fuzzy relational thesaurus

Domonkos Tikk, Jae Dong Yang, Sun Lee Bang

Research output: Contribution to journal › Article

10 Citations (Scopus)

Abstract

Text categorization is the task of assigning a text document to an appropriate category from a predefined set of categories. We present a new approach to text categorization by means of a Fuzzy Relational Thesaurus (FRT). An FRT is a multilevel category system that stores and maintains an adaptive local dictionary for each category. The goal of our approach is twofold: to develop a reliable text categorization method for a given subject domain, and to expand the initial FRT with automatically added terms, thereby obtaining an incrementally defined knowledge base of the domain. We implemented the categorization algorithm and compared it with some other hierarchical classifiers. Experimental results show that our algorithm outperforms its rivals on all document corpora investigated.

Original language: English
Pages (from-to): 583-600
Number of pages: 18
Journal: Kybernetika
Volume: 39
Issue number: 5
Publication status: Published - Dec 1 2003

Keywords

  • Hierarchical text categorization
  • Knowledge base management
  • Multi-level categorization
  • Text mining

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Theoretical Computer Science
  • Information Systems
  • Artificial Intelligence
  • Electrical and Electronic Engineering

Cite this

Tikk, D., Yang, J. D., & Bang, S. L. (2003). Hierarchical text categorization using fuzzy relational thesaurus. Kybernetika, 39(5), 583-600.