A document classification algorithm using the fuzzy set theory and hierarchical structure of document

Seok Woo Han, Hye Jue Eun, Yong Sung Kim, L. Kóczy

Research output: Contribution to journal (Article)

3 Citations (Scopus)

Abstract

At present, information retrieval systems rely on simple combinations of keyword and phrase search, using direct keyword matching to find the information users need. However, Web document retrieval systems return too many documents because of term ambiguity. It also often happens that words with several meanings occur in a document in a context rather different from the one the querying person expected. The user therefore needs extra time and effort to find more closely matching documents. To overcome these problems, in this paper we propose a content-based information retrieval system that connects documents according to the degree of their semantic link, expressed as a fuzzy value by a fuzzy function. We also propose an algorithm that produces a hierarchical structure using the degree of concepts and contents among documents. As a result, we are able to select and provide the documents that interest the user.

Original language: English
Pages (from-to): 122-133
Number of pages: 12
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3043
Publication status: Published - 2004

Fingerprint

Document Classification
Information retrieval systems
Fuzzy set theory
Classification Algorithm
Hierarchical Structure
Information Systems
Information Retrieval
Semantics
Document Retrieval
Fuzzy Function
Person
Express
Term

ASJC Scopus subject areas

  • Computer Science (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Theoretical Computer Science

Cite this

@article{344f6fcf6ec7465b9beaf326dce183e3,
title = "A document classification algorithm using the fuzzy set theory and hierarchical structure of document",
abstract = "At present, information retrieval systems rely on simple combinations of keyword and phrase search, using direct keyword matching to find the information users need. However, Web document retrieval systems return too many documents because of term ambiguity. It also often happens that words with several meanings occur in a document in a context rather different from the one the querying person expected. The user therefore needs extra time and effort to find more closely matching documents. To overcome these problems, in this paper we propose a content-based information retrieval system that connects documents according to the degree of their semantic link, expressed as a fuzzy value by a fuzzy function. We also propose an algorithm that produces a hierarchical structure using the degree of concepts and contents among documents. As a result, we are able to select and provide the documents that interest the user.",
author = "Han, {Seok Woo} and Eun, {Hye Jue} and Kim, {Yong Sung} and K{\'o}czy, L.",
year = "2004",
language = "English",
volume = "3043",
pages = "122--133",
journal = "Lecture Notes in Computer Science",
issn = "0302-9743",
publisher = "Springer Verlag",

}

TY - JOUR

T1 - A document classification algorithm using the fuzzy set theory and hierarchical structure of document

AU - Han, Seok Woo

AU - Eun, Hye Jue

AU - Kim, Yong Sung

AU - Kóczy, L.

PY - 2004

Y1 - 2004

N2 - At present, information retrieval systems rely on simple combinations of keyword and phrase search, using direct keyword matching to find the information users need. However, Web document retrieval systems return too many documents because of term ambiguity. It also often happens that words with several meanings occur in a document in a context rather different from the one the querying person expected. The user therefore needs extra time and effort to find more closely matching documents. To overcome these problems, in this paper we propose a content-based information retrieval system that connects documents according to the degree of their semantic link, expressed as a fuzzy value by a fuzzy function. We also propose an algorithm that produces a hierarchical structure using the degree of concepts and contents among documents. As a result, we are able to select and provide the documents that interest the user.

UR - http://www.scopus.com/inward/record.url?scp=35048874319&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=35048874319&partnerID=8YFLogxK

M3 - Article

VL - 3043

SP - 122

EP - 133

JO - Lecture Notes in Computer Science

JF - Lecture Notes in Computer Science

SN - 0302-9743

ER -