Bit-table based biclustering and frequent closed itemset mining in high-dimensional binary data

András Király, Attila Gyenesei, J. Abonyi

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

During the last decade, various algorithms have been developed and proposed for discovering overlapping clusters in high-dimensional data. The two most prominent application fields in this research, proposed independently, are frequent itemset mining (developed for market basket data) and biclustering (applied to gene expression data analysis). A common limitation of both methodologies is their limited applicability to very large binary data sets. In this paper, we propose a novel and efficient method to find both frequent closed itemsets and biclusters in high-dimensional binary data. The method is based on simple but very powerful matrix and vector multiplication approaches that ensure that all patterns can be discovered quickly. The proposed algorithm has been implemented in the commonly used MATLAB environment and is freely available for researchers.
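
As a rough illustration of the kind of matrix and vector products the abstract refers to, the Python sketch below computes the support and the closure of an itemset on a small binary table. It is not the authors' MATLAB implementation: the function names (`rows_containing`, `closure`) and the toy table are assumptions introduced here purely for illustration.

```python
# Toy binary table (hypothetical data): rows are transactions, columns are items.
TOY_TABLE = [
    [1, 1, 0, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]


def rows_containing(table, itemset):
    """Return a 0/1 indicator vector of the rows that contain every item.

    Computed as the element-wise product of the selected item columns.
    """
    indicator = [1] * len(table)
    for j in itemset:
        indicator = [r * row[j] for r, row in zip(indicator, table)]
    return indicator


def closure(table, itemset):
    """Return (closed itemset, support) for the given itemset.

    The vector-matrix product indicator^T * table counts, for each item,
    how many supporting rows contain it. Items whose count equals the
    support appear in every supporting row, so they belong to the closure.
    With zero support, every item trivially qualifies.
    """
    indicator = rows_containing(table, itemset)
    support = sum(indicator)
    n_items = len(table[0])
    counts = [
        sum(indicator[i] * table[i][j] for i in range(len(table)))
        for j in range(n_items)
    ]
    closed = [j for j, count in enumerate(counts) if count == support]
    return closed, support


if __name__ == "__main__":
    # Item 1 always co-occurs with item 0, so its closure is {0, 1}.
    print(closure(TOY_TABLE, [1]))
```

In this sketch, an itemset is closed exactly when `closure` returns it unchanged, and it is frequent when the returned support meets a chosen minimum threshold. The supporting rows together with the closed itemset form an all-ones submatrix, which is how the sketch connects closed itemsets to biclusters in binary data.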

Original language: English
Article number: 870406
Journal: The Scientific World Journal
Volume: 2014
DOI: 10.1155/2014/870406
Publication status: Published - 2014

Fingerprint

Gene expression
MATLAB
Research Personnel
Research
matrix
market
methodology
method
Datasets
data analysis

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology (all)
  • Environmental Science (all)
  • Medicine (all)

Cite this

Bit-table based biclustering and frequent closed itemset mining in high-dimensional binary data. / Király, András; Gyenesei, Attila; Abonyi, J.

In: The Scientific World Journal, Vol. 2014, 870406, 2014.

@article{4a3d13e61a7a44dd96fbd235760ecefc,
title = "Bit-table based biclustering and frequent closed itemset mining in high-dimensional binary data",
abstract = "During the last decade, various algorithms have been developed and proposed for discovering overlapping clusters in high-dimensional data. The two most prominent application fields in this research, proposed independently, are frequent itemset mining (developed for market basket data) and biclustering (applied to gene expression data analysis). A common limitation of both methodologies is their limited applicability to very large binary data sets. In this paper, we propose a novel and efficient method to find both frequent closed itemsets and biclusters in high-dimensional binary data. The method is based on simple but very powerful matrix and vector multiplication approaches that ensure that all patterns can be discovered quickly. The proposed algorithm has been implemented in the commonly used MATLAB environment and is freely available for researchers.",
author = "Andr{\'a}s Kir{\'a}ly and Attila Gyenesei and J. Abonyi",
year = "2014",
doi = "10.1155/2014/870406",
language = "English",
volume = "2014",
journal = "The Scientific World Journal",
issn = "1537-744X",
publisher = "Hindawi Publishing Corporation",
}

TY - JOUR

T1 - Bit-table based biclustering and frequent closed itemset mining in high-dimensional binary data

AU - Király, András

AU - Gyenesei, Attila

AU - Abonyi, J.

PY - 2014

Y1 - 2014

AB - During the last decade, various algorithms have been developed and proposed for discovering overlapping clusters in high-dimensional data. The two most prominent application fields in this research, proposed independently, are frequent itemset mining (developed for market basket data) and biclustering (applied to gene expression data analysis). A common limitation of both methodologies is their limited applicability to very large binary data sets. In this paper, we propose a novel and efficient method to find both frequent closed itemsets and biclusters in high-dimensional binary data. The method is based on simple but very powerful matrix and vector multiplication approaches that ensure that all patterns can be discovered quickly. The proposed algorithm has been implemented in the commonly used MATLAB environment and is freely available for researchers.

UR - http://www.scopus.com/inward/record.url?scp=84896853887&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84896853887&partnerID=8YFLogxK

U2 - 10.1155/2014/870406

DO - 10.1155/2014/870406

M3 - Article

VL - 2014

JO - The Scientific World Journal

JF - The Scientific World Journal

SN - 1537-744X

M1 - 870406

ER -