Hybrid clustering of text mining and bibliometrics applied to journal sets

Xinhai Liu, Shi Yu, Yves Moreau, Bart De Moor, W. Glänzel, Frizo Janssens

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

15 Citations (Scopus)

Abstract

To exploit the correlated and complementary information contained in text mining and bibliometrics, hybrid clustering that incorporates both textual content and citation information has become a popular strategy. In this paper, we propose a new computational framework that integrates text mining and bibliometrics to provide a mapping of journal sets. Two categories of hybrid clustering methods are applied. The first is ensemble clustering, which combines the clustering results obtained from the individual data sources into a consolidated clustering result. The second is kernel fusion, which maps heterogeneous data sets into kernel space and combines the kernel matrices for clustering. Kernels can be combined either by simple averaging or by an optimized weighted linear combination. We also propose a novel adaptive kernel K-means clustering algorithm that combines textual content and citation information for clustering. The proposed algorithm is systematically compared with other methods on a clustering problem of 1869 journals published in 2002-2006. Based on several validation indices, the experimental results demonstrate that our hybrid clustering strategy provides clustering results as good as those obtained from the best individual data source.
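The kernel fusion idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' adaptive algorithm: the weights are fixed (equal weights give the averaged kernel), the clustering step is ordinary kernel K-means, and the six "journals", their feature vectors, and all function names are hypothetical and chosen only for illustration.

```python
from typing import List

Matrix = List[List[float]]


def linear_kernel(X: Matrix) -> Matrix:
    """Gram matrix with K[i][j] = <x_i, x_j>."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]


def normalize_kernel(K: Matrix) -> Matrix:
    """Cosine-normalize so every diagonal entry is 1 (assumes K[i][i] > 0)."""
    n = len(K)
    return [[K[i][j] / (K[i][i] * K[j][j]) ** 0.5 for j in range(n)]
            for i in range(n)]


def combine_kernels(kernels: List[Matrix], weights: List[float]) -> Matrix:
    """Weighted linear combination sum_m w_m * K_m of same-sized kernels."""
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, kernels))
             for j in range(n)] for i in range(n)]


def kernel_kmeans(K: Matrix, init: List[int], k: int,
                  max_iter: int = 100) -> List[int]:
    """Plain kernel K-means: reassign points until the labels stop changing."""
    n = len(K)
    labels = list(init)
    for _ in range(max_iter):
        clusters = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Squared norm of each cluster centroid in feature space.
        within = [sum(K[a][b] for a in m for b in m) / len(m) ** 2 if m else 0.0
                  for m in clusters]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], float("inf")
            for c, members in enumerate(clusters):
                if not members:
                    continue  # skip empty clusters
                cross = sum(K[i][j] for j in members) / len(members)
                # ||phi(x_i) - centroid_c||^2 expressed with kernel entries only
                d = K[i][i] - 2.0 * cross + within[c]
                if d < best_d:
                    best_c, best_d = c, d
            new_labels.append(best_c)
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Toy data (hypothetical): journals 0-2 form one group, journals 3-5 another.
text_features = [[1.0, 0.0], [1.0, 0.1], [0.9, 0.2],
                 [0.0, 1.0], [0.1, 1.0], [0.2, 0.9]]
citation_features = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0], [1.0, 0.1, 0.0],
                     [0.0, 0.0, 1.0], [0.0, 0.2, 0.9], [0.1, 0.0, 1.0]]

K_text = normalize_kernel(linear_kernel(text_features))
K_cite = normalize_kernel(linear_kernel(citation_features))

# Equal weights reproduce the averaged kernel; other weights could be plugged in.
K_fused = combine_kernels([K_text, K_cite], [0.5, 0.5])

# Start from a deliberately mixed assignment and let kernel K-means refine it.
labels = kernel_kmeans(K_fused, init=[0, 1, 0, 1, 0, 1], k=2)
print(labels)
```

Normalizing each kernel before combining keeps one data source from dominating purely because of its scale. An optimized weighting scheme, as the abstract mentions, would replace the fixed `[0.5, 0.5]` weights; how those weights are learned is not described in this record and is not attempted here.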

Original language: English
Title of host publication: Society for Industrial and Applied Mathematics - 9th SIAM International Conference on Data Mining 2009, Proceedings in Applied Mathematics
Pages: 48-59
Number of pages: 12
Volume: 1
Publication status: Published - 2009
Event: 9th SIAM International Conference on Data Mining 2009, SDM 2009 - Sparks, NV, United States
Duration: Apr 30 2009 to May 2 2009

Other

Other: 9th SIAM International Conference on Data Mining 2009, SDM 2009
Country: United States
City: Sparks, NV
Period: 4/30/09 to 5/2/09

Fingerprint

Bibliometrics
Text Mining
Clustering
kernel
Clustering algorithms
Fusion reactions
Citations
K-means Algorithm
K-means Clustering
Hybrid Method
Clustering Methods
Clustering Algorithm
Linear Combination
Fusion
Ensemble

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Software
  • Applied Mathematics

Cite this

Liu, X., Yu, S., Moreau, Y., De Moor, B., Glänzel, W., & Janssens, F. (2009). Hybrid clustering of text mining and bibliometrics applied to journal sets. In Society for Industrial and Applied Mathematics - 9th SIAM International Conference on Data Mining 2009, Proceedings in Applied Mathematics (Vol. 1, pp. 48-59)

@inproceedings{a7f2d0b2791847799c83c0edc7d3c50a,
title = "Hybrid clustering of text mining and bibliometrics applied to journal sets",
abstract = "To obtain correlated and complementary information contained in text mining and bibliometrics, hybrid clustering to incorporate textual content and citation information has become a popular strategy. In this paper, we propose a new computational framework of integrating text mining and bibliometrics to provide a mapping of journal sets. Two different approaches of hybrid clustering methods are applied in this paper. The first category is ensemble clustering, which combines different clustering results obtained from individual data into a consolidated clustering result. The second category is kernel fusion, which maps heterogeneous data sets into the kernel space and combines the kernel matrices for clustering. Kernels can be combined either averagely, or by an optimized weighted linear combination model. In this paper, we propose a novel adaptive kernel K-means clustering algorithm to combine textual content and citation information for clustering. The proposed algorithm is systematically compared with other methods on a clustering problem of 1869 journals published in 2002-2006. Based on several validation indices, the experimental results demonstrate that our hybrid clustering strategy is able to provide clustering result as well as the best individual data source.",
author = "Xinhai Liu and Shi Yu and Yves Moreau and {De Moor}, Bart and W. Gl{\"a}nzel and Frizo Janssens",
year = "2009",
language = "English",
isbn = "9781615671090",
volume = "1",
pages = "48--59",
booktitle = "Society for Industrial and Applied Mathematics - 9th SIAM International Conference on Data Mining 2009, Proceedings in Applied Mathematics",

}

TY - GEN

T1 - Hybrid clustering of text mining and bibliometrics applied to journal sets

AU - Liu, Xinhai

AU - Yu, Shi

AU - Moreau, Yves

AU - De Moor, Bart

AU - Glänzel, W.

AU - Janssens, Frizo

PY - 2009

Y1 - 2009

AB - To obtain correlated and complementary information contained in text mining and bibliometrics, hybrid clustering to incorporate textual content and citation information has become a popular strategy. In this paper, we propose a new computational framework of integrating text mining and bibliometrics to provide a mapping of journal sets. Two different approaches of hybrid clustering methods are applied in this paper. The first category is ensemble clustering, which combines different clustering results obtained from individual data into a consolidated clustering result. The second category is kernel fusion, which maps heterogeneous data sets into the kernel space and combines the kernel matrices for clustering. Kernels can be combined either averagely, or by an optimized weighted linear combination model. In this paper, we propose a novel adaptive kernel K-means clustering algorithm to combine textual content and citation information for clustering. The proposed algorithm is systematically compared with other methods on a clustering problem of 1869 journals published in 2002-2006. Based on several validation indices, the experimental results demonstrate that our hybrid clustering strategy is able to provide clustering result as well as the best individual data source.

UR - http://www.scopus.com/inward/record.url?scp=72849140388&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=72849140388&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:72849140388

SN - 9781615671090

VL - 1

SP - 48

EP - 59

BT - Society for Industrial and Applied Mathematics - 9th SIAM International Conference on Data Mining 2009, Proceedings in Applied Mathematics

ER -