Do second-order similarities provide added-value in a hybrid approach?

Bart Thijs, Edgar Schiebel, W. Glänzel

Research output: Contribution to journal › Article

12 Citations (Scopus)

Abstract

Recent studies on first- and second-order similarities have shown that the latter outperform the former as input for document clustering or partitioning applications. First-order similarities based on bibliographic coupling or on lexical approaches come with specific methodological issues such as sparse matrices and sensitivity to spelling variants or context differences. Second-order similarities were proposed to tackle these problems and to take the lexical context into account. A hybrid combination of both types of similarities has also proved to be an important improvement, integrating the strengths of the two approaches and diminishing their weaknesses. In this paper we extend the notion of second-order similarity by applying it in the context of the hybrid approach. We conclude that there is no added value for the clearly defined clusters but that second-order similarity can provide an additional viewpoint for the more general clusters.
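
The three notions named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy matrices, the function names, and the combination rule (a weighted mean of the angles between document vectors) are assumptions made for illustration, and the abstract itself does not state how the hybrid similarity is computed. First-order similarity is taken here as the cosine between rows of a document-by-reference (or document-by-term) matrix; second-order similarity is the cosine between rows of a first-order similarity matrix.

```python
import numpy as np

def cosine_similarity(matrix):
    """Pairwise cosine similarity between the rows of a matrix."""
    matrix = np.asarray(matrix, dtype=float)
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # all-zero rows stay zero instead of dividing by 0
    unit = matrix / norms
    # Clip to guard against floating-point values slightly outside [-1, 1].
    return np.clip(unit @ unit.T, -1.0, 1.0)

def second_order(first_order):
    """Second-order similarity: cosine between rows of a first-order matrix."""
    return cosine_similarity(first_order)

def hybrid(sim_a, sim_b, weight=0.5):
    """Combine two similarity matrices via a weighted mean of their angles.

    Assumed rule for illustration only; `weight` is a hypothetical parameter.
    """
    angle = weight * np.arccos(sim_a) + (1.0 - weight) * np.arccos(sim_b)
    return np.cos(angle)

# Toy data (hypothetical): 3 documents x 4 cited references.
refs = np.array([
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
])
# Toy data (hypothetical): the same 3 documents x 3 terms.
terms = np.array([
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
])

bc = cosine_similarity(refs)     # first-order, bibliographic coupling
lex = cosine_similarity(terms)   # first-order, lexical
bc2 = second_order(bc)           # second-order, citation component
hyb = hybrid(bc, lex)            # hybrid of the two first-order matrices
hyb2 = second_order(hyb)         # one reading of "second-order in a hybrid context"

print(bc[0, 2], bc2[0, 2])
```

In this toy example documents 0 and 2 share no references, so their first-order coupling is zero, yet both are coupled to document 1 and therefore receive a non-zero second-order similarity. That is the sparsity problem the abstract says second-order similarities were proposed to address.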

Original language: English
Pages (from-to): 667-677
Number of pages: 11
Journal: Scientometrics
Volume: 96
Issue number: 3
DOI: 10.1007/s11192-012-0896-1
Publication status: Published - Sep 2013

Keywords

  • Bibliographic coupling
  • Hybrid clustering
  • Public health
  • Similarity measures
  • Text mining

ASJC Scopus subject areas

  • Computer Science Applications
  • Social Sciences (all)
  • Library and Information Sciences
  • Law

Cite this

Do second-order similarities provide added-value in a hybrid approach? / Thijs, Bart; Schiebel, Edgar; Glänzel, W.

In: Scientometrics, Vol. 96, No. 3, 09.2013, p. 667-677.

Thijs, Bart ; Schiebel, Edgar ; Glänzel, W. / Do second-order similarities provide added-value in a hybrid approach?. In: Scientometrics. 2013 ; Vol. 96, No. 3. pp. 667-677.
@article{187ac4dfbed34d65967ca5120d63e5bb,
title = "Do second-order similarities provide added-value in a hybrid approach?",
abstract = "Recent studies on first- and second-order similarities have shown that the latter one outperforms the first one as input for document clustering or partitioning applications. First-order similarities based on bibliographic coupling or on lexical approaches come with specific methodological issues like sparse matrices, sensitive to spelling variances or context differences. Second-order similarities were proposed to tackle these problems and take the lexical context into account. But also a hybrid combination of both types of similarities proved an important improvement which integrates the strengths of the two approaches and diminishes their weaknesses. In this paper we extend the notion of second-order similarity by applying it in the context of the hybrid approach. We conclude that there is no added value for the clearly defined clusters but that the second-order similarity can provide an additional viewpoint for the more general clusters.",
keywords = "Bibliographic coupling, Hybrid clustering, Public health, Similarity measures, Text mining",
author = "Bart Thijs and Edgar Schiebel and W. Gl{\"a}nzel",
year = "2013",
month = "9",
doi = "10.1007/s11192-012-0896-1",
language = "English",
volume = "96",
pages = "667--677",
journal = "Scientometrics",
issn = "0138-9130",
publisher = "Springer Netherlands",
number = "3",
}

TY - JOUR

T1 - Do second-order similarities provide added-value in a hybrid approach?

AU - Thijs, Bart

AU - Schiebel, Edgar

AU - Glänzel, W.

PY - 2013/9

Y1 - 2013/9

N2 - Recent studies on first- and second-order similarities have shown that the latter one outperforms the first one as input for document clustering or partitioning applications. First-order similarities based on bibliographic coupling or on lexical approaches come with specific methodological issues like sparse matrices, sensitive to spelling variances or context differences. Second-order similarities were proposed to tackle these problems and take the lexical context into account. But also a hybrid combination of both types of similarities proved an important improvement which integrates the strengths of the two approaches and diminishes their weaknesses. In this paper we extend the notion of second-order similarity by applying it in the context of the hybrid approach. We conclude that there is no added value for the clearly defined clusters but that the second-order similarity can provide an additional viewpoint for the more general clusters.

KW - Bibliographic coupling

KW - Hybrid clustering

KW - Public health

KW - Similarity measures

KW - Text mining

UR - http://www.scopus.com/inward/record.url?scp=84881615233&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84881615233&partnerID=8YFLogxK

U2 - 10.1007/s11192-012-0896-1

DO - 10.1007/s11192-012-0896-1

M3 - Article

VL - 96

SP - 667

EP - 677

JO - Scientometrics

JF - Scientometrics

SN - 0138-9130

IS - 3

ER -