Using hybrid methods and ‘core documents’ for the representation of clusters and topics: the astronomy dataset

W. Glänzel, Bart Thijs

Research output: Contribution to journal › Article

18 Citations (Scopus)

Abstract

Based on a dataset on Astronomy and Astrophysics, hybrid cluster analyses have been conducted. In order to obtain an optimum solution and to analyse possible issues resulting from the bibliometric methodologies used, we have systematically studied three models and, within these models, two scenarios each. The hybrid clustering was based on a combination of bibliographic coupling and textual similarities using the Louvain method at two resolution levels. The procedure resulted in three clearly hierarchical structures with six and thirteen, seven and thirteen and finally five and eleven clusters, respectively. These structures are analysed with the help of a concordance table. The statistics reflect a high quality of classification. The results of these three models are presented, discussed and compared with each other. For labelling and interpreting clusters, core documents representing the obtained clusters are used. Furthermore, these core documents help depict the internal structure of the complete network and the clusters. This work has been done as part of the international project ‘Measuring the Diversity of Research’ and in the framework of a special workshop on the comparative analysis of algorithms for the identification of topics in science organised in Berlin in August 2014.
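
The combination of bibliographic coupling and textual similarity mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the whitespace tokenisation, and above all the weighted linear mix controlled by the hypothetical `weight` parameter, which is not taken from the paper. The Louvain clustering step that would run on the resulting similarity network is left out.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def hybrid_similarity(refs_a, refs_b, text_a, text_b, weight=0.5):
    """Mix a citation-based and a text-based similarity of two documents.

    ASSUMPTION: a simple weighted average; the paper's actual way of
    combining the two components may differ.
    """
    # Bibliographic coupling: overlap of the two cited-reference sets.
    coupling = cosine(Counter(set(refs_a)), Counter(set(refs_b)))
    # Textual similarity: overlap of lower-cased term frequencies.
    textual = cosine(Counter(text_a.lower().split()),
                     Counter(text_b.lower().split()))
    return weight * coupling + (1 - weight) * textual


if __name__ == "__main__":
    score = hybrid_similarity(
        ["ref1", "ref2"], ["ref2", "ref3"],
        "galaxy cluster", "galaxy cluster",
    )
    print(score)
```

Pairwise scores of this kind could serve as edge weights of a document network, which a community detection method such as Louvain would then partition.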

Original language: English
Pages (from-to): 1071-1087
Number of pages: 17
Journal: Scientometrics
Volume: 111
Issue number: 2
DOIs: 10.1007/s11192-017-2301-6
Publication status: Published - May 1 2017

Fingerprint

Astronomy
Astrophysics
Berlin
Labeling
Statistics
scenario
methodology
science

Keywords

  • Astronomy
  • Astrophysics
  • Bibliographic coupling
  • Clustering
  • Core documents
  • Hybrid clustering
  • NLP

ASJC Scopus subject areas

  • Social Sciences (all)
  • Computer Science Applications
  • Library and Information Sciences
  • Law

Cite this

Using hybrid methods and ‘core documents’ for the representation of clusters and topics: the astronomy dataset. / Glänzel, W.; Thijs, Bart.

In: Scientometrics, Vol. 111, No. 2, 01.05.2017, p. 1071-1087.

@article{0fbee02aedbe4c1c984834bf9f667088,
title = "Using hybrid methods and ‘core documents’ for the representation of clusters and topics: the astronomy dataset",
abstract = "Based on a dataset on Astronomy and Astrophysics, hybrid cluster analyses have been conducted. In order to obtain an optimum solution and to analyse possible issues resulting from the bibliometric methodologies used, we have systematically studied three models and, within these models, two scenarios each. The hybrid clustering was based on a combination of bibliographic coupling and textual similarities using the Louvain method at two resolution levels. The procedure resulted in three clearly hierarchical structures with six and thirteen, seven and thirteen and finally five and eleven clusters, respectively. These structures are analysed with the help of a concordance table. The statistics reflect a high quality of classification. The results of these three models are presented, discussed and compared with each other. For labelling and interpreting clusters, core documents representing the obtained clusters are used. Furthermore, these core documents help depict the internal structure of the complete network and the clusters. This work has been done as part of the international project ‘Measuring the Diversity of Research’ and in the framework of a special workshop on the comparative analysis of algorithms for the identification of topics in science organised in Berlin in August 2014.",
keywords = "Astronomy, Astrophysics, Bibliographic coupling, Clustering, Core documents, Hybrid clustering, NLP",
author = "W. Gl{\"a}nzel and Bart Thijs",
year = "2017",
month = "5",
day = "1",
doi = "10.1007/s11192-017-2301-6",
language = "English",
volume = "111",
pages = "1071--1087",
journal = "Scientometrics",
issn = "0138-9130",
publisher = "Springer Netherlands",
number = "2",

}

TY - JOUR

T1 - Using hybrid methods and ‘core documents’ for the representation of clusters and topics

T2 - the astronomy dataset

AU - Glänzel, W.

AU - Thijs, Bart

PY - 2017/5/1

Y1 - 2017/5/1

N2 - Based on a dataset on Astronomy and Astrophysics, hybrid cluster analyses have been conducted. In order to obtain an optimum solution and to analyse possible issues resulting from the bibliometric methodologies used, we have systematically studied three models and, within these models, two scenarios each. The hybrid clustering was based on a combination of bibliographic coupling and textual similarities using the Louvain method at two resolution levels. The procedure resulted in three clearly hierarchical structures with six and thirteen, seven and thirteen and finally five and eleven clusters, respectively. These structures are analysed with the help of a concordance table. The statistics reflect a high quality of classification. The results of these three models are presented, discussed and compared with each other. For labelling and interpreting clusters, core documents representing the obtained clusters are used. Furthermore, these core documents help depict the internal structure of the complete network and the clusters. This work has been done as part of the international project ‘Measuring the Diversity of Research’ and in the framework of a special workshop on the comparative analysis of algorithms for the identification of topics in science organised in Berlin in August 2014.

AB - Based on a dataset on Astronomy and Astrophysics, hybrid cluster analyses have been conducted. In order to obtain an optimum solution and to analyse possible issues resulting from the bibliometric methodologies used, we have systematically studied three models and, within these models, two scenarios each. The hybrid clustering was based on a combination of bibliographic coupling and textual similarities using the Louvain method at two resolution levels. The procedure resulted in three clearly hierarchical structures with six and thirteen, seven and thirteen and finally five and eleven clusters, respectively. These structures are analysed with the help of a concordance table. The statistics reflect a high quality of classification. The results of these three models are presented, discussed and compared with each other. For labelling and interpreting clusters, core documents representing the obtained clusters are used. Furthermore, these core documents help depict the internal structure of the complete network and the clusters. This work has been done as part of the international project ‘Measuring the Diversity of Research’ and in the framework of a special workshop on the comparative analysis of algorithms for the identification of topics in science organised in Berlin in August 2014.

KW - Astronomy

KW - Astrophysics

KW - Bibliographic coupling

KW - Clustering

KW - Core documents

KW - Hybrid clustering

KW - NLP

UR - http://www.scopus.com/inward/record.url?scp=85013392644&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85013392644&partnerID=8YFLogxK

U2 - 10.1007/s11192-017-2301-6

DO - 10.1007/s11192-017-2301-6

M3 - Article

AN - SCOPUS:85013392644

VL - 111

SP - 1071

EP - 1087

JO - Scientometrics

JF - Scientometrics

SN - 0138-9130

IS - 2

ER -