Using noun phrases extraction for the improvement of hybrid clustering with text- and citation-based components. The example of "Information System Research"

Bart Thijs, Wolfgang Glänzel, Martin Meyer

Research output: Contribution to journal (Conference article)

4 Citations (Scopus)

Abstract

The hybrid clustering approach, which combines lexical and link-based similarities, has long suffered from the different properties of the underlying networks. We propose a method based on noun phrase extraction using natural language processing to improve the measurement of the lexical component. Term shingles of different lengths are created from each of the extracted noun phrases. Hybrid networks are built from a weighted combination of the two types of similarities, using seven different weights. We conclude that removing all single-term shingles provides the best results in terms of computational feasibility, comparability with bibliographic coupling, and performance in a community detection application.
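The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how shingles are generated or how the two similarities are combined, so the sketch assumes that a shingle is any contiguous word sequence inside a noun phrase, that both components are cosine similarities, and that they are mixed linearly. The function names, the toy documents and the weight value are hypothetical, and the noun phrases are taken as already extracted.

```python
from collections import Counter
from math import sqrt


def shingles(noun_phrase, min_len=1):
    """Return every contiguous word sequence of a noun phrase that has
    at least `min_len` words (min_len=2 drops single-term shingles)."""
    words = noun_phrase.lower().split()
    return [
        " ".join(words[i:i + n])
        for n in range(min_len, len(words) + 1)
        for i in range(len(words) - n + 1)
    ]


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def hybrid_similarity(lexical, citation, weight):
    """Assumed linear mix: `weight` on the lexical component,
    1 - weight on the citation component."""
    return weight * lexical + (1.0 - weight) * citation


def lexical_profile(doc, min_len):
    """Shingle counts over all noun phrases of a document."""
    return Counter(s for p in doc["phrases"] for s in shingles(p, min_len))


def reference_profile(doc):
    """Binary vector of cited references (for bibliographic coupling)."""
    return Counter(doc["refs"])


# Hypothetical toy documents: extracted noun phrases and cited references.
doc_a = {"phrases": ["information system research", "citation analysis"],
         "refs": {"r1", "r2", "r3"}}
doc_b = {"phrases": ["information system", "hybrid clustering"],
         "refs": {"r2", "r3", "r4"}}

# Single-term shingles are removed by requiring at least two words.
lex = cosine(lexical_profile(doc_a, 2), lexical_profile(doc_b, 2))
cit = cosine(reference_profile(doc_a), reference_profile(doc_b))
print(hybrid_similarity(lex, cit, weight=0.5))
```

Calling `shingles` with `min_len=1` keeps single terms, which makes it easy to compare the two lexical variants on the same documents; looping `hybrid_similarity` over a list of weights would produce one hybrid network per weight.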

Original language: English
Pages (from-to): 28-33
Number of pages: 6
Journal: CEUR Workshop Proceedings
Volume: 1384
Issue number: January
Publication status: Published - Jan 1 2015
Event: 1st Workshop on Mining Scientific Papers: Computational Linguistics and Bibliometrics, CLBib 2015 - co-located with 15th International Society of Scientometrics and Informetrics Conference, ISSI 2015 - Istanbul, Turkey
Duration: Jun 29 2015 → …

ASJC Scopus subject areas

  • Computer Science (all)
