Lexical analysis of scientific publications for nano-level scientometrics

W. Glänzel, Sarah Heeffer, Bart Thijs

Research output: Contribution to journal (Article)

2 Citations (Scopus)

Abstract

In earlier studies (e.g. Glänzel and Thijs in Scientometrics, 2017) we have used components of text analysis in combination with link-based techniques to cluster document spaces and to detect emerging research topics on the large scale. Now taking up the objectives of evaluative scientometrics, we attempt to link the textual analysis of small sets of individual scientific papers to evaluative bibliometrics. The objective is, however, quite similar: we focus on the detection of similarities and on monitoring structural changes, but this time on the small scale. We proceed from earlier approaches used in quantitative linguistics applied to bibliometrics (Telcs et al. in Math Soc Sci; 10(2):169–178, 1985). In the present pilot study we have selected 18 papers by András Schubert, published in three different periods with 6 papers each: 1983–1985, 1993–1998 and 2010–2013. The objective is twofold. First, we try to detect linguistic regularities in the scientometric text by applying a Waring model to the analysis of Schubert’s vocabulary on the basis of all words and nouns. The second goal is to identify changes in the vocabulary used over a period of three decades. The main findings are discussed along with future research tasks, which arise from these results in the context of the analysis of dynamics and emergence of research topics at the micro and nano level.

Original language: English
Pages (from-to): 1-10
Number of pages: 10
Journal: Scientometrics
DOI: 10.1007/s11192-017-2336-8
Publication status: Accepted/In press - Mar 9 2017

Keywords

  • Nano-level analysis
  • Natural language processing
  • Quantitative linguistics
  • Waring distribution
  • Word-frequency

ASJC Scopus subject areas

  • Social Sciences(all)
  • Computer Science Applications
  • Library and Information Sciences
  • Law

Cite this

Lexical analysis of scientific publications for nano-level scientometrics. / Glänzel, W.; Heeffer, Sarah; Thijs, Bart.

In: Scientometrics, 09.03.2017, p. 1-10.

@article{11bb7d468e0d4afc9cd639cfa5e5d466,
title = "Lexical analysis of scientific publications for nano-level scientometrics",
abstract = "In earlier studies (e.g. Gl{\"a}nzel and Thijs in Scientometrics, 2017) we have used components of text analysis in combination with link-based techniques to cluster document spaces and to detect emerging research topics on the large scale. Now taking up the objectives of evaluative scientometrics, we attempt to link the textual analysis of small sets of individual scientific papers to evaluative bibliometrics. The objective is, however, quite similar: we focus on the detection of similarities and on monitoring structural changes, but this time on the small scale. We proceed from earlier approaches used in quantitative linguistics applied to bibliometrics (Telcs et al. in Math Soc Sci; 10(2):169--178, 1985). In the present pilot study we have selected 18 papers by Andr{\'a}s Schubert, published in three different periods with 6 papers each: 1983--1985, 1993--1998 and 2010--2013. The objective is twofold. First, we try to detect linguistic regularities in the scientometric text by applying a Waring model to the analysis of Schubert's vocabulary on the basis of all words and nouns. The second goal is to identify changes in the vocabulary used over a period of three decades. The main findings are discussed along with future research tasks, which arise from these results in the context of the analysis of dynamics and emergence of research topics at the micro and nano level.",
keywords = "Nano-level analysis, Natural language processing, Quantitative linguistics, Waring distribution, Word-frequency",
author = "W. Gl{\"a}nzel and Sarah Heeffer and Bart Thijs",
year = "2017",
month = "3",
day = "9",
doi = "10.1007/s11192-017-2336-8",
language = "English",
pages = "1--10",
journal = "Scientometrics",
issn = "0138-9130",
publisher = "Springer Netherlands",

}

TY - JOUR

T1 - Lexical analysis of scientific publications for nano-level scientometrics

AU - Glänzel, W.

AU - Heeffer, Sarah

AU - Thijs, Bart

PY - 2017/3/9

Y1 - 2017/3/9

AB - In earlier studies (e.g. Glänzel and Thijs in Scientometrics, 2017) we have used components of text analysis in combination with link-based techniques to cluster document spaces and to detect emerging research topics on the large scale. Now taking up the objectives of evaluative scientometrics, we attempt to link the textual analysis of small sets of individual scientific papers to evaluative bibliometrics. The objective is, however, quite similar: we focus on the detection of similarities and on monitoring structural changes, but this time on the small scale. We proceed from earlier approaches used in quantitative linguistics applied to bibliometrics (Telcs et al. in Math Soc Sci; 10(2):169–178, 1985). In the present pilot study we have selected 18 papers by András Schubert, published in three different periods with 6 papers each: 1983–1985, 1993–1998 and 2010–2013. The objective is twofold. First, we try to detect linguistic regularities in the scientometric text by applying a Waring model to the analysis of Schubert’s vocabulary on the basis of all words and nouns. The second goal is to identify changes in the vocabulary used over a period of three decades. The main findings are discussed along with future research tasks, which arise from these results in the context of the analysis of dynamics and emergence of research topics at the micro and nano level.

KW - Nano-level analysis

KW - Natural language processing

KW - Quantitative linguistics

KW - Waring distribution

KW - Word-frequency

UR - http://www.scopus.com/inward/record.url?scp=85014650777&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85014650777&partnerID=8YFLogxK

U2 - 10.1007/s11192-017-2336-8

DO - 10.1007/s11192-017-2336-8

M3 - Article

SP - 1

EP - 10

JO - Scientometrics

JF - Scientometrics

SN - 0138-9130

ER -