Lexical analysis of scientific publications for nano-level scientometrics

W. Glänzel, Sarah Heeffer, Bart Thijs

Research output: Contribution to journal, Article

3 Citations (Scopus)


In earlier studies (e.g. Glänzel and Thijs in Scientometrics, 2017) we have used components of text analysis in combination with link-based techniques to cluster document spaces and to detect emerging research topics on the large scale. Taking up now the objectives of evaluative scientometrics, we attempt to link the textual analysis of small sets of individual scientific papers to evaluative bibliometrics. The objective is, however, quite similar: we focus on the detection of similarities and on monitoring structural changes, but this time on the small scale. We proceed from earlier approaches used in quantitative linguistics applied to bibliometrics (Telcs et al. in Math Soc Sci 10(2):169–178, 1985). In the present pilot study we have selected 18 papers by András Schubert, published in three different periods with six papers each: 1983–1985, 1993–1998 and 2010–2013. The objective is twofold. First, we attempt only to detect linguistic regularities in the scientometric text by applying a Waring model to the analysis of Schubert’s vocabulary on the basis of all words and nouns. Second, we aim to identify changes in the vocabulary used over a period of three decades. The main findings are discussed along with future research tasks, which arise from these results in the context of the analysis of dynamics and emergence of research topics at the micro and nano level.
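
As a rough illustration of the kind of word-frequency analysis the abstract describes, the sketch below counts how many distinct words occur exactly k times in a text and evaluates Waring probabilities for comparison. It is not the authors' procedure: the abstract does not state the parametrisation or the fitting method, so the code assumes the common two-parameter form P(X = k) = α/(N + α) · N(N + 1)…(N + k − 1) / [(N + α + 1)…(N + α + k)] for k = 0, 1, 2, …, and the function names `frequency_spectrum` and `waring_pmf` are hypothetical.

```python
import re
from collections import Counter


def frequency_spectrum(text):
    """Map each occurrence count k to the number of distinct words seen k times."""
    # Lower-case the text and keep runs of letters (accented letters included).
    words = re.findall(r"[^\W\d_]+", text.lower())
    word_counts = Counter(words)
    # Count how many distinct words share each occurrence count.
    return dict(Counter(word_counts.values()))


def waring_pmf(n_param, alpha, k_max):
    """Waring probabilities P(X = k) for k = 0..k_max (assumed parametrisation).

    Uses P(0) = alpha / (n_param + alpha) and the ratio
    P(k + 1) / P(k) = (n_param + k) / (n_param + alpha + k + 1).
    """
    if n_param <= 0 or alpha <= 0:
        raise ValueError("both parameters must be positive")
    probs = [alpha / (n_param + alpha)]
    for k in range(k_max):
        probs.append(probs[-1] * (n_param + k) / (n_param + alpha + k + 1))
    return probs


if __name__ == "__main__":
    sample = "the cat and the dog and the bird"
    print(sorted(frequency_spectrum(sample).items()))
    print(waring_pmf(1.0, 2.0, 5))
```

Because every observed word occurs at least once, comparing an observed spectrum with this distribution would require shifting or truncating it at zero; which of these applies to the study is not stated in the abstract, so that step is left out here.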

Original language: English
Pages (from-to): 1-10
Number of pages: 10
Publication status: Accepted/In press - Mar 9 2017

Keywords


  • Nano-level analysis
  • Natural language processing
  • Quantitative linguistics
  • Waring distribution
  • Word-frequency

ASJC Scopus subject areas

  • Social Sciences (all)
  • Computer Science Applications
  • Library and Information Sciences
  • Law
