Enhancing Sampling of Papers Through an Interactive Shiny Tool for Bibliometric Analysis and Systematic Reviews

A Hybrid Keyword-Semantic Approach to Abstract Ranking

Authors

M. Yuvaraj

DOI:

https://doi.org/10.14429/djlit.21023

Keywords:

Quantitative literature review, Bibliometrics, Sampling, Abstract ranking, Tool, Keyword ranking, Semantic ranking

Abstract

This paper presents a novel interactive tool for sampling papers in bibliometric analysis and systematic reviews, integrating keyword frequency analysis with semantic similarity ranking. Built using R Shiny, the application enables researchers to systematically prioritise academic abstracts through a dual-method approach: term frequency-inverse document frequency (TF-IDF) weighted keyword matching, and cosine similarity-based semantic alignment with user-defined queries. A dynamic weighting mechanism enhances the hybrid approach, which outperforms traditional methods by balancing lexical precision with contextual depth. The tool addresses critical challenges in quantitative literature review processes by introducing data-driven thresholding with three-tier prioritisation (green/orange/red categories) and export functionalities. In test cases on an AI-in-libraries dataset, the tool classified 2.29% of papers as highly relevant using keyword analysis and 1.47% using semantic similarity, while hybrid mode gave broader coverage (58.45% moderately relevant), demonstrating its ability to identify contextually aligned works efficiently. Technical implementation details, mathematical foundations, and applications are discussed. The tool supports extracting relevant papers from datasets drawn from Web of Science, Scopus, OpenAlex, and Dimensions.
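The hybrid ranking described in the abstract can be illustrated with a minimal sketch. This is not the tool's implementation (the tool is built in R Shiny); it is a standalone Python illustration under several assumptions: the tokenizer, the smoothed IDF formula, the max-normalisation of each score, the weight `alpha`, the fixed cut-offs `high` and `low`, and all function names are hypothetical, and plain TF-IDF vectors stand in for whatever text representation the tool uses for its semantic similarity step.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(texts):
    """Return one {term: weight} dict per text (TF times smoothed IDF)."""
    tokenized = [tokenize(t) for t in texts]
    n = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (an assumption): log((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_abstracts(abstracts, keywords, query, alpha=0.5, high=0.66, low=0.33):
    """Score abstracts by a weighted mix of keyword and query similarity.

    alpha weights the keyword score; (1 - alpha) weights the similarity
    score. high and low are illustrative fixed cut-offs for the tiers.
    """
    vectors = tfidf_vectors(list(abstracts) + [query])
    doc_vecs, query_vec = vectors[:-1], vectors[-1]
    wanted = [k.lower() for k in keywords]

    # Keyword score: summed TF-IDF weight of the user's keywords.
    kw = [sum(v.get(k, 0.0) for k in wanted) for v in doc_vecs]
    # Similarity score: cosine between each abstract and the query.
    sem = [cosine(v, query_vec) for v in doc_vecs]

    # Scale each score to [0, 1] by its maximum so the two are comparable.
    kw_max = max(kw, default=0.0) or 1.0
    sem_max = max(sem, default=0.0) or 1.0

    results = []
    for i, (k, s) in enumerate(zip(kw, sem)):
        score = alpha * (k / kw_max) + (1 - alpha) * (s / sem_max)
        tier = "green" if score >= high else "orange" if score >= low else "red"
        results.append({"index": i, "score": score, "tier": tier})
    return sorted(results, key=lambda r: r["score"], reverse=True)


if __name__ == "__main__":
    sample = [
        "Artificial intelligence chatbots in academic libraries improve reference services",
        "Soil erosion patterns across alpine meadows",
        "Digital libraries and metadata standards",
    ]
    ranked = rank_abstracts(
        sample,
        keywords=["libraries", "artificial", "intelligence"],
        query="artificial intelligence in libraries",
    )
    for row in ranked:
        print(row["index"], row["tier"], round(row["score"], 3))
```

Raising `alpha` favours exact keyword matches, while lowering it favours overall similarity to the query. The abstract describes the tool's thresholds as data-driven, so the fixed cut-offs here are only placeholders for that step.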

Published

2025-07-14

How to Cite

Yuvaraj, M. (2025). Enhancing Sampling of Papers Through an Interactive Shiny Tool for Bibliometric Analysis and Systematic Reviews: A Hybrid Keyword-Semantic Approach to Abstract Ranking. DESIDOC Journal of Library & Information Technology, 45(4), 277–285. https://doi.org/10.14429/djlit.21023