Enhancing Sampling of Papers Through an Interactive Shiny Tool for Bibliometric Analysis and Systematic Reviews
A Hybrid Keyword-Semantic Approach to Abstract Ranking
DOI:
https://doi.org/10.14429/djlit.21023
Keywords:
Quantitative literature review, Bibliometrics, Sampling, Abstract ranking, Tool, Keyword ranking, Semantic ranking
Abstract
This paper presents a novel interactive tool for sampling papers in bibliometric analysis and systematic reviews, integrating keyword frequency analysis and semantic similarity ranking. Built using R Shiny, the application enables researchers to systematically prioritise academic abstracts through a dual-method approach: term frequency-inverse document frequency (TF-IDF) weighted keyword matching and cosine similarity-based semantic alignment with user-defined queries. A dynamic weighting mechanism enhances the hybrid approach, which outperforms traditional methods by balancing lexical precision with contextual depth. The tool addresses critical challenges in quantitative literature review processes by introducing data-driven thresholding with three-tier prioritisation (green/orange/red categories) and export functionalities. In test cases on an AI in libraries dataset, the tool classified 2.29% of papers as highly relevant using keyword analysis and 1.47% using semantic similarity, with broader coverage (58.45% moderately relevant) in hybrid mode, demonstrating its ability to identify contextually aligned works efficiently. Technical implementation details, mathematical foundations, and applications are discussed. The tool supports extracting relevant papers from a dataset drawn from Web of Science, Scopus, OpenAlex, and Dimensions.
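The hybrid ranking described in the abstract can be illustrated with a minimal sketch. It is not the tool's implementation, which is written in R Shiny. It assumes a smoothed TF-IDF weighting, uses TF-IDF vectors as a stand-in for whatever representation the tool compares with cosine similarity, and replaces the data-driven thresholds with fixed cut-offs. The names `hybrid_rank`, `weight`, `green`, and `orange` are hypothetical, as are the sample abstracts.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (smoothed IDF, an assumption)."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_rank(abstracts, keywords, query, weight=0.5, green=0.5, orange=0.2):
    """Rank abstracts by weight * keyword score + (1 - weight) * query similarity.

    The fixed green/orange cut-offs are illustrative assumptions; the paper
    describes data-driven thresholding instead.
    """
    vectors = tfidf_vectors(list(abstracts) + [query])
    doc_vecs, query_vec = vectors[:-1], vectors[-1]

    # Keyword score: summed TF-IDF weight of the chosen keywords, scaled to [0, 1].
    wanted = {k.lower() for k in keywords}
    raw = [sum(v.get(k, 0.0) for k in wanted) for v in doc_vecs]
    top = max(raw, default=0.0) or 1.0
    keyword_scores = [s / top for s in raw]

    # Similarity score: cosine between each abstract and the user-defined query.
    query_scores = [cosine(v, query_vec) for v in doc_vecs]

    ranked = []
    for i, (k, q) in enumerate(zip(keyword_scores, query_scores)):
        score = weight * k + (1 - weight) * q
        tier = "green" if score >= green else "orange" if score >= orange else "red"
        ranked.append({"index": i, "score": score, "tier": tier})
    return sorted(ranked, key=lambda r: r["score"], reverse=True)


# Hypothetical sample data, not taken from the paper's dataset.
abstracts = [
    "Artificial intelligence chatbots support reference services in academic libraries.",
    "Soil moisture affects wheat yield across dry regions.",
    "Libraries adopt machine learning for cataloguing.",
]
ranked = hybrid_rank(
    abstracts,
    keywords=["libraries", "intelligence"],
    query="artificial intelligence in libraries",
)
for row in ranked:
    print(row["index"], row["tier"], round(row["score"], 3))
```

Setting `weight` to 1 reduces the ranking to keyword matching alone and setting it to 0 reduces it to query similarity alone, which mirrors the balance between lexical precision and contextual depth that the abstract attributes to the weighting mechanism.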
License
Copyright (c) 2025 Defence Scientific Information & Documentation Centre (DESIDOC)
Except where otherwise noted, the Articles on this site are licensed under Creative Commons License: CC Attribution-Noncommercial-No Derivative Works 2.5 India