A Hybrid Large Language Model for Context-Aware Document Ranking in Telecommunication Data

Bibliographic Details
Main Authors: Abhay Bindle, Preeti Singla, Sachin Sharma, Abdukodir Khakimov, Reem Ibrahim Alkanhel, Ammar Muthanna
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/11071302/
Description
Summary: Large language models (LLMs) have drawn considerable attention due to their exceptional comprehension and reasoning capabilities. The development of LLM methods is opening many prospects for automating tasks in the telecommunication industry. Following pre-training and fine-tuning, LLMs are able to carry out a variety of downstream activities in response to human instructions. This paper presents a hybrid document retrieval and ranking approach that integrates statistical, probabilistic, and neural network-based retrieval models to enhance information retrieval performance in the telecommunication domain. Traditional methods such as Term Frequency–Inverse Document Frequency (TF-IDF) and Best Match 25 (BM25) provide effective lexical matching, while deep learning-based models such as Sentence-BERT (SBERT) and Word to Vector (Word2Vec) improve semantic understanding by capturing contextual relationships between query and document representations. The proposed framework introduces a novel multi-stage ranking mechanism that strategically integrates term-frequency-based scoring with semantic similarity modelling using Sentence-BERT and Word2Vec. Unlike existing models, the method dynamically adjusts weights across lexical and semantic components based on query features, enabling real-time adaptation for telecom-specific QA tasks. Performance evaluation is conducted using BLEU Score, ROUGE metrics, Cosine Similarity, and Word2Vec Similarity, demonstrating that the hybrid model outperforms conventional retrieval baselines in both precision-oriented and recall-oriented tasks. The proposed model effectively aligns query intent with retrieved documents, increasing the efficiency of domain-specific search. The future scope includes dynamic embedding techniques to handle domain adaptation and attention-based ranking optimizations for long-form information retrieval.
This research enhances information retrieval by combining machine learning-based ranking with traditional methods, improving knowledge discovery and decision-making in telecommunications and technical document processing.
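The hybrid scoring idea described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not give the weighting rule, so `lexical_weight` (a query-length heuristic) and its thresholds are hypothetical, and `semantic_scores` is assumed to be supplied by an external embedding model such as SBERT (for example, query-document cosine similarities). Only the BM25 formula itself is standard.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with the standard BM25 formula."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def min_max(values):
    """Rescale scores to [0, 1] so lexical and semantic scores are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def lexical_weight(query, short_query_len=3):
    """Hypothetical heuristic: short keyword queries lean on lexical matching."""
    return 0.7 if len(query.split()) <= short_query_len else 0.3


def hybrid_rank(query, docs, semantic_scores):
    """Return document indices ordered by a query-dependent weighted score.

    semantic_scores is assumed to come from an embedding model
    (one similarity value per document); it is not computed here.
    """
    alpha = lexical_weight(query)
    lex = min_max(bm25_scores(query, docs))
    sem = min_max(semantic_scores)
    combined = [alpha * l + (1 - alpha) * s for l, s in zip(lex, sem)]
    return sorted(range(len(docs)), key=lambda i: combined[i], reverse=True)


if __name__ == "__main__":
    docs = ["5g handover latency", "billing plan", "handover failure in lte"]
    # Placeholder similarities for illustration only, not measured values.
    print(hybrid_rank("handover latency", docs, [0.9, 0.1, 0.5]))
```

Normalizing both score lists before mixing keeps the weight `alpha` meaningful, since raw BM25 scores and cosine similarities live on different scales.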
ISSN:2169-3536