Efficient reinforcement learning-based method for plagiarism detection boosted by a population-based algorithm for pretraining weights
Article
Xiong, Jiale, Yang, Jing, Yan, Lei, Awais, Muhammad, Khan, Abdullah Ayub, Alizadehsani, Roohallah and Acharya, U. Rajendra. 2024. "Efficient reinforcement learning-based method for plagiarism detection boosted by a population-based algorithm for pretraining weights." Expert Systems with Applications. 238 (Part E). https://doi.org/10.1016/j.eswa.2023.122088
Article Title | Efficient reinforcement learning-based method for plagiarism detection boosted by a population-based algorithm for pretraining weights |
---|---|
ERA Journal ID | 17852 |
Article Category | Article |
Authors | Xiong, Jiale, Yang, Jing, Yan, Lei, Awais, Muhammad, Khan, Abdullah Ayub, Alizadehsani, Roohallah and Acharya, U. Rajendra |
Journal Title | Expert Systems with Applications |
Journal Citation | 238 (Part E) |
Article Number | 122088 |
Number of Pages | 18 |
Year | 2024 |
Publisher | Elsevier |
Place of Publication | United Kingdom |
ISSN | 0957-4174; 1873-6793 |
Digital Object Identifier (DOI) | https://doi.org/10.1016/j.eswa.2023.122088 |
Web Address (URL) | https://www.sciencedirect.com/science/article/pii/S0957417423025903 |
Abstract | Plagiarism detection (PD) in natural language processing involves locating similar words in two distinct sources. The paper introduces a new approach to plagiarism detection utilizing bidirectional encoder representations from transformers (BERT)-generated embedding, an enhanced artificial bee colony (ABC) optimization algorithm for pre-training, and a training process based on reinforcement learning (RL). The BERT model can be incorporated into a subsequent task and meticulously refined to function as a model, enabling it to apprehend a variety of linguistic characteristics. Imbalanced classification is one of the fundamental obstacles to PD. To handle this predicament, we present a novel methodology utilizing RL, in which the problem is framed as a series of sequential decisions in which an agent receives a reward at each level for classifying a received instance. To address the disparity between classes, it is determined that the majority class will receive a lower reward than the minority class. We also focus on the training stage, which often utilizes gradient-based learning techniques like backpropagation (BP), leading to certain drawbacks such as sensitivity to initialization. In our proposed model, we utilize a mutual learning-based ABC (ML-ABC) approach that adjusts the food source with the most beneficial results for the candidate by considering a mutual learning factor that incorporates the initial weight. We evaluated the efficacy of our novel approach by contrasting its results with those of population-based techniques using three standard datasets, namely Stanford Natural Language Inference (SNLI), Microsoft Research Paraphrase Corpus (MSRP), and Semantic Evaluation Database (SemEval2014). Our model attained excellent results that outperformed state-of-the-art models. Optimal values for important parameters, including reward function are identified for the model based on experiments on the study dataset. Ablation studies that exclude the proposed ML-ABC and reinforcement learning from the model confirm the independent positive incremental impact of these components on model performance. |
Keywords | Artificial bee colony; Plagiarism detection; Unbalanced classification; Bidirectional encoder representations from transformers; Reinforcement learning |
Contains Sensitive Content | Does not contain sensitive content |
ANZSRC Field of Research 2020 | 420308. Health informatics and information systems |
Public Notes | Files associated with this item cannot be displayed due to copyright restrictions. |
Byline Affiliations | University of Malaya, Malaysia; Air University Islamabad, Pakistan; Benazir Bhutto Shaheed University Lyari, Pakistan; Deakin University; School of Mathematics, Physics and Computing |
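
The abstract states that the RL agent is rewarded for each classification and that the majority class receives a lower reward than the minority class. A minimal sketch of such a reward function follows. The abstract gives no reward values, so three things here are assumptions made for illustration, not the authors' formulation: the name `make_reward_fn`, the scaling of the majority-class reward by the minority-to-majority count ratio, and the symmetric negative reward for a misclassification.

```python
from collections import Counter


def make_reward_fn(labels, minority_reward=1.0):
    """Build a class-imbalance-aware reward function (illustrative only).

    Assumption: the majority-class reward magnitude is the minority reward
    scaled by the imbalance ratio (minority count / majority count).
    """
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    ratio = counts[minority] / max(counts.values())  # lies in (0, 1]

    def reward(true_label, predicted_label):
        # Instances of the minority class carry the full reward magnitude;
        # all other instances carry the smaller, ratio-scaled magnitude.
        if true_label == minority:
            magnitude = minority_reward
        else:
            magnitude = minority_reward * ratio
        # Assumption: a wrong prediction is penalized by the same magnitude.
        return magnitude if predicted_label == true_label else -magnitude

    return reward


# Hypothetical toy data: 8 non-plagiarized pairs (0), 2 plagiarized pairs (1).
labels = [0] * 8 + [1] * 2
reward = make_reward_fn(labels)
```

With this toy split, a decision on a plagiarized (minority) pair moves the agent's return by a magnitude of 1.0, while a decision on a non-plagiarized (majority) pair moves it by only 0.25, so errors on the rare class dominate the learning signal.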
Permalink: https://research.usq.edu.au/item/z5vz5/efficient-reinforcement-learning-based-method-for-plagiarism-detection-boosted-by-a-population-based-algorithm-for-pretraining-weights