TemporalXAI-Det: Temporal-Aware Explainable Detection of Multi-Model AI-Generated Academic Text via Continual Learning and Cross-Lingual Transfer

Imeldawaty Gultom; Ratih Puspadini; Fauzi Erwis; Elyandri Prasiwiningrum; Ridwan; Fitra Yuda

doi:10.56313/jictas.v5i1.531

Authors

Imeldawaty Gultom, Information Systems, STMIK Kaputama, Medan
Ratih Puspadini, Information Systems, STMIK Kaputama, Medan
Fauzi Erwis, Computer Science, Universitas Rokania, Riau, Indonesia
Elyandri Prasiwiningrum, Computer Science, Universitas Rokania, Riau, Indonesia
Ridwan, Computer Science, Universitas Rokania, Riau, Indonesia
Fitra Yuda, Institut Teknologi Rokan Hilir

DOI:

https://doi.org/10.56313/jictas.v5i1.531

Keywords:

Continual Learning, Multi-Source AI Detection, Explainable AI, Cross-Lingual Transfer, Temporal Model Drift, Academic Integrity, XLM-RoBERTa, Catastrophic Forgetting

Abstract

The proliferation of heterogeneous generative AI systems, including GPT-4o, Claude 3 Opus, Gemini 1.5 Pro, Mistral, and LLaMA-3, has produced a multi-source academic text landscape whose detection poses challenges qualitatively beyond those addressed by existing binary or single-source detection paradigms. Contemporary detectors are compromised in two ways: first, by adversarial paraphrasing that disrupts surface-level distributional signatures; second, by temporal model drift, wherein new model generations evade detectors trained on earlier LLM families. This study introduces TemporalXAI-Det, a continual-learning explainable detection framework capable of (1) attributing academic text to one of five generative model families while also identifying human authorship, yielding a six-class taxonomy; (2) adapting to new LLM generations without catastrophic forgetting via Elastic Weight Consolidation (EWC) and experience replay; (3) transferring robustly across twelve academic languages through a Language-Adaptive Prefix Tuning (LAPT) mechanism applied to XLM-RoBERTa-XL; and (4) generating legally defensible per-instance explanations via Integrated Gradients (IG), SHAP, and counterfactual generation. A large-scale continual benchmark corpus (MTA-72K) comprising 72,000 samples across six source classes, four adversarial attack paradigms, and twelve languages is constructed and released. TemporalXAI-Det achieves a six-class macro F1-score of 0.941 on the clean test partition, 0.912 under combined adversarial conditions (performance degradation Δ = 2.9 pp), and a mean cross-lingual F1 of 0.887 across all twelve evaluated languages. Continual learning experiments show that catastrophic forgetting is reduced by 78.4% relative to standard fine-tuning when new LLM families are introduced. These results establish new state-of-the-art benchmarks for multi-source, temporally robust, and multilingual AI-text detection in academic integrity contexts.
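To make the EWC component named in the abstract more concrete, the following is a minimal sketch of the standard EWC quadratic penalty, (λ/2) Σ_i F_i (θ_i − θ*_i)², added to the loss on the newly introduced class. The abstract does not state which EWC variant, Fisher estimate, or λ the authors use, so the function names (`ewc_penalty`, `ewc_loss`), the diagonal Fisher approximation, and all numeric values here are illustrative assumptions, not the paper's implementation.

```python
from typing import Sequence


def ewc_penalty(
    params: Sequence[float],
    anchor_params: Sequence[float],
    fisher_diag: Sequence[float],
    lam: float,
) -> float:
    """Standard EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2.

    params        -- current weights while training on the new LLM family
    anchor_params -- weights saved after training on the earlier families
    fisher_diag   -- diagonal Fisher information estimated on the earlier task
    lam           -- penalty strength (hypothetical value; not given in the source)
    """
    if not (len(params) == len(anchor_params) == len(fisher_diag)):
        raise ValueError("params, anchor_params and fisher_diag must have equal length")
    total = 0.0
    for theta, theta_star, fisher in zip(params, anchor_params, fisher_diag):
        # Weights with high Fisher information are pulled back more strongly.
        total += fisher * (theta - theta_star) ** 2
    return 0.5 * lam * total


def ewc_loss(
    new_task_loss: float,
    params: Sequence[float],
    anchor_params: Sequence[float],
    fisher_diag: Sequence[float],
    lam: float,
) -> float:
    """Total objective: loss on the new task plus the EWC penalty."""
    return new_task_loss + ewc_penalty(params, anchor_params, fisher_diag, lam)


# Toy usage with made-up numbers: only the first weight has drifted.
penalty = ewc_penalty([1.0, 2.0], [0.0, 2.0], [4.0, 1.0], lam=2.0)
total = ewc_loss(0.5, [1.0, 2.0], [0.0, 2.0], [4.0, 1.0], lam=2.0)
```

In this toy case the drifted weight carries a Fisher value of 4.0, so it dominates the penalty, while the unchanged weight contributes nothing. In a real detector the same term would be computed over the model's parameter tensors and combined with replayed samples from earlier classes.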
