Detecting the Detectors: A Systematic Review of AI-Generated Text Detection Tools and Their Reliability
Keywords:
Artificial Intelligence, Generative AI, AI-Generated Text Detection, AI Ethics
Abstract
This study systematically investigates the reliability of AI-generated text detection tools through a PRISMA-based systematic review combined with bibliometric analysis. Using the SCOPUS database, 949,127 records related to Artificial Intelligence (AI), Generative AI, AI-Generated Text Detection, and AI Ethics were identified and filtered through identification, screening, and eligibility stages, yielding 73 qualified studies for synthesis. The PRISMA framework ensured methodological transparency, while bibliometric analysis using VOSviewer and the R Bibliometrix package revealed publication trends, key contributors, and research networks. The analysis identified Artificial Intelligence, ChatGPT, and Generative AI as core topics strongly associated with research integrity, ethics, and detection reliability. Findings indicate that BERT-based and graph neural network models achieve high accuracy in distinguishing AI-generated text but remain inconsistent across linguistic and contextual variations. Bibliometric mapping uncovered five major research clusters—AI Ethics, Academic Integrity, Language Models, Detection Methods, and Education—reflecting the interdisciplinary nature of this domain. The study emphasizes that technical precision alone is insufficient; ethical considerations such as transparency, fairness, and accountability are crucial for maintaining reliability and trust in AI detection tools. In conclusion, developing trustworthy AI detectors requires a balanced integration of technical validation and ethical governance. The results highlight the need for continuous refinement of detection methodologies and stronger alignment with ethical principles to enhance trust, transparency, and research integrity in the era of generative AI.
Published: 2026-04-01
Section: Articles