StyloAI: Distinguishing AI-Generated Content with Stylometric Analysis

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

The emergence of large language models (LLMs) capable of generating realistic texts and images has sparked ethical concerns across various sectors. In response, researchers in academia and industry are actively exploring methods to distinguish AI-generated content from human-authored material. However, a crucial question remains: What are the unique characteristics of AI-generated text? Addressing this gap, this study proposes StyloAI, a data-driven model that uses 31 stylometric features to identify AI-generated texts by applying a Random Forest classifier on two multi-domain datasets. StyloAI achieves accuracy rates of 81% and 98% on the test set of the AuTextification dataset and the Education dataset, respectively. This approach surpasses the performance of existing state-of-the-art models and provides valuable insights into the differences between AI-generated and human-authored texts.
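As a rough illustration of what a stylometric feature vector can look like, the sketch below computes five simple text statistics in plain Python. The function name `stylometric_features` and the five features are assumptions chosen for illustration; they are not the 31 features used by StyloAI, which the abstract does not enumerate.

```python
import re


def stylometric_features(text: str) -> dict[str, float]:
    """Compute a few illustrative stylometric features for one text.

    These features are hypothetical examples, not the StyloAI feature set.
    """
    # Split on runs of sentence-ending punctuation; drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercased alphabetic tokens (apostrophes kept for contractions).
    words = re.findall(r"[A-Za-z']+", text.lower())
    n_words = len(words) or 1  # avoid division by zero on empty input
    n_sentences = len(sentences) or 1

    return {
        "word_count": float(len(words)),
        # Lexical diversity: unique words divided by total words.
        "type_token_ratio": len(set(words)) / n_words,
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "avg_sentence_length": len(words) / n_sentences,
        # Punctuation marks per word.
        "punctuation_ratio": len(re.findall(r"[,;:.!?]", text)) / n_words,
    }


if __name__ == "__main__":
    sample = "The cat sat. The cat ran!"
    for name, value in stylometric_features(sample).items():
        print(f"{name}: {value:.3f}")
```

Vectors of this kind, one per document, could then be passed to a tree-ensemble classifier such as scikit-learn's `RandomForestClassifier`; that pairing is an assumption about tooling, since the abstract names only the classifier type.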
Original language: English
Title of host publication: Proceedings of the 25th International Conference on Artificial Intelligence in Education (AIED 2024)
Publication status: Published - 2 Jul 2024
Event: 25th International Conference on Artificial Intelligence in Education - Recife, Brazil
Duration: 8 Jul 2024 to 12 Jul 2024
https://aied2024.cesar.school/

Conference

Conference: 25th International Conference on Artificial Intelligence in Education
Abbreviated title: AIED 2024
Country/Territory: Brazil
City: Recife
Period: 8/07/24 to 12/07/24
