Yangxinyu Xie, PhD Candidate, University of Pennsylvania
Abstract: As language model providers develop watermarking techniques to identify AI-generated content, important questions arise about their potential impact on non-native English speakers in academic settings. This study examines how proposed watermarking systems might create "digital accents"—systematic biases that could flag legitimate writing assistance used by non-native English speakers as AI-generated content. Through analysis of 1,500 TOEFL essays and three distinct levels of AI assistance, we demonstrate how current watermarking techniques could disproportionately impact international students who use AI tools for language learning and writing improvement. We propose a novel detection framework that reduces false positive rates by integrating conformal outlier detection techniques from statistics while maintaining detection accuracy, providing a technical foundation for implementing these systems more equitably.
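The core statistical idea named in the abstract, conformal outlier detection, can be sketched as follows. The scoring function and all numbers below are illustrative assumptions, not the paper's actual method: a calibration set of watermark scores from known human-written texts is used to convert each new text's score into a conformal p-value, and a text is flagged only when that p-value falls below a chosen level α. Under exchangeability, this bounds the false positive rate on human writing at α, regardless of the score distribution.

```python
import numpy as np

def conformal_pvalue(test_score, calib_scores):
    """Conformal p-value: (1 + rank of test score among calibration scores) / (n + 1).

    Large watermark scores are treated as evidence of AI generation, so the
    p-value counts how many calibration (human) scores are at least as large.
    """
    n = len(calib_scores)
    return (1 + np.sum(np.asarray(calib_scores) >= test_score)) / (n + 1)

rng = np.random.default_rng(0)

# Hypothetical calibration set: watermark scores of known human-written essays.
calib = rng.normal(0.0, 1.0, size=500)

# Flag a text as AI-generated only if its conformal p-value < alpha.
# Under exchangeability this caps the false positive rate at alpha.
alpha = 0.05
human_test_scores = rng.normal(0.0, 1.0, size=1000)  # more human-like texts
flags = [conformal_pvalue(s, calib) < alpha for s in human_test_scores]
empirical_fpr = float(np.mean(flags))
```

Because the guarantee is distribution-free, the same calibration step can be applied separately to subpopulations (e.g., essays by non-native English speakers), which is one way a framework like the one proposed here could equalize false positive rates across groups.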

