Hate Content on Social Media and Artificial Intelligence

Yi Liu, Pinar Yildirim, and Z. John Zhang, Marketing, The Wharton School

Abstract: Social media platforms now rely heavily on artificial intelligence (AI) technologies to automatically detect and remove user-generated content (UGC) believed to be harmful. Unfortunately, such detection technologies are less than perfect; as Facebook CEO Mark Zuckerberg put it, “removing hateful content from the site [is] difficult and beyond the capacity of artificial intelligence.” Algorithms used for content detection and removal naturally suffer from Type I errors (false positives, FP) and Type II errors (false negatives, FN). Stricter screening and removal algorithms prevent harmful content from spreading, but they may also take down legitimate content that is falsely tagged, with disparate effects across consumers and frustration for users. More lenient algorithms, on the other hand, reduce the problem of falsely tagging legitimate content but leave more harmful content online. Naturally, questions arise about how strict or lenient social media platforms should be in content screening and removal if they are to remain profitable. More generally, we ask questions about the use of AI content moderators: Should social media platforms use AI for content moderation rather than alternatives such as human moderation or group flagging? AI technology is also advancing: what is the impact of improvements in detection technology? Would social media platforms benefit from them? Would people see less harmful content online? We answer these questions using both experimental evidence and game-theoretic models.
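
As a purely illustrative sketch (not the model developed in the paper), the following Python snippet assumes a hypothetical moderation classifier that assigns each post a harm score and removes posts scoring above a threshold; varying the threshold shows the trade-off the abstract describes, with stricter screening lowering Type II errors at the cost of more Type I errors. The score distributions and threshold values are invented for illustration only.

```python
import random

random.seed(0)

# Hypothetical harm scores: harmful posts tend to score higher,
# but the two distributions overlap, so no threshold is error-free.
harmful_scores = [random.gauss(0.7, 0.15) for _ in range(1000)]
benign_scores = [random.gauss(0.4, 0.15) for _ in range(1000)]

def error_rates(threshold):
    """Remove any post whose score exceeds the threshold."""
    # Type I error (false positive): a legitimate post is removed.
    fp_rate = sum(s > threshold for s in benign_scores) / len(benign_scores)
    # Type II error (false negative): a harmful post is kept up.
    fn_rate = sum(s <= threshold for s in harmful_scores) / len(harmful_scores)
    return fp_rate, fn_rate

# A stricter (lower) threshold removes more harmful content (fewer FN)
# at the cost of removing more legitimate content (more FP).
for t in (0.4, 0.55, 0.7):
    fp, fn = error_rates(t)
    print(f"threshold={t:.2f}  Type I (FP)={fp:.2%}  Type II (FN)={fn:.2%}")
```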