TUNING OUT HATE SPEECH ON REDDIT: AUTOMATING MODERATION AND DETECTING TOXICITY IN THE MANOSPHERE
Over the past two years, social media platforms have struggled to moderate at scale. At the same time, they have come under fire for failing to mitigate the risks of perceived ‘toxic’ content or behaviour on their platforms. In an effort to cope better with content moderation and to combat hate speech, ‘dangerous organisations’ and other bad actors present on platforms, discussion has turned to the role that automated machine-learning (ML) tools might play. This paper contributes to thinking about the role and suitability of ML for content moderation on community platforms such as Reddit and Facebook. In particular, it looks at how ML tools operate (or fail to operate) effectively at the intersection between online sentiment within communities and social and platform expectations of acceptable discourse. Through an examination of the r/MGTOW subreddit, we problematise current understandings of the notion of ‘toxicity’ as applied to cultural or social sub-communities online and explain how this interacts with Google’s Perspective tool.