Using AI moderation tools

Ben Balter recently announced a new tool he created: AI Community Moderator. This project, written by an AI coding assistant at Balter’s direction, takes moderation action in GitHub repositories. Using any AI model supported by GitHub, it automatically enforces a project’s code of conduct and contribution guidelines. Should you use it for your project?
For the sake of this post, I’m assuming that you’re open to using large language model tools in certain contexts. If you’re not, then there’s nothing to discuss.
Why not to use AI moderation tools
Moderating community interactions is a key part of leading an open source project. Good moderation creates a safe and welcoming community where people can do their best work. Bad moderation drives people away — either because toxic members are allowed to run roughshod over others or because good-faith participants are met with heavy-handed punishment. Moderation is one of the most important factors in creating a sustainable community — people have to want to be there.
Moderation is hard — and often thankless — work. It requires emotional energy in addition to time. I understand the appeal of offloading that work to AI. AI models don’t get emotionally invested. They can’t feel burnout. They’re available around the clock.
But they also don’t understand a community’s culture. They can’t build relationships with contributors. They’re not human. Communities are ultimately a human endeavor. Don’t take the humanity out of maintaining your community.
Why you might use AI moderation tools
Having said the above, there are cases where AI moderation tools can help. In a multilingual community, moderators may not have fluency in all of the languages people use. Anyone who has used AI translations knows they can sometimes be hilariously wrong, but they’re (usually) better than nothing.
AI tools are also ever-vigilant. They don’t need sleep or vacations, and they don’t get pulled away by a day job, family obligations, or hobbies. This is particularly valuable when a community spans many time zones and the moderation team does not.
Making a decision for your project
“AI” is a broad term, so you shouldn’t write off everything that has that label. Machine learning algorithms can be very helpful in detecting spam and other forms of antisocial behavior. The people I’ve heard express moral or ethical objections to large language models generally seem to be okay with machine learning models in appropriate contexts.
Using spam filters and other abuse detection tools to support human moderators is a good thing. It’s reasonable to allow them to take basic reversible actions, like hiding a post until a human has had the chance to review it. However, I don’t recommend using AI models to take more permanent actions or to interact with people who have potentially violated your project’s code of conduct. It’s hard, but you need to keep the humanity in your community.
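To make that recommendation concrete, here’s a minimal Python sketch of the “reversible actions only” pattern. The spam_score(), hide_post(), and notify_moderators() functions are hypothetical placeholders, not the API of Balter’s tool or any particular platform; the point is that the automation only takes the reversible step of hiding a post and leaves everything else to human moderators.

```python
# A minimal sketch of the "reversible actions only" pattern, assuming a
# hypothetical spam classifier. spam_score(), hide_post(), and
# notify_moderators() are placeholders, not any real tool's API.

HIDE_THRESHOLD = 0.9  # assumed confidence level; tune for your community


def spam_score(text: str) -> float:
    """Stand-in for a real machine learning spam classifier."""
    spammy_phrases = ("free money", "limited offer", "click here")
    hits = sum(phrase in text.lower() for phrase in spammy_phrases)
    return min(1.0, hits / len(spammy_phrases))


def hide_post(post_id: str) -> None:
    """Reversible action: hide the post so a human can review (and unhide) it."""
    print(f"Hiding post {post_id} pending human review")


def notify_moderators(post_id: str, score: float) -> None:
    """Leave the final call to people."""
    print(f"Post {post_id} flagged as likely spam (score {score:.2f})")


def moderate(post_id: str, text: str) -> None:
    score = spam_score(text)
    if score >= HIDE_THRESHOLD:
        hide_post(post_id)
        notify_moderators(post_id, score)
    # Below the threshold, do nothing: no automated replies, warnings, or bans.


if __name__ == "__main__":
    moderate("42", "FREE MONEY! Click here for this limited offer!")
```

In practice you would swap in a trained classifier and your platform’s hide or minimize API, but either way the human stays in the loop for anything permanent.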
This post’s featured photo by Mohamed Nohassi on Unsplash.