Decentralised Moderation for Interoperable Social Networks: A Conversation-based Approach for Pleroma and the Fediverse

Agarwal, S., Nithyanand, R., Stringhini, G., & Zannettou, S.

Abstract

The recent development of decentralised and interoperable social networks (such as the “fediverse”) creates new challenges for content moderators. This is because millions of posts generated on one server can easily “spread” to another, even if the recipient server has very different moderation policies. An obvious solution would be to leverage moderation tools to automatically tag (and filter) posts that contravene moderation policies, e.g. related to toxic speech. Recent work has exploited the conversational context of a post to improve this automatic tagging, e.g. using the replies to a post to help classify if it contains toxic speech. This has shown particular potential in environments with large training sets that contain complete conversations. This, however, creates challenges in a decentralised context, as a single conversation may be fragmented across multiple servers. Thus, each server only has a partial view of an entire conversation because conversations are often federated across servers in a non-synchronized fashion. To address this, we propose a decentralised conversation-aware content moderation approach suitable for the fediverse. Our approach employs a graph deep learning model (GraphNLI) trained locally on each server. The model exploits local data to train a model that combines post and conversational information captured through random walks to detect toxicity. We evaluate our approach with data from Pleroma, a major decentralised and interoperable micro-blogging network containing 2 million conversations. Our model effectively detects toxicity on larger instances, exclusively trained using their local post information (0.8837 macro-F1). Yet, we show that this approach does not perform well on smaller instances that do not possess sufficient local training data. Thus, in cases where a server contains insufficient data, we strategically retrieve information (posts or model parameters) from other servers to reconstruct larger conversations and improve results. With this, we show that we can attain a macro-F1 of 0.8826. Our approach has considerable scope to improve moderation in decentralised and interoperable social networks such as Pleroma or Mastodon.
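The abstract reports its results as macro-F1 scores. For readers unfamiliar with the metric, the following is a minimal sketch of how macro-F1 is generally computed: the F1 score is calculated separately for each class and the per-class scores are averaged without weighting, so a rare class (such as toxic posts) counts as much as a common one. The function name `macro_f1` and the toy labels are hypothetical and chosen only for illustration; this is not the authors' evaluation code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative sketch only)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels: 1 = toxic, 0 = non-toxic
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
score = macro_f1(y_true, y_pred)  # toxic F1 = 0.5, non-toxic F1 = 0.75
```

In this toy case the toxic class scores 0.5 and the non-toxic class 0.75, giving a macro-F1 of 0.625 even though two thirds of the predictions are correct.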