Abstract
Memes have become quite common in day-to-day communications on social media platforms.
They often appear amusing, evocative, and
attractive to audiences. However, some memes
containing malicious content can be harmful to
targeted groups. In this paper, we study misogynous meme detection, a shared task in SemEval-2022:
Multimedia Automatic Misogyny Identification (MAMI). The challenge of misogynous
meme detection is to co-represent multi-modal
features. To tackle this challenge, we propose a Multi-modal Multi-task Variational AutoEncoder (MMVAE) to learn an effective co-representation of visual and textual features in
the latent space. Our goal is to automatically
determine whether a meme contains misogynous information and then identify its fine-grained category. Our model achieves F1 scores of 0.723
on the MAMI sub-task A and 0.634 on sub-task
B. We carry out comprehensive experiments on
our model’s architecture and show that our approach significantly outperforms several strong
uni-modal and multi-modal approaches.