REPHRAIN Masterclass – Using Data Science to study jerks on the Web

On 23 September, Emiliano De Cristofaro hosted a masterclass about his experience using data science to research fringe communities online. A copy of the event’s abstract can be found below.

Over the past 20 years or so, the world has seen an explosion of data. While in the past controlled experiments, surveys, or compilation of high-level statistics allowed us to gain insights into the problems we explored, the Web has brought about a host of new challenges for researchers hoping to gain an understanding of modern socio-technical behavior. First, even discovering appropriate data sources is not a straightforward task. Next, although the Web enables us to collect highly detailed digital information, there are issues of availability and ephemerality: simply put, researchers have no control over what data a third-party platform collects and exposes, and more specifically, no control over how long that data will remain available. Third, the massive scale of the data and the multiple formats in which it is available require creative execution of analysis. Finally, modern socio-technical problems, while related to typical social problems, are fundamentally different, and in addition to posing a research challenge, can also cause disruption in researchers’ personal lives. In this talk, I will discuss how our work has overcome the above challenges. Using concrete examples from our research, I will delve into some of the unique datasets and analyses we have performed, focusing on emerging issues like hate speech, coordinated harassment campaigns, and deplatforming, as well as modeling the influence that Web communities have on the spread of disinformation, weaponized memes, etc. Finally, I will discuss how we can design proactive systems to anticipate and predict online abuse and, if time permits, how the “fringe” information ecosystem exposes researchers to attacks by the very actors they study.

A video recording of this masterclass can be found below and a copy of Emiliano’s slides can be found here.

Please direct any queries regarding this event to – thanks to Emiliano for hosting and all attendees for their contributions!