REPHRAIN Publications

A list of the REPHRAIN Centre publications can be found below – please check back for regular updates.

July 2022

Taking Situation-Based Privacy Decisions: Privacy Assistants Working with Humans

Prepared by Nadin Kokciyan & Pinar Yolum

Abstract: Privacy on the Web is typically managed by giving consent to individual Websites for various aspects of data usage. This paradigm requires too much human effort and thus is impractical for Internet of Things (IoT) applications where humans interact with many new devices on a daily basis. Ideally, software privacy assistants can help by making privacy decisions in different situations on behalf of the users. To realize this, we propose an agent-based model for a privacy assistant. The model identifies the contexts that a situation implies and computes the trustworthiness of these contexts. Contrary to traditional trust models that capture trust in an entity by observing a large number of interactions, our proposed model can assess the trustworthiness even if the user has not interacted with the particular device before. Moreover, our model can decide which situations are inherently ambiguous and thus can request the human to make the decision. We evaluate various aspects of the model using a real-life data set and report adjustments that are needed to serve different types of users well.

Paper available for download here.

June 2022

Safeguarding Privacy in the Age of Everyday XR

Prepared by Pejman Saeghe, Mark McGill & Mohamed Khamis

Abstract: The commercialisation of extended reality (XR) devices provides new capabilities for their users, such as the ability to continuously capture their surroundings. This introduces novel privacy risks and challenges for XR users and bystanders alike. In this position paper, we use an established taxonomy of privacy to highlight its limitations when dealing with everyday XR. Our aim is to highlight a need for an update in our collective understanding of privacy risks imposed by everyday XR technology.

Paper available for download here.

MuMiN: A Large Scale Multilingual Multimodal Fact-Checked Misinformation Social Network Dataset

Prepared by Dan S. Nielsen & Ryan McConville

Abstract: Misinformation is becoming increasingly prevalent on social media and in news articles. It has become so widespread that we require algorithmic assistance utilising machine learning to detect such content. Training these machine learning models requires datasets of sufficient scale, diversity and quality. However, datasets in the field of automatic misinformation detection are predominantly monolingual, include a limited number of modalities and are not of sufficient scale and quality. Addressing this, we develop a data collection and linking system (MuMiN-trawl), to build a public misinformation graph dataset (MuMiN), containing rich social media data (tweets, replies, users, images, articles, hashtags) spanning 21 million tweets belonging to 26 thousand Twitter threads, each of which has been semantically linked to 13 thousand fact-checked claims across dozens of topics, events and domains, in 41 different languages, spanning more than a decade. The dataset is made available as a heterogeneous graph via a Python package (mumin). We provide baseline results for two node classification tasks related to the veracity of a claim involving social media, and demonstrate that these are challenging tasks, with the highest macro-average F1-score being 62.55% and 61.45% for the two tasks, respectively. The MuMiN ecosystem is available at, including the data, documentation, tutorials and leaderboards.

Paper available for download here.

January 2022

Multi-party Updatable Delegated Private Set Intersection

Prepared by Aydin Abadi, Changyu Dong, Steven Murdoch and Sotirios Terzis

Abstract: With the growth of cloud computing, the need arises for Private Set Intersection protocols (PSI) that can let parties outsource the storage of their private sets and securely delegate PSI computation to a cloud server. The existing delegated PSIs have two major limitations; namely, they cannot support (1) efficient updates on outsourced sets and (2) efficient PSI among multiple clients. This paper presents “Feather”, the first lightweight delegated PSI that addresses both limitations simultaneously. It lets clients independently prepare and upload their private sets to the cloud once, then delegate the computation an unlimited number of times. We implemented Feather and compared its costs with the state-of-the-art delegated PSIs. The evaluation shows that Feather is more efficient computationally, in both update and PSI computation phases.

Paper available for download here.

December 2021

A Consumer Law Perspective on the Commercialization of Data

Prepared by Mateja Durovic and Franciszek Lech

Abstract: Commercialization of consumers’ personal data in the digital economy poses serious challenges, both conceptual and practical, to the traditional approach of European Union (EU) Consumer Law. This article argues that mass-spread, automated, algorithmic decision-making casts doubt on the foundational paradigm of EU consumer law: consent and autonomy. Moreover, it poses threats of discrimination and undermining of consumer privacy. It is argued that the recent legislative reaction by the EU Commission, in the form of the ‘New Deal for Consumers’, was a step in the right direction, but fell short due to its continued reliance on consent and autonomy and its failure to adequately protect consumers from indirect discrimination. It is posited that a focus on creating a contracting landscape where the consumer may be properly informed in material respects is required, which in turn necessitates blending the approaches of competition, consumer protection and data protection laws.

Paper available for download here.

October 2021

Building a Privacy Testbed: Use Cases and Design Considerations

Prepared by Joseph Gardiner, Partha Das Chowdhury, Jacob Halsey, Mohammad Tahaei, Tariq Elahi and Awais Rashid

Abstract: Mobile application (app) developers are often ill-equipped to understand the privacy implications of their products and services, especially with the common practice of using third-party libraries to provide critical functionality. To add to the complexity, most mobile applications interact with the “cloud”: not only the platform provider’s ecosystem (such as Apple or Google) but also third-party servers (as a consequence of library use). This presents a hazy view of the privacy impact for a particular app. Therefore, we take a significant step to address this challenge and propose a testbed with the ability to systematically evaluate and understand the privacy behavior of client-server applications in a network environment across a large number of hosts. We reflect on our experiences of successfully deploying two mass-market applications on the initial versions of our proposed testbed. Standardization across cloud implementations and exposed endpoints of closed-source binaries are key for transparent evaluation of privacy features.

Paper available for download here.

September 2021

A Privacy Testbed for IT Professionals: Use Cases and Design Considerations

Prepared by Joseph Gardiner, Mohammad Tahaei, Jacob Halsey, Tariq Elahi and Awais Rashid

Abstract: We propose a testbed to assist IT professionals in evaluating privacy properties of software systems. The goal of the testbed, currently under construction, is to help IT professionals systematically evaluate and understand the privacy behaviour of applications. We first provide three use cases to support developers and privacy engineers and then describe key design considerations for the testbed.

Paper available for download here.

August 2021

Polynomial Representation Is Tricky: Maliciously Secure Private Set Intersections Revisited

Prepared by Aydin Abadi, Steven Murdoch and Thomas Zacharias

Abstract: Private Set Intersection protocols (PSIs) allow parties to compute the intersection of their private sets, such that nothing about the sets’ elements beyond the intersection is revealed. PSIs have a variety of applications, primarily in efficiently supporting data sharing in a privacy-preserving manner. At Eurocrypt 2019, Ghosh and Nilges proposed three efficient PSIs based on the polynomial representation of sets and proved their security against active adversaries. In this work, we show that these three PSIs are susceptible to several serious attacks. The attacks let an adversary (1) learn the correct intersection while making its victim believe that the intersection is empty, (2) learn a certain element of its victim’s set beyond the intersection, and (3) delete multiple elements of its victim’s input set. We explain why the proofs did not identify these attacks and propose a set of mitigations.

Paper available for download here.

March 2021

Towards Data Scientific Investigations: A Comprehensive Data Science Framework and Case Study for Investigating Organized Crime and Serving the Public Interest

Prepared by Erik van de Sandt, Arthur van Bunningen, Jarmo van Lenthe and John Fokker

Abstract: Big Data problems thwart the effectiveness of today’s organized crime investigations. A frequently proposed solution is the introduction of ‘smart’ data science technologies to process raw data into factual evidence. This transition to – what we call – data scientific investigations is nothing less than a paradigm shift for law enforcement agencies, and cannot be done alone. Yet a common language for data scientific investigations is so far missing. This white paper therefore presents guiding principles and best practices for data scientific investigations of organized crime, developed and put into practice by operational experts over several years, while connecting to existing law enforcement and industry standards. The associated framework is called CSAE (pronounced as ‘see-say’): a comprehensive framework that consists of a business process, methodology, policy agenda and public interest philosophy for data scientific operations.

Paper available for download here.