DTL Travel Grants 2015

The following proposals ranked in the top third of all submissions and were offered a platform to share their ideas and work with other members of the DTL community. They were offered a presentation slot at the forthcoming DTL workshop in November 2015 (location and details will be announced in due course), as well as a travel grant to attend it.

A Deep learning platform for the reverse-engineering of Behavioral Targeting procedures in online ad networks (DeepBET)

Sotirios Chatzis (Cyprus University of Technology), Aristodemos Paphitis (Cyprus University of Technology)

Online ad networks are a characteristic example of online services that massively leverage user data for the purposes of behavioral targeting. A significant problem of these technologies is their lack of transparency. For this reason, the problem of reverse-engineering the behavioral targeting mechanisms of ad networks has recently attracted significant research interest. Existing approaches query ad networks using artificial user profiles, each of which pertains to a single user category. Nevertheless, well-designed ad services may not rely on such simple user categorizations: a user assigned to multiple categories may be presented with a set of ads quite different from the union of the sets of ads pertaining to each of their individual interests. Even more importantly, user interests may change or vary over time, yet none of the existing reverse-engineering systems are capable of determining whether and how ad network targeting mechanisms adapt to such temporal dynamics.

The goal of this proposal is to develop a platform addressing these inadequacies by leveraging advanced machine learning methods. The proposed platform is capable of: (i) intelligently creating a diverse set of (interest-based) user profiles with which to query ad networks, ensuring that the (artificial) user profiles used to query the analyzed ad networks correspond to as diverse a set of combinations of user interests (characteristics) as possible; (ii) obviating the need to rely on some publicly available tree of categories/user interests, which can be restrictive to the analysis or even misleading; instead, our platform reliably produces a tree-like, content-based grouping (clustering) of websites into interest groups in a completely unsupervised manner; (iii) performing inference of the correlations between user characteristics and ad network outputs in a way that allows for large-scale generalization; and (iv) determining whether and how temporal dynamics affect these correlations, and over what temporal horizons.
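
A minimal, illustrative sketch of point (ii), not the authors' system: crawled website text can be grouped into interest clusters without a predefined category taxonomy, here using TF-IDF features and hierarchical (agglomerative) clustering from scikit-learn; the function name and parameters are assumptions.

```python
# Illustrative only: unsupervised, content-based grouping of websites into
# interest clusters, without relying on a public category taxonomy.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import AgglomerativeClustering

def cluster_sites(site_texts, n_groups=20):
    """site_texts: dict mapping site URL -> crawled page text."""
    urls = list(site_texts)
    tfidf = TfidfVectorizer(max_features=5000, stop_words="english")
    X = tfidf.fit_transform([site_texts[u] for u in urls])
    # Agglomerative clustering merges clusters bottom-up, yielding the
    # tree-like grouping described above.
    labels = AgglomerativeClustering(n_clusters=n_groups).fit_predict(X.toarray())
    return {url: int(label) for url, label in zip(urls, labels)}
```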

Alibi: Turning User Tracking Into a User Benefit

Marcel Flores, Andrew Kahn, Marc Warrior, Aleksandar Kuzmanovic (PI) (Northwestern University)

We propose Alibi, a system that enables users to take direct advantage of the work online trackers do to record and interpret their behavior. The key idea is to use the readily available personalized content, generated by online trackers in real-time, as a means to verify an online user in a seamless and privacy-preserving manner. We propose to utilize such tracker-generated personalized content, submitted directly by the user, to construct a multi-tracker user-vector representation and use it in various online verification scenarios. The main research objectives of this project are to explore the fundamental properties of such user-vector representations, i.e., their construction, uniqueness, persistency, resilience, utility in online verification, etc. The key goal of this project is to design, implement, and evaluate the Alibi service, and make it publicly available.
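
As a rough illustration of the idea (all names, feature choices, and thresholds below are hypothetical, not the Alibi design): a user-vector could be a per-tracker bag of features derived from the personalized content each tracker returns, and verification could compare the submitted vectors against stored reference vectors.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse feature dicts."""
    dot = sum(a[k] * b.get(k, 0.0) for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def verify(submitted_vectors, reference_vectors, threshold=0.8):
    """submitted/reference: dict tracker_name -> feature dict
    (e.g. weights over ad categories seen in tracker-personalized content).
    Accept if the per-tracker similarities agree with the stored profile on average."""
    common = set(submitted_vectors) & set(reference_vectors)
    if not common:
        return False
    sims = [cosine(submitted_vectors[t], reference_vectors[t]) for t in common]
    return sum(sims) / len(sims) >= threshold
```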

Towards Making Systems Forget

Yinzhi Cao (Lehigh University and Columbia University)

Today’s systems produce a rapidly exploding amount of data, and the data further derives more data, forming a complex data propagation network that we call the data’s lineage. There are many reasons that users want systems to forget certain data including its lineage. From a privacy perspective, users who become concerned with new privacy risks of a system often want the system to forget their data and lineage. From a security perspective, if an attacker pollutes an anomaly detector by injecting manually crafted data into the training data set, the detector must forget the injected data to regain security. From a usability perspective, a user can remove noise and incorrect entries so that a recommendation engine gives useful recommendations. Therefore, we envision forgetting systems, capable of forgetting certain data and their lineages, completely and quickly.

In this proposal, we focus on making learning systems forget; we call this process machine unlearning, or simply unlearning. We present a general, efficient unlearning approach that transforms the learning algorithms used by a system into a summation form. To forget a training data sample, our approach simply updates a small number of summations, which is asymptotically faster than retraining from scratch. Our approach is general because the summation form derives from statistical query learning, in which many machine learning algorithms can be implemented. Our approach also applies to all stages of machine learning, including feature selection and modeling.
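
A minimal sketch of the summation idea, assuming a toy count-based naive Bayes classifier (illustrative only, not the proposal's implementation): because every model statistic is a sum over training samples, forgetting a sample simply subtracts its contribution from those sums, with no retraining.

```python
from collections import defaultdict

class SummationNB:
    """Toy summation-form naive Bayes over discrete features."""

    def __init__(self):
        self.class_counts = defaultdict(int)     # sum of 1 per sample of class c
        self.feature_counts = defaultdict(int)   # sum of 1 per (class, feature, value)
        self.n = 0

    def learn(self, x, y):
        self.n += 1
        self.class_counts[y] += 1
        for f, v in x.items():
            self.feature_counts[(y, f, v)] += 1

    def unlearn(self, x, y):
        # Remove exactly the contribution that learn() added for this sample.
        self.n -= 1
        self.class_counts[y] -= 1
        for f, v in x.items():
            self.feature_counts[(y, f, v)] -= 1

    def predict(self, x):
        if self.n == 0:
            return None
        best, best_score = None, float("-inf")
        for c, cc in self.class_counts.items():
            if cc == 0:
                continue
            score = cc / self.n
            for f, v in x.items():
                score *= (self.feature_counts[(c, f, v)] + 1) / (cc + 2)  # Laplace smoothing
            if score > best_score:
                best, best_score = c, score
        return best
```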

Bringing Fairness and Transparency to Mobile On-Demand Services

Christo Wilson (Northeastern University), Dave Choffnes (Northeastern University), Alan Mislove (Northeastern University)

In this project, we aim to bring greater transparency to algorithmic pricing implemented by mobile, on-demand services. Algorithmic pricing was pioneered in this space by Uber in the form of “surge pricing”. While we applaud mobile, on-demand services for disrupting incumbents and stimulating moribund sectors of the economy, we also believe that the data and algorithms leveraged by these services should be transparent. Fundamentally, consumers and providers cannot make informed choices when marketplaces are opaque. Furthermore, black-box services are vulnerable to exploitation once their algorithms are understood, which creates opportunities for customers and providers to manipulate these services in ways that are not possible in transparent markets.

Providing Users With Feedback on Search Personalised Learning

Douglas Leith (Trinity College Dublin), Alessandro Checco (Trinity College Dublin)

Users are currently given only very limited feedback from search providers as to what learning and inference of personal preferences is taking place. When a search engine infers that a particular advertising category is likely to be of interest to a user, and so more likely to generate click through and sales, it will tend to use this information when selecting which adverts to display. This can be used to detect search engine learning via analysis of changes in the choice of displayed adverts and to inform the user of this learning. In this project we will develop a browser plugin that provides such feedback, essentially by empowering the user via the kind of data analytic techniques used by the search engines themselves.
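
A minimal sketch of the underlying signal (names are illustrative, not the plugin's design): collect the ads shown for the same query to the user's profile and to a fresh control profile, and treat low overlap as evidence that the engine has learned something about the user.

```python
def jaccard(a, b):
    """Jaccard overlap between two collections of ad identifiers."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

def personalization_score(user_ads, control_ads):
    """user_ads / control_ads: lists of ad identifiers (or advertiser domains)
    collected for the same query from the user's session and a clean profile.
    Returns 0 (identical ad sets) .. 1 (completely different), a rough signal
    of search-engine learning about the user."""
    return 1.0 - jaccard(user_ads, control_ads)
```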

Zero-Knowledge Transparency: Safe Audit Tools for End Users

Maksym Gabielkov (INRIA, Columbia University), Larissa Navarro Passos de Araujo (Columbia University), Max Tucker Da Silva (Columbia University), Augustin Chaintreau (Columbia University)

Privacy-aware ecosystem for data sharing

Anna Monreale (Department of Computer Science, University of Pisa)

Human and social data are an important source of knowledge, useful for understanding human behaviour and for developing a wide range of user services. Unfortunately, this kind of data is sensitive, because people's activities described by these data may allow the re-identification of individuals in a de-identified database and can thus potentially reveal intimate personal traits, such as religious or sexual preferences. Therefore, before sharing those data, Data Providers must apply some form of anonymization to lower the privacy risks, but they must also be aware of, and able to control, the resulting data quality, since these two factors are often a trade-off. This project proposes a framework to support the Data Provider in the privacy risk assessment of data to be shared. The framework measures both the empirical (not theoretical) privacy risk associated with the users represented in the data and the data quality that can be guaranteed when only users not at risk are retained. It provides a mechanism for exploring a repertoire of possible data transformations, with the aim of selecting one specific transformation that yields an adequate trade-off between data quality and privacy risk. The project will focus on mobility data, studying the practical effectiveness of the framework over forms of mobility data required by specific knowledge-based services.
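
As a hedged illustration of empirical privacy risk on mobility data (a simplification, not the project's framework): a user's risk can be estimated from how many other users share the same combination of visited locations, and users whose combination is too rare are flagged as at risk.

```python
from collections import Counter

def empirical_risk(trajectories, k=5):
    """trajectories: dict user_id -> iterable of visited location ids.

    A user's risk is 1 / (number of users sharing the same location set);
    users with risk above 1/k are flagged as at risk (k-anonymity style).
    """
    signatures = {u: frozenset(locs) for u, locs in trajectories.items()}
    freq = Counter(signatures.values())
    risk = {u: 1.0 / freq[sig] for u, sig in signatures.items()}
    at_risk = {u for u, r in risk.items() if r > 1.0 / k}
    return risk, at_risk
```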

Exposing and Overcoming Privacy Leakage in Mobile Apps using Dynamic Profiles

Z. Morley Mao (University of Michigan)

In this proposal, we focus on designing support to detect the leakage of personal data in the mobile app ecosystem through a novel approach: using dynamically generated user application profiles to track how sensitive data influence the content presented to users, and to discover violations of user privacy policies. For the former, we analyze how various types of content personalization, based on information such as behavior, context, location, or social graph, can lead to potentially unwanted bias in the content. For the latter, we take a semantics-based approach to translate user privacy preferences into enforceable, syntax-based mechanisms. By leveraging the dynamically generated profiles that characterize the expected content customization, users can select a type of profile that satisfies their privacy policy, or obtain data or access the online service through a collection of profiles. In summary, our work consists of offline approaches for building knowledge of content customization from the relevant profiles and characterizing the privacy-related behavior of mobile apps, as well as run-time enforcement support to satisfy user-expressed privacy policies.
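
A very small sketch of the profile-selection side, under assumed data structures (a profile is represented by the set of data sources it is built from, a policy by the sources the user allows); this is a crude simplification, not the proposal's semantics-based mechanism.

```python
# Hypothetical sketch: check whether a dynamically generated profile satisfies
# a user's privacy policy before it is used to access an online service.
def profile_satisfies_policy(profile_sources, user_policy):
    """profile_sources: set of data sources a generated profile draws on,
    e.g. {"location", "social_graph"}. user_policy: sources the user permits."""
    return set(profile_sources) <= set(user_policy)

def select_profiles(candidate_profiles, user_policy):
    """candidate_profiles: dict profile_name -> set of data sources used.
    Returns the profiles usable without violating the user's policy."""
    return [name for name, sources in candidate_profiles.items()
            if profile_satisfies_policy(sources, user_policy)]
```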

Detecting Accidental and Intentional PII Leakage from Modern Web Applications

Nick Nikiforakis (Stony Brook University)

The rise of extremely popular online services offered at no fiscal cost to users has given rise to a rich online ecosystem of third-party trackers and online advertisers. While the majority of tracking involves the use of cookies and other technologies that do not directly expose a user's personally identifiable information (PII), past research has shown that PII leakage is all too common. Either due to poor programming practices (e.g., PII-carrying, GET-submitting forms) or due to intentional information leakage, a user's PII often finds its way into the hands of third parties. In the cases where a user's PII leaks to third parties that already use cookies and other tracking technologies, the trackers now have the potential to identify the user, by name, as she browses the web.

Despite the magnitude and severity of the PII-leakage problem, there is currently a dearth of usable privacy-enhancing technologies that detect and prevent PII leakage. To restore users' control over their own personally identifiable information, we propose to design, implement, and evaluate LeakSentry, a browser extension that can identify leakage as it is happening and give users contextual information about the leakage, as well as the power to allow or block it. In addition to LeakSentry's stand-alone mode, users will be able to opt in to a crowd-wisdom program where they can learn from each other's choices. Furthermore, LeakSentry will be able to report the location of PII leakage, enabling us to create a PII-leaking page observatory, which can both apply pressure to the websites caught red-handed and steer other users away from them.
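
A hedged sketch of the detection step (function names and the choice of encodings are assumptions, not LeakSentry's actual design): flag outgoing third-party requests whose URL carries one of the user's known PII values, in plain or hashed form.

```python
import hashlib
from urllib.parse import urlparse, unquote

def pii_variants(value):
    """A PII value plus a few encodings under which it commonly leaks."""
    v = value.lower()
    return {v, hashlib.md5(v.encode()).hexdigest(), hashlib.sha1(v.encode()).hexdigest()}

def detect_pii_leaks(request_url, first_party, pii_values):
    """pii_values: dict like {"email": "alice@example.com"}.
    Returns (pii_name, third_party_host) pairs found in the request URL."""
    parsed = urlparse(request_url)
    host = parsed.hostname or ""
    if host.endswith(first_party):
        return []  # same-site request, not a third-party leak
    haystack = unquote(parsed.path + "?" + parsed.query).lower()
    return [(name, host)
            for name, value in pii_values.items()
            if any(variant in haystack for variant in pii_variants(value))]
```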

Towards Transparent Privacy Practices: Facilitating Comparisons of Privacy Policies

Ali Sunyaev (Department of Information Systems, University of Cologne), Tobias Dehling (Department of Information Systems, University of Cologne)

A central challenge of privacy policy design is the wicked nature of privacy policies: in essence, privacy policies are past responses of providers to future information requests of users regarding the privacy practices of online services. As a result, today's privacy policies feature a large variety of contents and designs. This impedes data transparency, in particular with respect to comparisons of privacy practices between providers. The main idea of this research proposal is to leverage tagging and crowdsourcing to facilitate comparisons of privacy policies in a provider-independent web application. Our research is relevant for data transparency research because it aims to improve the most prevalent tool for shedding light on the use of personal data by online services, that is, privacy policies. Redeeming the benefits offered by online environments while avoiding the perils is challenging; this research proposal makes the task easier by improving the transparency of privacy practices. There have been numerous efforts to improve the utility of privacy policies that focus on reshaping the privacy policies offered by providers, for instance by changing the layout or enhancing visualization. The main innovation pursued in this research proposal is that we do not focus on getting providers to publish better privacy policies, but instead on enabling users to make the best of the privacy policies providers confront them with.

Improving the Comprehension of Browser Privacy Modes

Sascha Fahl (DCSec, Leibniz Universität Hannover), Yasemin Acar (DCSec, Leibniz Universität Hannover), Matthew Smith (Rheinische Friedrich-Wilhelms-Universität Bonn)

Online privacy is an important, hotly researched and demanded topic that has gained even more relevance recently. However, existing mechanisms that protect users' privacy online, such as Tor and VPN connections, are complex, introduce performance issues and, in the case of the latter, add costs. Their widespread adoption by the general public is therefore unlikely. Browser vendors have recently established so-called private browsing modes that are largely misunderstood by users: they over-rate the level of protection offered, which can lead to insecure behaviour. We aim to study user misconceptions, enhance users' comprehension, and scientifically evaluate the usability and applicability of more privacy-enhancing services such as Tor.

PRIVA-SEE: PRIVacy Aware visual SEnsitivity Evaluator

Bruno Lepri (Fondazione Bruno Kessler), Elisa Ricci (Fondazione Bruno Kessler), Lorenzo Porzi (Fondazione Bruno Kessler)

Digitally sharing our lives with others is a captivating and often addictive activity. Nowadays, 1.8 billion photos are shared daily on social media. These images hold a wealth of personal information, ripe for exploitation by tailored advertising business models, but placed in the wrong hands this data can lead to disaster. In this project, we want to see how increasing a person's awareness of potential personal data sensitivity issues influences their decisions about what and how to share, and moreover, how valuable they perceive their personal data to be. To achieve this ambitious goal we aim to:

(i) develop a novel methodology, applied within a mobile app, to inform users about the potential sensitivity of their images. Sensitivity will be modeled by exploiting automatic inferences coming from advanced computer vision and deep learning algorithms applied to personal photos and associated metadata;

(ii) perform user-centric studies within a living-lab environment to assess how users' posting behaviours and monetary valuation of mobile personal data are influenced by user awareness about content-sharing risks.
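
As a loose illustration of point (i) above (the labels, weights, and interface are entirely hypothetical): whatever vision model is used, its outputs can be mapped to a coarse sensitivity score that the app shows the user before sharing.

```python
# Hypothetical sketch: map labels produced by an (unspecified) image classifier
# to a coarse sensitivity score shown to the user before a photo is shared.
SENSITIVITY_WEIGHTS = {
    "face": 0.6, "child": 0.9, "document": 0.8, "license_plate": 0.7,
    "home_interior": 0.4, "medical": 0.9, "landscape": 0.0,
}

def sensitivity_score(labels):
    """labels: dict label -> confidence from a vision model run on the photo.
    Returns a 0..1 score; the weights above are illustrative, not learned."""
    if not labels:
        return 0.0
    score = max(SENSITIVITY_WEIGHTS.get(lab, 0.0) * conf for lab, conf in labels.items())
    return min(1.0, score)
```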

Bringing Transparency to Targeted Advertising

Patrick Loiseau (EURECOM), Oana Goga (MPI-SWS)

Targeted advertising largely contributes to the support of free web services. However, it is also increasingly raising concerns from users, mainly due to its lack of transparency. The objective of this proposal is to increase the transparency of targeted advertising from the user’s point of view by providing users with a tool to understand why they are targeted with a particular ad and to infer what information the ad engines possibly have about them. Concretely, we propose to build a browser plugin that collects the ads shown to a user and provides her with analytics about these ads.
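
A minimal sketch of the analytics such a plugin could surface, under an assumed record format for collected ads (not the proposal's actual schema): aggregate the observed ads by category and advertiser to approximate what the ad engine may have inferred about the user.

```python
from collections import Counter

def ad_analytics(collected_ads):
    """collected_ads: list of dicts like
    {"advertiser": "shoes.example", "category": "Apparel", "page": "news.example"}.
    Returns simple analytics suggesting what the ad engine may have inferred."""
    by_category = Counter(ad["category"] for ad in collected_ads if ad.get("category"))
    by_advertiser = Counter(ad["advertiser"] for ad in collected_ads if ad.get("advertiser"))
    total = sum(by_category.values()) or 1
    inferred_interests = [(cat, n / total) for cat, n in by_category.most_common(5)]
    return {"top_interests": inferred_interests,
            "top_advertisers": by_advertiser.most_common(5)}
```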

Exploring Personal Data on the Databox

Hamed Haddadi (QMUL)

We are in a personal data gold rush driven by advertising being the primary revenue source for most online companies. These companies accumulate extensive personal data about individuals with minimal concern for us, the subjects of this process. This can cause many harms: privacy infringement, personal and professional embarrassment, restricted access to labour markets, restricted access to best-value pricing, and many others. There is a critical need to provide technologies that enable alternative practices, so that individuals can participate in the collection, management and consumption of their personal data. We are developing the Databox, a personal networked device (and associated services) that collates and mediates access to personal data, allowing us to recover control of our online lives. We hope the Databox is a first step towards re-balancing power between us, the data subjects, and the corporations that collect and use our data.
