Defogger: A Visual Analysis Approach for Data Exploration
of Sensitive Data Protected by Differential Privacy
Abstract
Differential privacy protects individual privacy but complicates data exploration: the limited privacy budget restricts the flexibility of exploration, and the noisy responses to data requests introduce confusing uncertainty. In this study, we are the first to characterize the corresponding exploration scenarios, including the underlying requirements and the available exploration strategies. To facilitate practical applications, we propose a visual analysis approach for formulating exploration strategies. Our approach applies a reinforcement learning model to suggest diverse exploration strategies according to users' exploration intent. A novel visual design for representing uncertainty in correlation patterns is integrated into our prototype system to support the proposed approach. Finally, we conducted a user study and two case studies, whose results verify that our approach helps users develop strategies that satisfy their exploration intent.
keywords:
Differential privacy, Visual data analysis, Data exploration, Visualization for uncertainty illustration
\vgtccategoryResearch
\vgtcpapertypeAnalytics & Decisions
\authorfooter
Xumeng Wang and Shuangcheng Jiao are with DISSec, Nankai University.
E-mail: wangxumeng@nankai.edu.cn, jiaoshuangcheng@mail.nankai.edu.cn.
Chris Bryan is with SCAI, Arizona State University.
E-mail: cbryan16@asu.edu.
Xumeng Wang is the corresponding author.
\teaser
The interface of Defogger for exploring data guarded by differential privacy. (a) The information reservation view invites users to specify data information (e.g., distributions and correlations) of interest as their exploration intent (a1) and to describe available data facts for generating simulated data (a2) used in exploration strategy recommendation. (b) The data request declaration view allows users to declare a data request (b1) based on recommendations from a reinforcement learning model (b2). (c) The uncertainty illustration view explains the uncertainty caused by differential privacy in the simulated or actual response to the declared data request.