Or Sheffet

Beyond Worst-Case Analysis in Privacy and Clustering: Exploiting Explicit and Implicit Assumptions

Degree Type: Ph.D. in Computer Science
Advisor(s): Avrim Blum
Graduated: August 2013

Keywords: Non Worst-Case Analysis, Differential Privacy, Clustering, Algorithms

Abstract

This thesis can be viewed as a collection of work in differential privacy and in clustering. In its first part we discuss work aimed at preserving differential privacy in a social network, with respect to either the presence/absence of a single edge [41], or with respect to changing all edges adjacent to one node [42]. In its second part we discuss multiple clustering problems, focusing on the k-means and k-median problems. We show how to correctly cluster an instance whose optimal k-means solution either satisfies a certain stability condition [20], or is resilient to small constant metric perturbations [21], or with cluster centers that satisfy a particular separation condition [22].

Alternatively, this thesis can be viewed as an investigation of specific nonworst case analysis paradigms. The common theme among of all results composing this thesis is that they all introduce algorithms whose guarantees are meaningful only for a subset of inputs – for inputs that satisfy certain nice properties, or assumptions. These assumptions can be roughly divided into two types, which we refer to as explicit or implicit. Explicit assumptions give a very specific and quantifiable characterization of the input (e.g. clustering instances with the distance between any pair of cluster centers larger than a specific bound). On the other hand, implicit assumptions are harder to characterize. Implicit assumptions pose a certain property that the input should satisfy, due to some compelling "real-life" reasoning (e.g. justifying a particular value of k for a k-means clustering instance), and often give much leeway as to the particular structure of the input. In this thesis, we exhibit multiple examples of assumptions of both kinds, in differential privacy and clustering, and give algorithms that take advantage of these assumptions. In particular, we show how tasks that are provably hard become feasible under suitable assumptions; tasks like providing accurate answers for queries over graph while preserving privacy on the node level, or giving a c-approximation for the k-median objective for c < 1 + e^-1.

Thesis Committee

Avrim Blum (Chair)
Anupam Gupta
Venkatesan Guruswami
Kunal Talwar (Microsoft Research)

Frank Pfenning, Head, Computer Science Department
Randy Bryant, Dean, School of Computer Science

Thesis Document

CMU-CS-13-120.pdf (1.56 MB) (178 pages)

At a Glance

Academic Offerings

Admissions

Directory Submenu

People

Explore the Field

Or Sheffet

Beyond Worst-Case Analysis in Privacy and Clustering: Exploiting Explicit and Implicit Assumptions

Abstract

Thesis Committee

Thesis Document

At a Glance

Academic Offerings

Admissions

Directory Submenu

People

Explore the Field

What can we help you find?

Or Sheffet

Beyond Worst-Case Analysis in Privacy and Clustering: Exploiting Explicit and Implicit Assumptions

Abstract

Thesis Committee

Thesis Document