# David Kempe - Publications

• Yu Cheng, Shaddin Dughmi, David Kempe. On the Distortion of Voting with Multiple Representative Candidates. In Proc. of AAAI 2018, New Orleans, LA. Full version on arXiv.

We study positional voting rules when candidates and voters are embedded in a common metric space, and cardinal preferences are naturally given by distances in the metric space. The cost of a candidate is his sum of distances to all voters, and the distortion of an election is the ratio between the cost of the elected candidate and the cost of the optimum candidate. We consider the case when candidates are drawn i.i.d. from the population of the voters, and analyze the expected distortion of positional voting rules.

Our main result is a clean and tight characterization of positional voting rules that have constant expected distortion (independent of the number of candidates and the metric space). Our characterization result immediately implies constant expected distortion for Borda Count and elections in which each voter approves a constant fraction of all candidates. On the other hand, we obtain super-constant expected distortion for Plurality, Veto, and approving a constant number of candidates.

• David Kempe, Leonard J. Schulman, Omer Tamuz. Quasi-regular sequences and optimal schedules for security games. In Proc. of SODA 2018, New Orleans, LA. Full version on arXiv.

We study the following security game: n targets have values a(i). If an attacker has uninterrupted access to target i for t units of time, he gets utility a(i)*t (the defender getting -a(i)*t). If the defender interrupts the attacker, both get utility 0. The defender determines a randomized visit schedule for the n targets; moving from one target to another takes unit time. The attacker then chooses the target to attack and the beginning and end of the attack. The defender's goal is to minimize the attacker's expected utility.

We show that the defender's problem gives rise to a natural type of sequence over n-element alphabets which we term K-quasi-regular. A sequence is K-quasi-regular if (1) the frequency of each symbol i matches a prescribed frequency p(i), and (2) for each symbol i, the ratio between the longest and shortest distance between consecutive occurrences of i is bounded by K. 1-quasi-regular sequences have each symbol i occurring exactly evenly spaced at distance 1/p(i).

We show that any random 2-quasi-regular sequence gives rise to an optimal schedule for the defender. We then show the existence of random 2-quasi-regular sequences via an efficient construction. Furthermore, we give a construction of 3-quasi-regular deterministic sequences; this construction leads to a 1.006-approximation for the defender. Finally, we show that if all p(i) are "small enough", then (1+ε)-quasi-regular deterministic sequences are guaranteed to exist.

• Ehsan Emamjomeh-Zadeh, David Kempe. Adaptive Hierarchical Clustering Using Ordinal Queries. In Proc. of SODA 2018, New Orleans, LA. Full version on arXiv.

We consider active learning algorithms for hierarchical clustering in an ordinal query model. In each iteration, the algorithm gets to present a triple of items to the user and ask which two are more similar than the third. The goal is to adaptively learn a ground truth hierarchy on all n elements with as few queries as possible. We consider a model in which the user's response is correct with probability p, and adversarially incorrect otherwise.

Our main result is a deterministic algorithm that learns the underlying hierarchical clustering using at most n log2(n) adaptive ordinal queries. In the model with incorrect answers, the algorithm outputs the correct hierarchical clustering with probability at least 1−δ, using O (n (logn+log(1/δ))) adaptive ordinal queries. For our results, adaptivity is crucial: we prove that even in the absence of noise, every non-adaptive algorithm requires Ω(n3) ordinal queries in the worst case.

• Ehsan Emamjomeh-Zadeh, David Kempe. A General Framework for Robust Interactive Learning. In Proc. of NIPS 2017, Long Beach, CA. Full version on arXiv.

We present a framework for learning models (such as classifiers, rankings, or clusterings) from equivalence queries. In a generalization of Angluin's equivalence query model, the algorithm repeatedly proposes such a model, and the user either confirms correctness or proposes a local improvement to the mode. The user's response is correct with probability p, and adversarially incorrect otherwise. The algorithm's goal is to learn the ground truth in few iterations, with high enough probability.

We show that it is sufficient that there be a (weighted, undirected) graph G whose nodes are the models, and whose edges correspond to the user feedback, with the following property: If s is the proposed model, and s* the ground truth, and the user's correct feedback corresponds to the edge (s,s'), then this edge is on a shortest path from s to s*. Under this assumption, we show that am algorithm reminiscent of the multiplicative weights update algorithm will identify the ground truth model in a number of queries that's logarithmic in the number of candidate models. From this general result, we obtain as easy corollary known and new algorithms for learning rankings, classifiers, and clusterings.

• Yu Cheng, Shaddin Dughmi, David Kempe. Of the People: Voting Is More Effective with Representative Candidates. In Proc. of EC 2017, Cambridge, MA. Full version on arXiv.

We consider a model of voting where candidates and voters are embedded in a joint metric space, and the cost of a candidate is the sum of distances to all voters. We are interested in the distortion of election rules, defined as the worst-case ratio between the cost of the chosen candidate and the cost of the optimal candidate. We specifically study this question when there are exactly two candidates, who are representative of the voters, in that they are chosen i.i.d. from the population of voters.

We show that when the metric space is a line, the worst-case expected distortion is at most 1.1716, and this bound is tight. For arbitrary metrics, we prove a lower bound of 1.5 and a constant upper bound slightly less than 2. In both cases, this improves on the (tight) bound of 3 when the candidates are arbitrary, and 2 when they are drawn i.i.d. from a distribution different than the voter distribution.

• Xinran He, Ke Xu, David Kempe, Yan Liu. Learning Influence Functions from Incomplete Observations. In Proc. of NIPS 2016, Barcelona, Spain. Full version on arXiv.

We study the problem of inferring influence functions from observed cascades when each activated node is missing independently at random. Specifically, we assume that several cascades are generated with different seed sets, and a randomly generated subset of the final set of activated nodes is observable. We give sample complexity bounds for proper and improper PAC learning under the Linear Threshold and Independent Cascade models. Experiments with the improper learning algorithm show that it can compensate fairly well for missing data, even when a large fraction of data are missing.

• Shaddin Dughmi, David Kempe, Ruixin Qiang. Persuasion with Limited Communication. In Proc. of EC 2016, Maastricht, Netherlands. Full version on arXiv.

We study a setting with a single buyer, a single item, and a single seller. The buyer's value for the item is drawn from a known distribution. A principal knows the buyer's value, and can send a signal to the seller, with the goal of maximizing social welfare of the buyer and seller combined. However, the principal is restricted in the number of distinct signals (or bits) he can send.

We show that even a single bit can lead to drastic welfare improvements: while the worst-case welfare loss without any signals can be Ω(n), it is O(log n) with just one bit (where n is the support size of the buyer's value distribution). This insight leads to a QPTAS for finding the welfare-maximizing signaling scheme with just M signals. By showing that a certain version of the objective function is submodular, we also obtain a genuine constant-factor approximation.

We also study the generalization to more general signaling problems. Here, we show that in general, no constant-factor approximation to the welfare-maximizing signaling scheme with M signals can be found in polynomial time unless P=NP. This is shown with a game in which the signal's receiver must guess a hyperedge in a graph (or an incident vertex).

• Xinran He, David Kempe. Robust Influence Maximization. In Proc. of KDD 2016, San Francisco, CA. Full version on arXiv.

We consider influence maximization in social networks in a model where the person intending to choose the set of seed nodes faces a lot of uncertainty about the network dynamics. This could be due to uncertainty about the governing model or its parameters, and will frequently result from noisy or incomplete data. We show that the objective function is a minimum of monotone submodular functions, and can thus be approximated in a bicriteria sense. We also show strong inapproximability results for both this general model and the special case in which for each edge, and interval is given, and an adversary chooses the edge's parameter from that interval.

• Ehsan Emamjomeh-Zadeh, David Kempe, Vikrant Singhal. Deterministic and Probabilistic Binary Search in Graphs. In Proc. of STOC 2016, Cambridge, MA. Full version on arXiv.

We generalize Binary Search from the line to arbitrary graphs in the following sense: one unknown node is the target, and the algorithm gets to make repeated queries to nodes to identify the target. In response to a query, unless the algorithm found the node, it receives an edge out of the queried node which lies on a shortest path from the queried node to the target. Our main result is that for arbitrary (weighted) undirected graphs, log_2(n) queries are always sufficient.

On the other hand, we show that for directed graphs (even without edge weights), identifying the minimum number of queries necessary to find a target in the worst case is PSPACE-complete. When each edge is part of a short cycle, though, the positive result from undirected graphs declines gracefully.

When queries are answered correctly only with probability p, and adversarially incorrectly otherwise, we show that a Multiplicative Weights algorithm inspired by finding maximum-likelihood nodes for the target finds succeeds with probability 1-δ in O(log n) rounds with an information-theoretically optimal constant depending on δ and p.

• Li Han, David Kempe, Ruixin Qiang: Incentivizing Exploration with Heterogeneous Value of Money. In Proc. of WINE 2015, Amsterdam, Netherlands, pages 370-383. Full version on arXiv

We study a multi-armed bandit model in which a principal can offer agents money in return for pulling specific arms; agents who are not offered (enough) money will pull the myopically optimal arm. We consider the generalization in which agents have different utility functions for money and quasi-linear utilities overall. The principal is partially informed about agents' tradeoffs via a signal that could be more or less accurate. The principal's goal is to maximize the infinite time-horizon time-discounted reward of all arm pulls, minus the time-discounted payments.

In this setting, we show that a convex program that depends only on the agents' distribution of tradeoffs and the signaling process (but not on the MAB instance) can be used to prescribe a payment strategy for the principal to follow. We show that this strategy is optimal (in a worst-case sense over MAB instances), by giving a separate convex program that characterizes the worst-case instance, and showing that both have the same objective value at optimum.

• Vasilis Syrgkanis, David Kempe, Eva Tardos: Information Asymmetries in Common-Value Auctions with Discrete Signals. Presented at EC 2015, Portland, OR [Full version available on SSRN].

We study first-price and hybrid common-value auctions between two bidders who receive asymmetric discrete signals about the item's value. Our main technical result is a complete characterization of the unique mixed Nash Equilibrium of the auction via a recurrence. We use this characterization to further explore the outcome of common-value auctions, observing the following: (1) We can explicitly characterize the limit equilibrium of the second-price auction. Revenue in the limit does not collapse. (2) The Linkage Principle fails to hold in asymmetric common-value auctions. (3) There is a complete revenue ranking of the equilibria of common-value hybrid auctions according to the probability of paying one's own bid. (4) Additional information that can be acquired in the form of another independent signal can exhibit surpising non-monotonicity properties.

• Peter Frazier, David Kempe, Jon Kleinberg, Robert Kleinberg. Incentivizing Exploration. In Proc. of EC 2014, Stanford, CA. Recipient of Best Paper Award. Conference version as PDF.

We consider a Bayesian Multi-Armed Bandit setting in which the principal cannot pull arms himself, but has to rely on selfish and myopic agents to do so. While the principal wants to maximize the time-discounted total reward of all agents, each agent only cares about his immediate reward. To incentivize exploration, the principal can offer payments to the agents for pulling particular arms. We show that when offering a b fraction of the optimum reward as total time-discounted payments, the principal can attain a 1-(c^1/2-b^1/2)^2 fraction of the optimum reward, where c is the time discount factor. This is tight.

• Xinran He, David Kempe. Stability of Influence Maximization. Unpublished Manuscript Erratum on arXiv.

In KDD 2014, we published a paper with the same name, whose main result was unfortunately incorrect. An erratum, including the problem setup, a hardness result replacing our algorithmic result, and the experiments, is posted on arXiv as given above. Please do not reference the KDD 2014 paper, and instead use the version on arXiv, which will remain an unpublished manuscript.

• David Kempe, Brendan Lucier: User Satisfaction in Competitive Sponsored Search. In Proc. of WWW 2014, Seoul, Korea. Extended version on arXiv.

We study a game between multiple search engines, each of which wants to attract (and possibly satisfy) searchers. Our goal is to investigate how the objective function of the search engine (display vs. clickthrough advertising) affects the probability that a searcher is satisfied. Our result is a dichotomy: if search engines are interested in satisfying users (e.g., due to clickthrough ads), or users have convex response functions to search result quality, then we expect to see specialization of results, and the price of anarchy is 2, meaning that users are satisfied almost as often as at the social optimum. On the other hand, when engines are interested mostly in attracting users (e.g., display ads), and users don't have convex response functions, the price of anarchy can be very high.

• Po-An Chen, Bart de Keijzer, David Kempe, Guido Schaefer: Altruism and Its Impact on the Price of Anarchy. In ACM Transactions on Economics and Computation 2/4, article 17, 2014.

This paper is an expanded and unified merge of the EC 2008 and WINE 2011 papers below. It presents a natural model of altruism via a linear term in each player's utility that depends on the overall welfare. Under this model, we analyze the price of anarchy and stability of several standard games; in particular, atomic and non-atomic congestion games, but also valid utility games and cost-sharing games. Much of the work is accomplished via a natural generalization of Roughgarden's smoothness framework to partially altruistic games.

• Xinran He, David Kempe: Price of Anarchy for the N-player Competitive Cascade Game with Submodular Activation Functions. In Proc. of WINE 2013, Cambridge, MA. Conference version as PDF.

We study competitive influence maximization in a social network in a model proposed by Goyal and Kearns. Individuals choose whether to adopt any of the competing products based on the total number of neighbors having adopted any of the products, then decide which one proportionally to the fraction of that product among their neighbors. Each individual company wants to maximize its own number of followers. We give a short and elegant proof that the coarse Price of Anarchy is 2 for an arbitrary number of companies. This improves on a PoA of 4 shown by Goyal and Kearns, and is also much simpler by drawing on several prior results in the area.

• Ken Wilbur, Linli Xu, David Kempe: Correcting Audience Externalities in Television Advertising. Marketing Science 32/6, pages 892-912, November/December 2013. Full version on SSRN (nearly final).

We study the problem of selecing, pricing, and ordering ads for a television commerical break. The main concerns are that (1) television ads differ in the amount of switching they cause, and (2) the audience is heterogeneous in their switching behavior and value to advertisers. As a result, different advertisers should face different prices and positions in the break. We present a heuristic based on Dynamic Programming for optimizing the selection and ordering; it is then evaluated on a large-scale real-world data set. The analysis of the data set itself provides a contribution, in that it adds significant evidence for audience heterogeneity.

• Michal Feldman, David Kempe, Brendan Lucier, Renato Paes Leme: Pricing Public Goods for Private Sale. In Proc. of EC 2013, Philadelphia, PA. Extended version on arXiv.

We study the following question: a seller wants to price a good for sale. The good has the property that once a person purchases it, all of the person's friends can enjoy it as much as if they themselves had purchased it. We study equilibria of this game (under a Bayesian model for the utilities of agents), and the corresponding optimizing prices for the seller. When the graph is complete, we show that there is a uniform price to set such that the revenue in the corresponding worst equilibrium is within a constant factor of the best possible revenue at the best equilibrium under non-uniform pricing. For regular graphs, we obtain a weaker version of this guarantee.

• David Kempe, Jon Kleinberg, Sigal Oren, Aleksandrs Slivkins: Selection and Influence in Cultural Dynamics. In Proc. of EC 2013, Philadelphia, PA. Extended version on arXiv.

We study a model combining selection and influence as factors in the adoption of opinions. A continuum of individuals occupy the nodes of a graph, representing the diversity of opinions (or languages, religions, etc.). Individuals interact randomly with others, and can be swayed by them if they are neighbors in the graph, in which case they move to the neighbor's node. Our goal is to analyze the convergence properties and equilibrium characterization. For the model in which interactions are equally likely between all pairs of individuals, we give an essentially complete characterization. When individuals only ever interact with neighbors, we make partial progress in proving convergence and the structure of stable equilibria.

• Ittai Abraham, Shiri Chechik, David Kempe, Aleksandrs Slivkins: Low-Distortion Inference of Latent Similarities from a Multiplex Social Network. In SIAM Journal on Computing 44/3, 07/2015, pp. 617-668. A preliminary version was presented at SODA 2013, New Orleans, LA. Extended version on arXiv.

We posit a model wherein a social network is generated randomly from multiple underlying similarity spaces. Each space gives rise to an edge set, with edges between close individuals more likely than between distant ones. An algorithm observes the unlabeled union of all edges, and tries to infer the underlying similarity spaces. We show that efficient (near-linear time) algorithms can infer the spaces with low distortion under some "independence" assumptions.

• Bo An, David Kempe, Christopher Kiekintveld, Eric Shieh, Satinder Singh, Milind Tambe, Yevgeniy Vorobeychik. Security games with limited surveillance. In Proc. of AAAI 2012. Conference version as PDF.

We consider Stackelberg games in which a defender commits to a mixed strategy, but this strategy is not fully observable to an attacker. Instead, the attacker draws a (known) number of samples and decides on an action based on the (Bayesian inferred) posterior distribution. We study strategies for choosing the right first-mover strategy, and evaluate the gain compared to pessimistic assumptions.

• Po-An Chen, Bart de Keijzer, David Kempe, Guido Schaefer: The Robust Price of Anarchy of Altruistic Games. In Proc. of WINE 2011, Singapore. Extended version on arXiv.

We extend Roughgarden's notion of smoothness of a game to games in which players are partially altruistic. Using this extended smoothness framework, we show (tight) bounds on the Robust Price of Anarchy of several atomic games. In particular, for valid utility games, regardless of the altruism level, the bound remains at 2, while the worst-case bound gets worse for cost-sharing games and atomic congestion games. On the other hand, for symmetric singleton congestion games, the bound improves. We also investigate the case of a mixture of completely altruistic and completely selfish players in symmetric singleton congestion games.

• Abhimanyu Das, David Kempe: Submodular meets Spectral: Greedy Algorithms for Subset Selection, Sparse Approximation and Dictionary Selection. In Proc. of ICML 2011, Seattle, WA. Distinguished Paper. Extended version on arXiv.

We define a natural notion quantifying how "close to submodular" a function is. We prove that the approximation guarantees of greedy algorithms degrade smoothly in this parameter. We then use it to analyze frequently used greedy algorithms for subset selection in regression, and extend the analysis to dictionary selection, obtaining the best known constant factors for both.

• Mahyar Salek, Shahin Shayandeh, David Kempe: You Share, I Share: Network Effects and Economic Incentives in P2P File-Sharing Systems. In Proc. of WINE 2010, Stanford, CA. Extended version on arXiv.

We study P2P systems from a game-theoretic standpoint. Individuals have a cost for sharing items, which increases as the demand for their items increases. A central authority can reward individuals for sharing, with money or recognition. The more others share an item, the less demand there is for one individual's copy, leading to positive network externalities on sharing. We show that the system has "diminishing returns" for payments, meaning that approximately optimal solutions can be found easily. As part of the analysis, we give a precise characterization for the class of random processes on graphs for which diminishing returns can be shown via equialence to reachability in random graph models.

• David Kempe, Mahyar Salek, Cristopher Moore: Frugal and Truthful Auctions for Vertex Covers, Flows, and Cuts. In Proc. of FOCS 2010, Las Vegas, NV. Extended version on arXiv.

We study auctions for purchasing a feasible solution of a coverage-type problem, while paying "as little as possible", compared to a natural benchmark. Specifically, we focus on Vertex Cover auctions (where we obtain auctions within a constant factor of best possible), and use these as a tool to design (optimal) auctions for flows and cuts in graphs.

• Abhimanyu Das, David Kempe: Estimating the Average of a Lipschitz-Continuous Function from One Sample. In Proc. of ESA 2010, Liverpool, UK. Extended version on arXiv.

We study the problem of estimating the average of a Lipschitz continuous function over a metric space, by sampling the function only at a single location. The function is assumed to be chosen adversarially, and the algorithm gets to randomize the location at which it samples the function. We give a PTAS for metric spaces of bounded doubling dimension, and precisely calculate the optimum distribution and error when the metric space is the interval [0,1].

• Jason Tsai, Zhengyu Yin, Jun-Young Kwak, David Kempe, Christopher Kiekintveld, Milind Tambe: Urban Security: Game-Theoretic Resource Allocation in Networked Physical Domains. In Proc. of AAAI 2010, Atlanta. [Formats: PDF]

We study a game wherein targets are embedded in a graph. An attacker enters from a source and would like to reach one of multiple targets of possibly different values. A defender must first commit to a (randomized) strategy of placing k checkpoints on edges. The defender's goal is to minimize the expected damage done by the attacker. We present efficient algorithms to generate distributions that are optimal under certain assumptions, and also test them on instances derived from maps of Mumbai, India.

• Po-An Chen, Mary David, David Kempe: Better Vaccination Strategies for Better People. In Proc. of EC 2010, Boston. [Formats: PDF]

We study vaccinations of individuals in a model wherein a disease will break out at a random node of a network, and spread through all non-vaccinated nodes. We first give an O(log n) approximation for the goal of minimizing the total societal cost (consisting of vaccinations and sickness). We then analyze the impact of partially selfish (but slightly altruistic) behavior on the part of the individuals, and show that if individuals are slightly altruistic and can only opt out of vaccinations, then decisions to opt out will only decrease social welfare by a constant factor depending on the altruism level.

• David Kempe, Ahuva Mu'Alem, Mahyar Salek: Envy Free Allocations for Budgeted Bidders. In Proc. of WINE 2009, Rome, Italy. [Formats: PDF]

We study the problem of allocating items to bidders who are not only constrained by their valuations for items, but also by their budgets, i.e., ability to pay. We present efficient algorithms for finding maximal and minimal price vectors supporting a given allocation, and also show a structural condition guaranteeing minimal price vectors for different allocations to be the same.

• Po-An Chen, David Kempe: Bayesian Auctions with Friends and Foes. In Proc. of SAGT 2009, Paphos, Cyprus.

Since the publication of this result, we realized that the main theorem about first-price auctions is unfortunately, and embarassingly, incorrect. The corollaries are therefore also incorrect. Please do not cite, use, or reference this paper.

• Gaurav Agarwal, David Kempe: Modularity-Maximizing Graph Communities via Mathematical Programming. In European Physics Journal B, Vol 66/3, 12/2008, [Formats: PDF]

All of the code for running the algorithms can be downloaded as a ZIP archive. A brief explanation on the sequence of steps and how to use the code is in our README file.

We present two new algorithms for finding a graph clustering approximately maximizing the modularity measure (introduced by Newman). Our algorithms are based on rounding fractional LP and VP solutions, and are thus the first algorithms for this problem which provide a posteriori error guarantees. We evaluate the algorithms on several real-world and synthetic networks.

• Mahyar Salek, David Kempe: Auctions for Share-Averse Bidders. In Proceedings of WINE 2008, Shanghai, China. [Formats: PDF]

We study auctions in which the item(s) being auctioned can be shared among multiple winners, but the valuation of winners decreases as more and more others share with them. We derive the optimal truthful auction for a single item in the sense of Meyerson, and observe several properties depending on the rate at which utility decreases. We also examine share-averse combinatorial auctions with single-minded bidders, prove sufficient conditions for incentive compatibility, and give an essentially tight approximation algorithm.

• David Kempe, Mohammad Mahdian: A Cascade Model for Externalities in Sponsored Search. In Proceedings of WINE 2008, Shanghai, China. [Formats: PDF]

We present a natural model of user behavior in scanning and clicking ads in sponsored search. Our model captures the externalities that previous ads impose on later ones by taking away users. It is based on independent random click and continuation choices in each slot. We give a polynomial-time algorithm for allocation and pricing in the most basic model, and PTAS resp. QPTAS for extensions involving multiple ad slates and slot specific click parameters.

• Moshe Babaioff, Nicole Immorlica, David Kempe, Robert Kleinberg: Online Auctions and Generalized Secretary Problems. In SIGecom Exchanges 7(2), June 2008. [Formats: PDF]

We review several recent results on generalized secretary problems, in which multiple elements need to be selected from a randomly ordered input sequence subject to combinatorial constraints. The goal is to maximize the sum of values of the selected elements.

• Po-An Chen, David Kempe: Altruism, Selfishness, and Spite in Traffic Routing. In Proceedings of EC 2008, Chicago, IL. [Formats: PDF]

We study traffic routing under the assumption that the users are "partially altruistic", meaning that while they prefer to choose faster routes for themselves, they trade off their own gain with the congestion they impose on others. We show that even a small amount of (uniform) altruism reduces the price of anarchy to a constant. In parallel link networks, even a small average level of altruism has the same effect.

• Abhimanyu Das, David Kempe: Algorithms for Subset Selection in Linear Regression. In Proceedings of STOC 2008, Victoria, BC. [Formats: PDF]

We study the problem of selecting a subset of random variables to sample to minimize the prediction error of the resulting linear regression, given all covariances of pairs of random variables. Using perturbation analysis, dynamic programming, and other optimization techniques, we provide exact and approximation algorithms under restrictions on the structure of the covariance matrix. We also give sufficient conditions for the frequently used Forward Regression to produce near-optimal results or a constant-factor approximation.

• Abhimanyu Das, David Kempe: Sensor Selection for Minimizing Worst-Case Prediction Error. In Proceedings of IPSN 2008, St. Louis, MO. [Formats: PDF]

We study the problem of selecting a set of sensors to sample so as to minimize the worst-case prediction error of an aggregation function, when the actual sensor readings are chosen by an adversary subject to a hard metric constraint. We prove that the selection problem for estimating the average is equivalent to k-median, and for the maximum/minimum to k-center. We generalize these results to other aggregation functions, and also validate our algorithms on real-world sensor measurements.

• Bruce Kapron, David Kempe, Valerie King, Jared Saia, Vishal Sanwalani: Fast Asynchronous Byzantine Agreement and Leader Election with Full Information. In Proceedings of SODA 2008, San Francisco, CA. [Formats: PDF]

We give the first subexponential time protocols for Byzantine Agreement and Leader Election in the asynchronous full-information model. This resolves a long-standing open problem. In fact, our protocols take poly-logarithmic time, succeeding with high (Agreement) or constant (Election) probability. The key technique is a novel protocol for asynchronous universe reduction, based on an extension of Feige's protocol.

• Atsushi Iwasaki, David Kempe, Yasumasa Saito, Mahyar Salek, Makoto Yokoo: False-name-proof mechanisms for hiring a team. In Proceedings of WINE 2007, San Diego, CA. [Formats: PDF]

We study false-name proof auctions in which an auctioneer is trying to hire a team of elements owned by individually paid agents. The goal is to design incentive-compatible mechanisms which will make agents reveal both ownership of elements and true costs, while minimizing total payments. We formally present two natural models. For one, we give an O(n2n) competitive mechanisms, and prove a lower bound of Ω(2n). For the second, we present a mechanism with a reserve price, which does not always purchase a solution.

• Shishir Bharathi, David Kempe, Mahyar Salek: Competitive Influence Maximization in Social Networks. In Proceedings of WINE 2007, San Diego, CA. [Formats: PDF]

We study viral marketing in a scenario of multiple competing companies. We give a 1-1/e approximation for the last mover, prove a price of anarchy of 2, give first-mover algorithms for some very restricted graph structure, and present an FPTAS for the one-company version on bidirected trees.

• Moshe Babaioff, Nicole Immorlica, David Kempe, Robert Kleinberg: A Knapsack Secretary Problem with Applications. In Proceedings of APPROX 2007, Princeton, NJ. [Formats: PDF]

We study a Knapsack Secretary problem, where items with weights and values (chosen adversarially) arrive in a random order, and an algorithm should select a set of maximum value subject to a hard weight constraint, in an online fashion. We give a randomized constant-factor online algorithm for this problem, and improve the constant to 1/e if the items have uniform weights.

• Chayant Tantipathananandh, Tanya Berger-Wolf, David Kempe: A Framework for Community Identification in Dynamic Social Networks. In Proceedings of KDD 2007, San Jose, CA. [Formats: PDF]

We propose an optimization-based framework for identifying communities in a social network whose membership and connections change over time. Our framework is based on natural observations that individuals do not change group memberships frequently, and do not alternate between many groups. We propose optimal algorithms and heuristics, and evaluate them on several real-world data sets.

• David Kempe, Adam Meyerson, Nainesh Solanki, Ramnath Chellappa: Pricing Partially Compatible Products. In Proceedings of ECOM 2007, San Diego, CA. [Formats: PDF]

We study the problem of pricing individual components of a system when components from rival companies may vary in their compatibility. Such incompatibilities can be exploited by charging higher prices for inferior products, due to higher compatibility. We give a polynomial-time algorithm for the best-response pricing problem, and hardness results and approximation algorithms for the problems of optimal product improvement and compatibility alterations.

• Omid Madani, Wiley Greiner, David Kempe, Mohammad Salavatipour: Recall Systems: Efficient Learning and Use of Category Indices. In Proceedings of AISTATS 2007, San Juan, Puerto Rico. [Formats: PDF]

We propose "Recall Systems" as an efficient preprocessing step for learning category indices. The idea is to use simple feature-based classifiers to reduce the number of remaining candidates to a managable set before applying traditional classifiers. We prove hardness results for finding optimal categories, give heuristics, and evaluate then on large data sets in practice.

• Michael Collins, David Kempe, Jared Saia, Maxwell Young: Nonnegative Integral Subset Representation of Integer Sets. In Information Processing Letters, Vol 101/3, 02/2007, pp. 129-133. [Formats: PDF]

Motivated by applications in radiation therapy, we study the problem of representing a set of integers by a "smaller" set. By this, we mean that each element of the given set is the sum of a subset of elements from the representing set. We prove that finding the best representing set is NP-complete, investigate lower bounds, and evaluate heuristics experimentally.

• Fang Bian, David Kempe, Ramesh Govindan: Utility-based Sensor Selection. In Proceedings of IPSN 2006, San Antonio, TX. [Formats: PDF]

Sensor network applications face a tradeoff between the amount of data that can be retrieved and the lifetime of the network. Many recent papers have addressed this tradeoff with ad hoc approaches for maximizing lifetime. Here, we argue that a more formal approach is useful: to maximize the total utility that the network can generate with its measurements. The utility function is user-specified and application-sepecific. For several classes of natural utility functions, we give hardness results, polynomial-time algorithms, or approximation algorithms.

• Anna Karlin, David Kempe, Tami Tamir: Beyond VCG: Frugality of Truthful Mechanisms. In Proceedings of FOCS 2005, Pittsburgh, PA. [Formats: Postscript or PDF]

We study the problem of hiring a team of agents. Each agent submits a bid, and a feasible set of agents is to be hired and paid. We propose a new and intuitive measure of the frugality of a mechanism in this setting, and give competitive mechanisms for path auctions and a class of set systems termed k-out-of-r systems.

• Ara Hayrapetyan, David Kempe, Martin Pál, Zoya Svitkina: Unbalanced Graph Cuts. In Proceedings of ESA 2005, Mallorca, Spain. [Formats: Postscript or PDF]

We propose the study of unbalanced graph cuts: cuts with a given capacity bound on edges, leaving as few nodes on the s-side of the cut as possible. The problem has applications in the containment of disasters and epidemics, but also in finding dense communities in social networks or the web graph. We present a 2/2 bicriteria approximation, and a polynomial time algorithm for graphs of bounded treewidth.

• Xiaoming Zheng, Sonal Jain, Sven Koenig, David Kempe: Multi-Robot Forest Coverage. In Proceedings of IROS 2005, Edmonton, Canada. [Formats: Postscript or PDF]

We study the problem of visiting all nodes of a graph with multiple robots. After proving several versions of the problem NP-hard, we experimentally evaluate an approximation algorithm of Even et al., and show that it significantly outperforms techniques currently used in robotics.

• David Kempe, Jon Kleinberg, Éva Tardos: Influential Nodes in a Diffusion Model for Social Networks. Merged into the journal version of "Maximizing the Spread of Influence through a Social Networks" (below). A preliminary version appeared in Proceedings of ICALP 2005. [ Full version at ToC homepage.]

An extension of the paper "Maximizing the Spread of Influence in a Social Network", this paper studies a more general diffusion model termed "Decreasing Cascade Model." We prove that the greedy algorithm is a (1-1/e) approximation in that model, using more involved techniques than in the previous paper.

• Michail Lagoudakis, Vangelis Markakis, David Kempe, Pinar Keskinocak, Sven Koenig, Anton Kleywegt, Craig Tovey, Adam Meyerson, Sonal Jain: Auction-Based Multi-Robot Routing. In Proceedings of Robotics 2005, Boston, MA. [Formats: Postscript or PDF]

We study simple bidding-based heuristics for assigning targets to multiple robots, leading to greedy algorithms that can be easily implemented in a decentralized way. We prove upper and lower bounds on the performance of these heuristics for three objectives: total energy, completion time, and average latency.

• Dimitris Achlioptas, Aaron Clauset, David Kempe, Cris Moore.
On the Bias of Traceroute Sampling, or: Power-law Degree Distributions in Regular Graphs.
In Proceedings of STOC 2005, Baltimore, MD. [Formats: Postscript or PDF]

We study the degree distribution of BFS trees in random graphs, and give a characterization of the observed degrees as a function of the actual ones. In particular, we observe a power law with exponent -1 in regular random graphs. As BFS trees are a good model for single-source all-destinations traceroutes, this analytically characterizes the bias of traceroute sampling, observed empirically by Lakhina et al.

• Venkat Guruswami, Jason Hartline, Anna Karlin, David Kempe, Claire Kenyon, Frank McSherry:
On Profit-Maximizing Envy-free Pricing.
In Proceedings of SODA 2005, Vancouver, BC. [Formats: Postscript or PDF]

We study the problem of pricing items for profit maximization in a combinatorial setting when the desired bundles and price caps of all customers are known. In particular, we study the special case of single-minded bidders, who are only interested in one bundle if they can afford it. Even this case is APX-hard, so we study approximation algorithms as well as further special cases, including customers interested in paths to a tree's root.

• David Kempe, Frank McSherry: A Decentralized Algorithm for Spectral Analaysis.
In Journal of Computer and System Sciences, Vol. 74/1, 02/2008, pp.70-83. A preliminary version appeard in Proceedings of STOC 2004, Chicago, IL. [Full version in Postscript or PDF]

We present and analyze a simple decentralized algorithm for computing the top k eigenvectors of a symmetric matrix, when the nodes of a network know only the matrix entries corresponding to their rows. By utilizing the nodes in the network for the computation, the algorithm manages to be exponentially faster than centralized algorithms, and at the same time avoids the problem of having to collect the data.

• Leonid Meyerguz, David Kempe, Jon Kleinberg, Ron Elber: The Evolutionary Capacity of Protein Structures.
In Proceedings of RECOMB 2004, San Diego, CA. [Formats: Postscript or PDF]

We define the evolutionary capacity of a protein sequence as the number of sequences that have lower energy in the shape of the given protein sequence. Thus, the evolutionary capacity captures how much room for improvement exists over the native sequences. We present and analyze an approximation algorithm based on random sampling.

• David Kempe, Alin Dobra, Johannes Gehrke: Gossip-Based Computation of Aggregate Information.
In Proceedings of FOCS 2003, Boston. [Formats: Postscript or PDF]

We present and analyze simple algorithms for decentralized protocols to compute aggregate functions, such as sums, averages, random samples, quantiles, etc. We show that when using Uniform Gossip for communication, the protocols converge to the true answer exponentially fast. We give a general framework that separates the details of the protocol from the communication mechanism, and analyze their mutual influence. In particular, we also analyze the convergence speed for flooding.

• David Kempe, Jon Kleinberg, Éva Tardos: Maximizing the Spread of Influence in a Social Network.
In Theory of Computing 11(4), 2015. A preliminary version appeared in KDD 2003, and received the Best Paper Award there. [ Full version at ToC homepage.]

We study the problem of how to spread an innovation in a network, such as a social network. We consider several standard models from the sociology literature, and provide a constant-factor approximation algorithm for the problem of how to spend a given marketing budget to reach as large a fraction of the network as possible in expectation.

• David Kempe, Jon Kleinberg: Protocols and Impossibility Results for Gossip-Based Communication Mechanisms.
In Proceedings of FOCS 2002, Vancouver. [Formats: Postscript, PDF]

This paper studies how the choice of gossip distributions affects which problems can be solved efficiently with small messages in a distributed setting. In particular, it presents a simple approximation algorithm for finding a minimum spanning tree using spatial gossip (see below), and shows that using uniform gossip, neither the MST problem nor the resource location problem (see below) can be approximated fast and with small messages.

• Elliot Anshelevich, David Kempe, Jon Kleinberg: Stability of Load Balancing Algorithms in Dynamic Adversarial Systems.
Full version to appear in SIAM Journal of Computing. A preliminary version appeared in Proceedings of STOC 2002, Montréal.
[Full version in Postscript or PDF.]

We study the load balancing and packet routing problems in a dynamic online setting, where an adversary can insert and remove jobs or packets. We give simple proofs of stability for a natural distributed algorithm against rate-1 adversaries, for load balancing with up to 2 commodities, and packet routing with a single sink. A main contribution of the paper is a new proof technique for stability proofs.

• Leonard Adleman, Qi Cheng, Ashish Goel, Ming-Deh Huang, David Kempe, Pablo Moisset de Espanés, Paul Rothemund: Combinatorial Optimization Problems in Self-Assembly.
In Proceedings of STOC 2002, Montréal. Invited to a special issue of Journal of Computer and System Sciences devoted to selected papers from the conference. [Formats: Postscript, PDF].

Self-assembly is the process by which small simple objects (tiles) autonomously aggregate into larger and more difficult complexes. We study the question of designing systems of a small number of distinct tiles that assemble into a desired structure, and of choosing tile concentrations to achieve fast assembly. Among others, we show that it is NP-complete to find the smallest tile system producing uniquely a desired shape. The algorithm that proves membership in NP is of interest in its own right as a verification procedure.

• David Kempe, Jon Kleinberg, Alan Demers: Spatial Gossip and Resource Location Protocols.
Journal of the ACM, Vol. 51/6, 11/2004, pp.943-967. A preliminary version appeared in the Proceedings of STOC 2001, Crete, Greece.
Full version in Postscript or PDF.

This paper analyzes a class of gossip distributions termed "spatial gossip", in which the probability of communication between two nodes depends inversely polynomially on their distance. It proves that information travels distance d in time poly-logarithmic in d, with high probability. Spatial Gossip is then used to design very simple protocols for the "Resource Location Problem", in which each node is supposed to be informed the closest copy of some desirable resource.

• David Kempe, Jon Kleinberg, Amit Kumar: Connectivity and Inference Problems for Temporal Networks.
Journal of Computer and System Sciences 64 (2002). A preliminary version appeared in the Proceedings of STOC 2000, Portland.
Full version in Postscript or PDF.

We propose the study of "temporal networks", graphs with time labels on the edges, as a combinatorial structure to capture the dynamics of information dissemination. Time-respecting paths, those with increasing time labels, correspond naturally to the flow of information. Our main result is an excluded minor theorem, showing that the number of node-disjoint time respecting s-t paths is equal to the size of a minimum s-t cutset for all time labelings, if and only if the graph does not contain a certain 5-node graph as a topological minor.

• David Kempe: On the Complexity of the Reflections game.
Unpublished Manuscript. [Formats: Postscript, PDF]

The result of a few afternoons spent playing a puzzle game on the computer. In "Reflections", a player places optical objects (such as mirrors and items to split light beams) on a grid, trying to hit a given set of light bulbs. This paper proves that solving levels of the game is an NP-complete problem.

The publications listed below stem from my undergraduate work in theorem proving and algebraic specification, which I did at the Universität Karlsruhe. They include my "Diplomarbeit" (roughly corresponding to a master's thesis).

• Arno Schönegge, David Kempe: On the weakness of conditional equations in algebraic specification (Postscript).
In Proceedings of DMTCS 1999, Auckland, New Zealand.

We prove that there are recursive, monomorphic abstract data types which can be specified with quantifier-free first-order formulas under loose semantics but not with conditional equations under initial semantics (both not allowing auxiliary functions). See below for some context.

• David Kempe, Arno Schönegge: On the power of quantifiers in first-order algebraic specification (Postscript).
In Proceedings of CSL 1998, Brno, Czech republic.

We prove that there are recursive, monomorphic abstract data types specifiable with full first-order logic, but not without quantifiers (both under loose semantics, and without allowing auxiliary functions). While the result seems intuitively obvious, it is surprisingly non-trivial to prove. As with the paper above, this is part of a larger effort (Arno Schönegge's PhD thesis) to compare expressive power of different commonly used methods of specifying abstract data types.

• Diplomarbeit: Ausdrucksmächtigkeit von Quantoren in algebraischen Spezifikationen (Postscript).
(Expressive power of quantifiers in algebraic specification).
Thesis advisor: Arno Schönegge.
In German, 08/1998.

An extended version of the above paper, in German. The core idea is the same, but it is complemented by several related results, and gives more detail and definitions.

• Martin Giese, David Kempe, Arno Schönegge: KIV zur Verifikation von ASM-Spezifikationen am Beispiel der DLX-Pipelining Architektur (Postscript).
(Using KIV for verifying ASM specifications: Verification of the DLX pipelining architecture).
Technical Report 16/97, Fakultät für Informatik, Universität Karlsruhe, Germany.
In German.

A report (in German) on a case study which investigated how the KIV system could be used to verify the correctness of hardware architectures. It describes the formalization of correctness criteria (showing that the most basic version of the processor produces the same results as a pipelined one), and the techniques used in interactively proving the results.

Back to my home page