# Soheil Behnezhad

Hi, I am a Motwani Postdoctoral Fellow at the Computer Science Department of Stanford University.

I am broadly interested in theoretical computer science. Much of my work (see my thesis) focuses on large-scale algorithms and revolves around massively parallel computation (à la MapReduce), graph sparsification, streaming algorithms, sublinear algorithms, and dynamic algorithms.

At Stanford, I am hosted by the amazing Moses Charikar, Aviad Rubinstein, Amin Saberi, and Li-Yang Tan. I got my Ph.D. from UMD where I was fortunate to be advised by MohammadTaghi Hajiaghayi. I have also spent two semesters at the Simons Institute, a summer at TTIC where I had the pleasure of working with Avrim Blum, and a summer at Google Research. I got my B.Sc. from Sharif University.

I am broadly interested in theoretical computer science. Much of my work (see my thesis) focuses on large-scale algorithms and revolves around massively parallel computation (à la MapReduce), graph sparsification, streaming algorithms, sublinear algorithms, and dynamic algorithms.

At Stanford, I am hosted by the amazing Moses Charikar, Aviad Rubinstein, Amin Saberi, and Li-Yang Tan. I got my Ph.D. from UMD where I was fortunate to be advised by MohammadTaghi Hajiaghayi. I have also spent two semesters at the Simons Institute, a summer at TTIC where I had the pleasure of working with Avrim Blum, and a summer at Google Research. I got my B.Sc. from Sharif University.

Recent Manuscripts

Almost 3-Approximate Correlation Clustering in Constant Rounds

We show that a (3+ε)-approximation of correlation clustering can be found in O(1/ε) rounds. This is a culminating point for the rich literature on correlation clustering as the approximation matches a natural bound for combinatorial algorithms and the round complexity is essentially constant.

Publications

New Trade-Offs for Fully Dynamic Matching via Hierarchical
EDCS
[
SODA 2022
]

We study the maximum matching problem in fully dynamic graphs where a graph is undergoing both edge insertions and deletions, and the goal is to efficiently maintain a large matching after each edge update. This problem has received considerable attention in recent years. The known algorithms naturally exhibit a trade-off between the quality of the matching maintained (i.e., the approximation ratio) and the time needed per update. While several interesting results have been obtained, the optimal behavior of this trade-off remains largely unclear. Our main contribution is a new approach to designing fully dynamic approximate matching algorithms that in a unified manner not only (essentially) recovers all previously known tradeoffs that were achieved via very different techniques, but reveals some new ones as well.

Stochastic Vertex Cover with Few Queries
[
SODA 2022
]

We study the problem of finding a minimum vertex cover (MVC) of a random subgraph of a given graph. The algorithm is unaware of this random subgraph
but can learn if an edge of the base graph exists in it by querying it. The goal is to find an approximate MVC by querying few edges. This stochastic setting has been studied extensively for various problems such as minimum spanning trees, matroids, shortest paths, and matchings. However, no non-trivial bound was known for MVC prior to our work. We show in this work that a constant number of queries per vertex suffice to obtain good approximations for the stochastic MVC problem.

Modern Large-Scale Algorithms for Classical Graph Problems
(PhD Thesis)

Charles A. Caramello Award for Best Thesis in Math, Physical Sciences, and Engineering at UMD, 2021.

Although computing power has advanced at an astonishing rate, it has been far outpaced by the growing scale of data. This has led to an abundance of algorithmic problems where the input tends to be, by orders of magnitude, larger than the memory available on a single machine. The challenges of data processing at this scale are inherently different from those of traditional algorithms. For instance, without having the whole input properly stored in the memory of a single machine, it is unrealistic to assume that any arbitrary location of the input can be accessed at the same cost; an assumption that is essential for traditional algorithms. In this thesis, we focus on modern computational models that capture these challenges more accurately, and devise new algorithms for several classical graph problems.

Specifically, we study models of computation that only allow the algorithm to use sublinear resources (such as time, space, or communication). Examples include (i) massively parallel computation algorithms where the workload is distributed among several machines each with sublinear space/communication, (ii) sublinear-time algorithms that take time sublinear in the input size, (iii) streaming algorithms that take only few passes over the input having access to a sublinear space, and (iv) dynamic algorithms that maintain a property of a dynamically changing input using a sublinear time per update.

We propose new algorithms for classical graph problems such as maximum/maximal matching, maximal independent set, minimum vertex cover, and graph connectivity in these models that substantially improve upon the state-of-the-art and are in many cases optimal. Many of our algorithms build on model-independent tools and ideas that are of independent interest and lead to improved bounds in more than one of the aforementioned settings.

Specifically, we study models of computation that only allow the algorithm to use sublinear resources (such as time, space, or communication). Examples include (i) massively parallel computation algorithms where the workload is distributed among several machines each with sublinear space/communication, (ii) sublinear-time algorithms that take time sublinear in the input size, (iii) streaming algorithms that take only few passes over the input having access to a sublinear space, and (iv) dynamic algorithms that maintain a property of a dynamically changing input using a sublinear time per update.

We propose new algorithms for classical graph problems such as maximum/maximal matching, maximal independent set, minimum vertex cover, and graph connectivity in these models that substantially improve upon the state-of-the-art and are in many cases optimal. Many of our algorithms build on model-independent tools and ideas that are of independent interest and lead to improved bounds in more than one of the aforementioned settings.

Time-Optimal Sublinear Algorithms for Matching and Vertex Cover
[
FOCS 2021
]

Selected for an invited talk at Highlights of Algorithms (HALG) 2022.

This paper gives a near-tight analysis of the average "query complexity" of the randomized greedy maximal matching algorithm.
This leads to a number of time-optimal algorithms for approximating the size of maximum matching and minimum vertex cover in sublinear time.

On the Robust Communication Complexity of Bipartite Matching
[
RANDOM 2021
]

We study the "robust communication complexity" of the maximum bipartite matching problem. The edges of an adversarially chosen n-vertex bipartite graph G are partitioned randomly between Alice and Bob. Alice has to send a single message to Bob, using which Bob has to output an approximate maximum matching of G. We present a new protocol obtaining a 0.716 approximation with O(n) communication, improving over the previous close-to-2/3 approximations.

Beating Two-Thirds For Random-Order Streaming Matching
[
ICALP 2021
]

We study matchings in the random-order streaming setting. We prove that for an absolute constant ε>0, one can find a (2/3+ε)-approximate maximum matching of any n-vertex graph using O(n log n) space with high probability. This breaks the natural boundary of 2/3 for this problem prevalent in the prior work and resolves an open problem of Bernstein [ICALP'20].

Stochastic Weighted Matching: (1 - ε) Approximation
[
FOCS 2020
]

In this paper, we settle the approximation factor of the stochastic weighted matching problem, showing a (1-ε) approximation is achievable for any constant ε > 0. Previously, only close to 0.5-approximations were known.

Stochastic Matching with Few Queries: (1 - ε) Approximation
[
STOC 2020
]

52

^{nd}Annual ACM Symposium on Theory of Computing
In this paper, we settle the approximation factor of the unweighted stochastic matching problem. We provide an analysis of a non-adaptive algorithm showing that it obtains a (1-ε) approximation for any constant ε > 0. The best known approximation prior to our work was 2/3.

Parallel Graph Algorithms in Constant Adaptive Rounds: Theory meets Practice
[
VLDB 2020
]

In this paper, we propose theoretically improved algorithms for several important graph problems in the
Adaptive Massively Parallel Computations (AMPC) setting, which we introduced
in an earlier paper and also evaluate them empirically.

Fully Dynamic Matching: Beating 2-Approximation in ∆

^{ε}Update Time [ SODA 2020 ]
The 31

^{st}Annual ACM-SIAM Symposium on Discrete Algorithms
For fully dynamic graphs, we have algorithms that maintain a 2-approximation of maximum matching extremely fast. But all known algorithms for general graphs maintaining a better-than-2 approximate matching require a large polynomial update time. In this paper, we show that in this regime, the update time can be improved to any arbitrarily small polynomial. Namely, we give an algorithm that for any desirably small constant ε>0, takes only O(∆

^{ε}) update-time.
Exponentially Faster Massively Parallel Maximal Matching
[ FOCS 2019 ]

60th Annual IEEE Symposium on Foundations of Computer Science

This paper presents exponentially faster Massively Parallel Computation (MPC) algorithms for maximal matching. This is achieved by providing an analysis of a natural algorithm which was conjectured to work by Czumaj et al. [STOC 2018]. The analysis reveals surprising connections between MPC and the notion of

*query-complexity*developed for sublinear algorithms.
Fully Dynamic Maximal Independent Set with Polylogarithmic Update Time
[ FOCS 2019 ]

60th Annual IEEE Symposium on Foundations of Computer Science

This paper presents the first algorithm for maintaining a maximal independent set (and related problems) in fully dynamic graphs that takes polylogarithmic time per update. Previous algorithms required polynomial update time.

Near-Optimal Massively Parallel Graph Connectivity
[ FOCS 2019 ]

60th Annual IEEE Symposium on Foundations of Computer Science

This paper presents a strongly sublinear space Massively Parallel Computation algorithm for
graph connectivity which for graphs with diameter D > log

^{ε}n, takes O(log D) rounds and takes O(log log n) rounds on other graphs. This improves upon an algorithm of Andoni et al. [FOCS 2019] and almost settles the problem due to an Ω(log D) conditional lower bound.
Streaming and Massively Parallel Algorithms for Edge Coloring
[ ESA 2019 ]

27th Annual European Symposium on Algorithms

This paper initiates the study of edge coloring in the streaming model. In addition, an improved Massively Parallel Computation algorithm for edge coloring is presented.

Stochastic Matching on Uniformly Sparse Graphs
[ SAGT 2019 ]

12th International Symposium on Algorithmic Game Theory

Massively Parallel Computation of Matching and MIS in Sparse Graphs
[ PODC 2019 ]

ACM Symposium on Principles of Distributed Computing

This paper presents strongly sublinear space Massively Parallel Computation algorithms for
maximal matching and maximal independent set that for graphs with polylogarithmic arboricity (which includes most families of sparse graphs such as minor-free graphs and bounded-genus graphs) take O(log

^{2}log n) rounds. Previous algorithms required polylogarithmic rounds.
Optimal Strategies of Blotto Games: Beyond Convexity
[ EC 2019 ]

The 20

^{th}ACM Conference on Economics and Computation
This paper presents algorithms providing near-optimal strategies for the Colonel Blotto game that in addition have small support size. Previous algorithms produced strategies with large support. While bounding the support size makes the solution space non-convex (and thus prevents the use of convex programming), we show through a set of structural results that the space can be decomposed into polynomially many disjoint convex polytopes that can be solved independently.

The 31

^{st}ACM Symposium on Parallelism in Algorithms and Architectures
Invited to TOPC, special issue for SPAA'19.

This paper introduces the

*Adaptive Massively Parallel Computations (AMPC)*model which is an extension of the Massively Parallel Computations (MPC) model. Compared to MPC, AMPC allows the machines to also adaptively query a read-only data store within a round, subject to the same communication bounds as in MPC. This feature is inspired by RDMA systems that are widely available. We give AMPC algorithms that are significantly faster than the state-of-the-art for MPC.
Stochastic Matching with Few Queries: New Algorithms and Tools
[ SODA 2019 ]

The 30

^{th}Annual ACM-SIAM Symposium on Discrete Algorithms
This paper presents improved non-adaptive algorithms for the stochastic matching problem. This is achieved by analysing a natural randomized algorithm that differs from those in the literature in a fundamental way. The improvement for unweighted graphs is significant: from the previous close-to-half approximations to 0.6568 approximation. For weighted graphs, the approximation factor is at least 0.501, which is the first to break half.

Almost Optimal Stochastic Weighted Matching With Few Queries
[ EC 2018 ]

The 19

^{th}ACM Conference on Economics and Computation
This paper presents improved stochastic matching algorithms for weighted graphs. The main result is an adaptive algorithm that obtains a (1-ε) approximation using a constant number of queries per vertex. Previous algorithms for weighted graphs required super-constant queries.

Spatio-Temporal Games Beyond One Dimension
[ EC 2018 ]

The 19

^{th}ACM Conference on Economics and Computation
This paper considers a generalization of spatio-temporal games from a one-dimensional space to graphs. An approximation algorithm (that relaxes the number of resources) is given for general graphs and a dependent-rounding technique is also shown to provide an algorithm for the one-dimensional case that simplifies the prior work.

Brief Announcement: MapReduce Algorithms on Massive Trees
[ ICALP 2018 ]

The 45

^{th}International Colloquium on Automata, Languages, and Programming
Winning Strategies of Blotto and Auditing Games
[ SODA 2018 ]

The 29

^{th}Annual ACM-SIAM Symposium on Discrete Algorithms
This paper introduces a notion of (u, p)-maxmin strategies which guarantee receiving a minimum utility of u with probability at least p. It is shown that (u, p)-maximin strategies of the Colonel Blotto and auditing games can be well-approximated in polynomial time.

Affinity Clustering: Hierarchical Clustering at Scale
[ NIPS 2017 ]

The 30

^{th}Annual Conference on Neural Information Processing Systems
This paper proposes

*affinity clustering*which is a hierarchical clustering algorithm based on Boruvka's minimum spanning tree algorithm. It is shown to be superior to many other clustering methods on a number of public and private data sets. In addition, the paper presents fast large-scale algorithms for affinity clustering.
A Polynomial Time Algorithm for Spatio-Temporal Games
[ EC 2017 ]

The 18

^{th}ACM Conference on Economics and Computation
This paper presents the first polynomial time algorithm for a well-studied variant of security games that is played out in space and time. Prior work obtained polynomial-time algorithms only for special cases of the problem e.g., when there are a constant number of timesteps.

Brief Announcement: Graph Matching in Massive Datasets
[ SPAA 2017 ]

The 29

^{th}ACM Symposium on Parallelism in Algorithms and Architectures
This paper presents a remarkably simple algorithm for approximate maximum matching of bipartite graphs in the streaming and Massively Parallel Computations models. The algorithm requires 1/ε rounds and O(n

^{1.5}) central space to achieve (1-ε) approximation.
Faster and Simpler Algorithm for Optimal Strategies of Blotto Game
[ AAAI 2017 ]

The 31

^{st}AAAI Conference on Artificial Intelligence
This paper presents a linear programming (LP) formulation for the well-studied Colonel Blotto game with polynomially many constraints. Prior work relied on the ellipsoid method to solve an exponential size LP. It is further shown that the number of constraints of the formulation is optimal.

Contact

Email: soheil.behnezhad [at] gmail.com