# Soheil Behnezhad

Hi, I am a Motwani Postdoctoral Fellow at the Computer Science Department of Stanford University.

I am broadly interested in theoretical computer science. Much of my work is on large-scale algorithms and revolves around massively parallel computation (à la MapReduce), graph sparsification, streaming algorithms, sublinear algorithms, and dynamic algorithms.

Before joining Stanford, I got my PhD in Computer Science from UMD where I was a Google PhD Fellow advised by MohammadTaghi Hajiaghayi. During my PhD, I spent a year at the Simons Institute of UC Berkeley, a summer at TTIC, and a summer at Google Research. I got my B.Sc. from Sharif University.

►

I am broadly interested in theoretical computer science. Much of my work is on large-scale algorithms and revolves around massively parallel computation (à la MapReduce), graph sparsification, streaming algorithms, sublinear algorithms, and dynamic algorithms.

Before joining Stanford, I got my PhD in Computer Science from UMD where I was a Google PhD Fellow advised by MohammadTaghi Hajiaghayi. During my PhD, I spent a year at the Simons Institute of UC Berkeley, a summer at TTIC, and a summer at Google Research. I got my B.Sc. from Sharif University.

►

**Potential students:**I will join the Computer Science department of Northeastern University as an Assistant Professor starting Fall 2022. Students with a strong background in CS/math that are interested in working with me can apply to Northeastern this fall and mention my name. Feel free to contact me directly if you have questions.
Publications

Modern Large-Scale Algorithms for Classical Graph Problems
(PhD Thesis)

Although computing power has advanced at an astonishing rate, it has been far outpaced by the growing scale of data. This has led to an abundance of algorithmic problems where the input tends to be, by orders of magnitude, larger than the memory available on a single machine. The challenges of data processing at this scale are inherently different from those of traditional algorithms. For instance, without having the whole input properly stored in the memory of a single machine, it is unrealistic to assume that any arbitrary location of the input can be accessed at the same cost; an assumption that is essential for traditional algorithms. In this thesis, we focus on modern computational models that capture these challenges more accurately, and devise new algorithms for several classical graph problems.

Specifically, we study models of computation that only allow the algorithm to use sublinear resources (such as time, space, or communication). Examples include (i) massively parallel computation algorithms where the workload is distributed among several machines each with sublinear space/communication, (ii) sublinear-time algorithms that take time sublinear in the input size, (iii) streaming algorithms that take only few passes over the input having access to a sublinear space, and (iv) dynamic algorithms that maintain a property of a dynamically changing input using a sublinear time per update.

We propose new algorithms for classical graph problems such as maximum/maximal matching, maximal independent set, minimum vertex cover, and graph connectivity in these models that substantially improve upon the state-of-the-art and are in many cases optimal. Many of our algorithms build on model-independent tools and ideas that are of independent interest and lead to improved bounds in more than one of the aforementioned settings.

Specifically, we study models of computation that only allow the algorithm to use sublinear resources (such as time, space, or communication). Examples include (i) massively parallel computation algorithms where the workload is distributed among several machines each with sublinear space/communication, (ii) sublinear-time algorithms that take time sublinear in the input size, (iii) streaming algorithms that take only few passes over the input having access to a sublinear space, and (iv) dynamic algorithms that maintain a property of a dynamically changing input using a sublinear time per update.

We propose new algorithms for classical graph problems such as maximum/maximal matching, maximal independent set, minimum vertex cover, and graph connectivity in these models that substantially improve upon the state-of-the-art and are in many cases optimal. Many of our algorithms build on model-independent tools and ideas that are of independent interest and lead to improved bounds in more than one of the aforementioned settings.

Time-Optimal Sublinear Algorithms for Matching and Vertex Cover
[
FOCS 2021
]

This paper gives a near-tight analysis of the average "query complexity" of the randomized greedy maximal matching algorithm.
This leads to a number of time-optimal algorithms for approximating the size of maximum matching and minimum vertex cover in sublinear time.

On the Robust Communication Complexity of Bipartite Matching
[
RANDOM 2021
]

We study the "robust communication complexity" of the maximum bipartite matching problem. The edges of an adversarially chosen n-vertex bipartite graph G are partitioned randomly between Alice and Bob. Alice has to send a single message to Bob, using which Bob has to output an approximate maximum matching of G. We present a new protocol obtaining a 0.716 approximation with O(n) communication, improving over the previous close-to-2/3 approximations.

Beating Two-Thirds For Random-Order Streaming Matching
[
ICALP 2021
]

We study matchings in the random-order streaming setting. We prove that for an absolute constant ε>0, one can find a (2/3+ε)-approximate maximum matching of any n-vertex graph using O(n log n) space with high probability. This breaks the natural boundary of 2/3 for this problem prevalent in the prior work and resolves an open problem of Bernstein [ICALP'20].

Stochastic Weighted Matching: (1 - ε) Approximation
[
FOCS 2020
]

In this paper, we settle the approximation factor of the stochastic weighted matching problem, showing a (1-ε) approximation is achievable for any constant ε > 0. Previously, only close to 0.5-approximations were known.

Stochastic Matching with Few Queries: (1 - ε) Approximation
[
STOC 2020
]

52

^{nd}Annual ACM Symposium on Theory of Computing
In this paper, we settle the approximation factor of the unweighted stochastic matching problem. We provide an analysis of a non-adaptive algorithm showing that it obtains a (1-ε) approximation for any constant ε > 0. The best known approximation prior to our work was 2/3.

Parallel Graph Algorithms in Constant Adaptive Rounds: Theory meets Practice
[
VLDB 2020
]

In this paper, we propose theoretically improved algorithms for several important graph problems in the
Adaptive Massively Parallel Computations (AMPC) setting, which we introduced
in an earlier paper and also evaluate them empirically.

Fully Dynamic Matching: Beating 2-Approximation in ∆

^{ε}Update Time [ SODA 2020 ]
The 31

^{st}Annual ACM-SIAM Symposium on Discrete Algorithms
For fully dynamic graphs, we have algorithms that maintain a 2-approximation of maximum matching extremely fast. But all known algorithms for general graphs maintaining a better-than-2 approximate matching require a large polynomial update time. In this paper, we show that in this regime, the update time can be improved to any arbitrarily small polynomial. Namely, we give an algorithm that for any desirably small constant ε>0, takes only O(∆

^{ε}) update-time.
Fully Dynamic Maximal Independent Set with Polylogarithmic Update Time
[ FOCS 2019 ]

60th Annual IEEE Symposium on Foundations of Computer Science

This paper presents the first algorithm for maintaining a maximal independent set (and related problems) in fully dynamic graphs that takes polylogarithmic time per update. Previous algorithms required polynomial update time.

Exponentially Faster Massively Parallel Maximal Matching
[ FOCS 2019 ]

60th Annual IEEE Symposium on Foundations of Computer Science

This paper presents exponentially faster Massively Parallel Computation (MPC) algorithms for maximal matching. This is achieved by providing an analysis of a natural algorithm which was conjectured to work by Czumaj et al. [STOC 2018]. The analysis reveals surprising connections between MPC and the notion of

*query-complexity*developed for sublinear algorithms.
Near-Optimal Massively Parallel Graph Connectivity
[ FOCS 2019 ]

60th Annual IEEE Symposium on Foundations of Computer Science

This paper presents a strongly sublinear space Massively Parallel Computation algorithm for
graph connectivity which for graphs with diameter D > log

^{ε}n, takes O(log D) rounds and takes O(log log n) rounds on other graphs. This improves upon an algorithm of Andoni et al. [FOCS 2019] and almost settles the problem due to an Ω(log D) conditional lower bound.
Streaming and Massively Parallel Algorithms for Edge Coloring
[ ESA 2019 ]

27th Annual European Symposium on Algorithms

This paper initiates the study of edge coloring in the streaming model. In addition, an improved Massively Parallel Computation algorithm for edge coloring is presented.

Stochastic Matching on Uniformly Sparse Graphs
[ SAGT 2019 ]

12th International Symposium on Algorithmic Game Theory

Massively Parallel Computation of Matching and MIS in Sparse Graphs
[ PODC 2019 ]

ACM Symposium on Principles of Distributed Computing

This paper presents strongly sublinear space Massively Parallel Computation algorithms for
maximal matching and maximal independent set that for graphs with polylogarithmic arboricity (which includes most families of sparse graphs such as minor-free graphs and bounded-genus graphs) take O(log

^{2}log n) rounds. Previous algorithms required polylogarithmic rounds.
Optimal Strategies of Blotto Games: Beyond Convexity
[ EC 2019 ]

The 20

^{th}ACM Conference on Economics and Computation
This paper presents algorithms providing near-optimal strategies for the Colonel Blotto game that in addition have small support size. Previous algorithms produced strategies with large support. While bounding the support size makes the solution space non-convex (and thus prevents the use of convex programming), we show through a set of structural results that the space can be decomposed into polynomially many disjoint convex polytopes that can be solved independently.

Massively Parallel Computation via Remote Memory Access
[ SPAA 2019 ]

The 31

^{st}ACM Symposium on Parallelism in Algorithms and Architectures
Invited to TOPC, special issue for SPAA'19.

This paper introduces the

*Adaptive Massively Parallel Computations (AMPC)*model which is an extension of the Massively Parallel Computations (MPC) model. Compared to MPC, AMPC allows the machines to also adaptively query a read-only data store within a round, subject to the same communication bounds as in MPC. This feature is inspired by RDMA systems that are widely available. We give AMPC algorithms that are significantly faster than the state-of-the-art for MPC.
Stochastic Matching with Few Queries: New Algorithms and Tools
[ SODA 2019 ]

The 30

^{th}Annual ACM-SIAM Symposium on Discrete Algorithms
This paper presents improved non-adaptive algorithms for the stochastic matching problem. This is achieved by analysing a natural randomized algorithm that differs from those in the literature in a fundamental way. The improvement for unweighted graphs is significant: from the previous close-to-half approximations to 0.6568 approximation. For weighted graphs, the approximation factor is at least 0.501, which is the first to break half.

Almost Optimal Stochastic Weighted Matching With Few Queries
[ EC 2018 ]

The 19

^{th}ACM Conference on Economics and Computation
This paper presents improved stochastic matching algorithms for weighted graphs. The main result is an adaptive algorithm that obtains a (1-ε) approximation using a constant number of queries per vertex. Previous algorithms for weighted graphs required super-constant queries.

Spatio-Temporal Games Beyond One Dimension
[ EC 2018 ]

The 19

^{th}ACM Conference on Economics and Computation
This paper considers a generalization of spatio-temporal games from a one-dimensional space to graphs. An approximation algorithm (that relaxes the number of resources) is given for general graphs and a dependent-rounding technique is also shown to provide an algorithm for the one-dimensional case that simplifies the prior work.

Brief Announcement: MapReduce Algorithms on Massive Trees
[ ICALP 2018 ]

The 45

^{th}International Colloquium on Automata, Languages, and Programming
Winning Strategies of Blotto and Auditing Games
[ SODA 2018 ]

The 29

^{th}Annual ACM-SIAM Symposium on Discrete Algorithms
This paper introduces a notion of (u, p)-maxmin strategies which guarantee receiving a minimum utility of u with probability at least p. It is shown that (u, p)-maximin strategies of the Colonel Blotto and auditing games can be well-approximated in polynomial time.

Affinity Clustering: Hierarchical Clustering at Scale
[ NIPS 2017 ]

The 30

^{th}Annual Conference on Neural Information Processing Systems
This paper proposes

*affinity clustering*which is a hierarchical clustering algorithm based on Boruvka's minimum spanning tree algorithm. It is shown to be superior to many other clustering methods on a number of public and private data sets. In addition, the paper presents fast large-scale algorithms for affinity clustering.
A Polynomial Time Algorithm for Spatio-Temporal Games
[ EC 2017 ]

The 18

^{th}ACM Conference on Economics and Computation
This paper presents the first polynomial time algorithm for a well-studied variant of security games that is played out in space and time. Prior work obtained polynomial-time algorithms only for special cases of the problem e.g., when there are a constant number of timesteps.

Brief Announcement: Graph Matching in Massive Datasets
[ SPAA 2017 ]

The 29

^{th}ACM Symposium on Parallelism in Algorithms and Architectures
This paper presents a remarkably simple algorithm for approximate maximum matching of bipartite graphs in the streaming and Massively Parallel Computations models. The algorithm requires 1/ε rounds and O(n

^{1.5}) central space to achieve (1-ε) approximation.
Faster and Simpler Algorithm for Optimal Strategies of Blotto Game
[ AAAI 2017 ]

The 31

^{st}AAAI Conference on Artificial Intelligence
This paper presents a linear programming (LP) formulation for the well-studied Colonel Blotto game with polynomially many constraints. Prior work relied on the ellipsoid method to solve an exponential size LP. It is further shown that the number of constraints of the formulation is optimal.

Visits/Internships

Research intern at TTIC, Chicago — Summer 2020

Host: Avrim Blum

Research intern at Google Research, New York — Summer 2019

Hosts: Jakub Łącki, Vahab Mirrokni

Visiting graduate student at the Simons Institute, UC Berkeley — Fall 2018

Program: Foundations of Data Science

Research intern at Upwork, Mountain View — Summer 2018

Host: Nima Reyhani

Visiting graduate student at the Simons Institute, UC Berkeley — Spring 2018

Research intern at Upwork, Mountain View — Summer 2017

Host: Nima Reyhani

Contact

Email: soheil.behnezhad [at] gmail.com