Soheil Behnezhad
I am an Assistant Professor of Computer Science at Northeastern University.
I am broadly interested in theoretical computer science. Much of my work focuses on the theoretical foundations of big data algorithms. This includes sublinear time algorithms, parallel algorithms, streaming algorithms, dynamic algorithms, and graph sparsification.
Before joining Northeastern, I was a Motwani postdoc at Stanford. I got my PhD from UMD and my BSc from Sharif University.
I am broadly interested in theoretical computer science. Much of my work focuses on the theoretical foundations of big data algorithms. This includes sublinear time algorithms, parallel algorithms, streaming algorithms, dynamic algorithms, and graph sparsification.
Before joining Northeastern, I was a Motwani postdoc at Stanford. I got my PhD from UMD and my BSc from Sharif University.
Office: WVH 348
E-mail: s.behnezhad@northeastern.edu
Teaching
- (Fall '24) CS3000: Algorithms and Data
- (Spr '24) CS3000: Algorithms and Data
- (Fall '23) CS5800: Algorithms
- (Fall '22) CS7880: Special Topics in TCS — Algorithms for Big Data
Advising
- (PhD '22—) Amir Azarmehr
- (PhD '22—) Mohammad Saneian
- (PhD '23—) Alma Ghafari
Service
Misc
Recent Preprints
46
Vizing’s Theorem in Near-Linear Time
(Oct 2024)
Vizing's theorem from 1964 states that any graph of maximum degree ∆ admits a ∆+1 edge coloring. We present the first near-linear time algorithm that finds such a coloring.
45
Stochastic Matching via In-n-Out Local Computation Algorithms
(Nov 2024)
We prove that a polynomial degree (in inverse of realization probability) subgraph can obtain a (1-epsilon)-approximation for stochastic mtaching, improving over the previous
quasi-polynomial degree bound of [B., Derakhshan, Hajiaghayi STOC'20].
Publications
We settle the complexity of approximate maximum matching in dynamic streams. Our upper bound improves prior logarithmic rounds exponentially to O(log log n), and the lower bound matches this.
This paper presents improved upper and lower bounds for finding the minimum spanning tree (MST) of a general metric space.
We present improved bounds for (∆ + 1) vertex coloring fully dynamic graphs against adaptive adversaries.
We relate the complexity of maintaining an approximate maximum matching in a dynamic graph to an extremal combinatorics object that we call Oredered Ruzsa-Szemeredi (ORS) graphs.
We present a tight analysis of EDCS, which is a versatile matching sparsifier, in bipartite graphs. We show that, surprisingly, the EDCS obtains a strictly better than 2/3-approximation with certain parameters.
We present the first streaming algorithm that finds an O(∆) edge-coloring using subquadratic space.
We present improved sublinear time algorithms for estimating the cost of TSP in graphic and (1,2) metrics.
We prove that near-quadratic in n time is needed to approximate the size of maximum matching within an additive error of εn.
We present a (2-√2)-approximate algorithm for maximum matching in general graphs which has implications in dynamic and streaming settings. Our algorithm removes a large gap that previously existed between general and bipartite graphs.
We prove that any local computation algorithm (LCA) for computing a (1-ε) approximate maximum matching in graphs of maximum degree ∆ needs to spend at least ∆Ω(1/ε) time per query. This resolves a decade old open problem of the area.
We study the "robust communication complexity" of the maximum matching problem. The edges of an adversarially chosen n-vertex graph G are partitioned randomly between Alice and Bob. Alice has to send a single message to Bob, using which Bob has to output an approximate maximum matching of G. We present a new (tight) analysis of a known protocol by Bernstein, proving that it obtains a 5/6 approximation with O(n) communication, significantly improving prior approximations.
In this paper, we present the first super linear in n lower bound for approximating maximum matching size. We also present improved upper bounds.
This paper shows that Szemeredi's celebrated regularity lemma can be used to obtain non-trivial albeit slight improvements over longstanding barriers for matchings in streaming and dynamic graphs.
Best Paper Award at SODA '23.
Invited to TALG, special issue for SODA '23.
This paper improves two longstanding approximation/update-time trade-offs for maximum matching in fully dynamic graphs when the goal is to maintain just the size of the matching.
We give an algorithm that (1/2+Ω(1))-approximates the size of a maximum matching in O(n1+ε) time. No subquadratic time algorithm for beating 1/2-approximation was known prior to our work.
This paper presents improved algorithms for min-disagreement correlation clustering in a single pass of the streaming setting.
We show that a (3+ε)-approximation of correlation clustering can be found in O(1/ε) rounds. This is a culminating point for the rich literature on parallel correlation clustering as the approximation matches a natural bound and the round complexity is essentially constant.
We study the maximum matching problem in fully dynamic graphs where a graph is undergoing both edge insertions and deletions, and the goal is to efficiently maintain a large matching after each edge update. This problem has received considerable attention in recent years. The known algorithms naturally exhibit a trade-off between the quality of the matching maintained (i.e., the approximation ratio) and the time needed per update. While several interesting results have been obtained, the optimal behavior of this trade-off remains largely unclear. Our main contribution is a new approach to designing fully dynamic approximate matching algorithms that in a unified manner not only (essentially) recovers all previously known tradeoffs that were achieved via very different techniques, but reveals some new ones as well.
We study the problem of finding a minimum vertex cover (MVC) of a random subgraph of a given graph. The algorithm is unaware of this random subgraph
but can learn if an edge of the base graph exists in it by querying it. The goal is to find an approximate MVC by querying few edges. This stochastic setting has been studied extensively for various problems such as minimum spanning trees, matroids, shortest paths, and matchings. However, no non-trivial bound was known for MVC prior to our work. We show in this work that a constant number of queries per vertex suffice to obtain good approximations for the stochastic MVC problem.
25
Modern Large-Scale Algorithms for Classical Graph Problems
( PhD Thesis )
Although computing power has advanced at an astonishing rate, it has been far outpaced by the growing scale of data. This has led to an abundance of algorithmic problems where the input tends to be, by orders of magnitude, larger than the memory available on a single machine. The challenges of data processing at this scale are inherently different from those of traditional algorithms. For instance, without having the whole input properly stored in the memory of a single machine, it is unrealistic to assume that any arbitrary location of the input can be accessed at the same cost; an assumption that is essential for traditional algorithms. In this thesis, we focus on modern computational models that capture these challenges more accurately, and devise new algorithms for several classical graph problems.
Specifically, we study models of computation that only allow the algorithm to use sublinear resources (such as time, space, or communication). Examples include (i) massively parallel computation algorithms where the workload is distributed among several machines each with sublinear space/communication, (ii) sublinear-time algorithms that take time sublinear in the input size, (iii) streaming algorithms that take only few passes over the input having access to a sublinear space, and (iv) dynamic algorithms that maintain a property of a dynamically changing input using a sublinear time per update.
We propose new algorithms for classical graph problems such as maximum/maximal matching, maximal independent set, minimum vertex cover, and graph connectivity in these models that substantially improve upon the state-of-the-art and are in many cases optimal. Many of our algorithms build on model-independent tools and ideas that are of independent interest and lead to improved bounds in more than one of the aforementioned settings.
Specifically, we study models of computation that only allow the algorithm to use sublinear resources (such as time, space, or communication). Examples include (i) massively parallel computation algorithms where the workload is distributed among several machines each with sublinear space/communication, (ii) sublinear-time algorithms that take time sublinear in the input size, (iii) streaming algorithms that take only few passes over the input having access to a sublinear space, and (iv) dynamic algorithms that maintain a property of a dynamically changing input using a sublinear time per update.
We propose new algorithms for classical graph problems such as maximum/maximal matching, maximal independent set, minimum vertex cover, and graph connectivity in these models that substantially improve upon the state-of-the-art and are in many cases optimal. Many of our algorithms build on model-independent tools and ideas that are of independent interest and lead to improved bounds in more than one of the aforementioned settings.
This paper gives a near-tight analysis of the average "query complexity" of the randomized greedy maximal matching algorithm.
This leads to a number of time-optimal algorithms for approximating the size of maximum matching and minimum vertex cover in sublinear time.
We study the "robust communication complexity" of the maximum bipartite matching problem. The edges of an adversarially chosen n-vertex bipartite graph G are partitioned randomly between Alice and Bob. Alice has to send a single message to Bob, using which Bob has to output an approximate maximum matching of G. We present a new protocol obtaining a 0.716 approximation with O(n) communication, improving over the previous close-to-2/3 approximations.
We study matchings in the random-order streaming setting. We prove that for an absolute constant ε>0, one can find a (2/3+ε)-approximate maximum matching of any n-vertex graph using O(n log n) space with high probability. This breaks the natural boundary of 2/3 for this problem prevalent in the prior work and resolves an open problem of Bernstein [ICALP '20].
In this paper, we settle the approximation factor of the stochastic weighted matching problem, showing a (1-ε) approximation is achievable for any constant ε > 0. Previously, only close to 0.5-approximations were known.
52nd Annual ACM Symposium on Theory of Computing
In this paper, we settle the approximation factor of the unweighted stochastic matching problem. We provide an analysis of a non-adaptive algorithm showing that it obtains a (1-ε) approximation for any constant ε > 0. The best known approximation prior to our work was 2/3.
In this paper, we propose theoretically improved algorithms for several important graph problems in the
Adaptive Massively Parallel Computations (AMPC) setting, which we introduced
in an earlier paper and also evaluate them empirically.
The 31st Annual ACM-SIAM Symposium on Discrete Algorithms
For fully dynamic graphs, we have algorithms that maintain a 2-approximation of maximum matching extremely fast. But all known algorithms for general graphs maintaining a better-than-2 approximate matching require a large polynomial update time. In this paper, we show that in this regime, the update time can be improved to any arbitrarily small polynomial. Namely, we give an algorithm that for any desirably small constant ε>0, takes only O(∆ε) update-time.
60th Annual IEEE Symposium on Foundations of Computer Science
This paper presents exponentially faster Massively Parallel Computation (MPC) algorithms for maximal matching. This is achieved by providing an analysis of a natural algorithm which was conjectured to work by Czumaj et al. [STOC '18]. The analysis reveals surprising connections between MPC and the notion of query-complexity developed for sublinear algorithms.
60th Annual IEEE Symposium on Foundations of Computer Science
This paper presents the first algorithm for maintaining a maximal independent set (and related problems) in fully dynamic graphs that takes polylogarithmic time per update. Previous algorithms required polynomial update time.
60th Annual IEEE Symposium on Foundations of Computer Science
This paper presents a strongly sublinear space Massively Parallel Computation algorithm for
graph connectivity which for graphs with diameter D > logεn, takes O(log D) rounds and takes O(log log n) rounds on other graphs. This improves upon an algorithm of Andoni et al. [FOCS '19] and almost settles the problem due to an Ω(log D) conditional lower bound.
27th Annual European Symposium on Algorithms
This paper initiates the study of edge coloring in the streaming model. In addition, an improved Massively Parallel Computation algorithm for edge coloring is presented.
12th International Symposium on Algorithmic Game Theory
ACM Symposium on Principles of Distributed Computing
This paper presents strongly sublinear space Massively Parallel Computation algorithms for
maximal matching and maximal independent set that for graphs with polylogarithmic arboricity (which includes most families of sparse graphs such as minor-free graphs and bounded-genus graphs) take O(log2 log n) rounds. Previous algorithms required polylogarithmic rounds.
The 20th ACM Conference on Economics and Computation
This paper presents algorithms providing near-optimal strategies for the Colonel Blotto game that in addition have small support size. Previous algorithms produced strategies with large support. While bounding the support size makes the solution space non-convex (and thus prevents the use of convex programming), we show through a set of structural results that the space can be decomposed into polynomially many disjoint convex polytopes that can be solved independently.
The 31st ACM Symposium on Parallelism in Algorithms and Architectures
Invited to TOPC, special issue for SPAA '19.
This paper introduces the Adaptive Massively Parallel Computations (AMPC) model which is an extension of the Massively Parallel Computations (MPC) model. Compared to MPC, AMPC allows the machines to also adaptively query a read-only data store within a round, subject to the same communication bounds as in MPC. This feature is inspired by RDMA systems that are widely available. We give AMPC algorithms that are significantly faster than the state-of-the-art for MPC.
The 30th Annual ACM-SIAM Symposium on Discrete Algorithms
This paper presents improved non-adaptive algorithms for the stochastic matching problem. This is achieved by analysing a natural randomized algorithm that differs from those in the literature in a fundamental way. The improvement for unweighted graphs is significant: from the previous close-to-half approximations to 0.6568 approximation. For weighted graphs, the approximation factor is at least 0.501, which is the first to break half.
The 19th ACM Conference on Economics and Computation
This paper presents improved stochastic matching algorithms for weighted graphs. The main result is an adaptive algorithm that obtains a (1-ε) approximation using a constant number of queries per vertex. Previous algorithms for weighted graphs required super-constant queries.
The 19th ACM Conference on Economics and Computation
This paper considers a generalization of spatio-temporal games from a one-dimensional space to graphs. An approximation algorithm (that relaxes the number of resources) is given for general graphs and a dependent-rounding technique is also shown to provide an algorithm for the one-dimensional case that simplifies the prior work.
The 45th International Colloquium on Automata, Languages, and Programming
The 29th Annual ACM-SIAM Symposium on Discrete Algorithms
This paper introduces a notion of (u, p)-maxmin strategies which guarantee receiving a minimum utility of u with probability at least p. It is shown that (u, p)-maximin strategies of the Colonel Blotto and auditing games can be well-approximated in polynomial time.
The 30th Annual Conference on Neural Information Processing Systems
This paper proposes affinity clustering which is a hierarchical clustering algorithm based on Boruvka's minimum spanning tree algorithm. It is shown to be superior to many other clustering methods on a number of public and private data sets. In addition, the paper presents fast large-scale algorithms for affinity clustering.
The 18th ACM Conference on Economics and Computation
This paper presents the first polynomial time algorithm for a well-studied variant of security games that is played out in space and time. Prior work obtained polynomial-time algorithms only for special cases of the problem e.g., when there are a constant number of timesteps.
The 29th ACM Symposium on Parallelism in Algorithms and Architectures
This paper presents a remarkably simple algorithm for approximate maximum matching of bipartite graphs in the streaming and Massively Parallel Computations models. The algorithm requires 1/ε rounds and O(n1.5) central space to achieve (1-ε) approximation.
The 31st AAAI Conference on Artificial Intelligence
This paper presents a linear programming (LP) formulation for the well-studied Colonel Blotto game with polynomially many constraints. Prior work relied on the ellipsoid method to solve an exponential size LP. It is further shown that the number of constraints of the formulation is optimal.