Chao Xu | 许超

Photo of Chao Xu

I'm a research scientist at Yahoo! Research in New York. My research interests are algorithms, combinatorial optimization and computational geometry. Here is my CV and old research statement.

In May 2018, I received my PhD in Computer Science from the University of Illinois at Urbana-Champaign. My advisors were Karthik Chandrasekaran and Chandra Chekuri. I did my undergrad in mathematics at Stony Brook University.

I maintain a github account and a blog. I frequent cstheory.

You can contact me by email at chao.xu@oath.com.

News: Our group is looking for 2019 summer interns. Theory CS PhD students are welcome.
Recent Manuscripts

* Some manuscripts are available upon request.

LP Relaxation and Tree Packing for Minimum $k$-cuts
(with Chandra Chekuri and Kent Quanrud)
2018, Submitted.

Karger used spanning tree packings to derive a near linear-time randomized algorithm for the global minimum cut problem as well as a bound on the number of approximate minimum cuts. This is a different approach from his well-known random contraction algorithm. Thorup developed a fast deterministic algorithm for the minimum $k$-cut problem via greedy recursive tree packings.
In this paper we revisit properties of an LP relaxation for $k$-cut proposed by Naor and Rabani, and analyzed by Chekuri, Guha and Naor. We show that the dual of the LP yields a tree packing that, when combined with an upper bound on the integrality gap for the LP, easily and transparently extends Karger's analysis for mincut to the $k$-cut problem. In addition to the simplicity of the algorithm and its analysis, this allows us to improve the running time of Thorup's algorithm by a factor of $n$. We also improve the bound on the number of $\alpha$-approximate $k$-cuts. Second, we give a simple proof that the integrality gap of the LP is $2(1-1/n)$. Third, we show that an optimum solution to the LP relaxation, for all values of $k$, is fully determined by the principal sequence of partitions of the input graph. This allows us to relate the LP relaxation to the Lagrangean relaxation approach of Barahona and of Ravi and Sinha; it also shows that the idealized recursive tree packing considered by Thorup gives an optimum dual solution to the LP. This work arose from an effort to understand and simplify the results of Thorup.

Subset Sum Made Simple
(with Konstantinos Koiliaris)
2018, Submitted.

Subset Sum is a classical optimization problem taught to undergraduates as an example of an NP-hard problem, which is amenable to dynamic programming, yielding polynomial running time if the input numbers are relatively small. Formally, given a set $S$ of $n$ positive integers and a target integer $t$, the Subset Sum problem is to decide if there is a subset of $S$ that sums up to $t$. Dynamic programming yields an algorithm with running time $O(nt)$. Recently, the authors [Koiliaris & Xu SODA '17] improved the running time to $\tilde{O}\bigl(\sqrt{n}t\bigr)$, and it was further improved to $\tilde{O}\bigl(n+t\bigr)$ by a somewhat involved randomized algorithm by Bringmann [Bringmann SODA '17], where $\tilde{O}$ hides polylogarithmic factors.
Here, we present a new and significantly simpler algorithm with running time $\tilde{O}\bigl(\sqrt{n}t\bigr)$. While not the fastest known algorithm, we believe the new algorithm and analysis are simple enough to be presented in an algorithms class, as a striking example of a divide-and-conquer algorithm that applies FFT to a problem that seems (at first) unrelated. In particular, the algorithm and its analysis can be described in full detail in two pages (see pages 3-5).
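
A minimal sketch of the divide-and-conquer-with-FFT idea (illustrative only, not the paper's pseudocode): split the input in half, compute the attainable sums of each half capped at $t$, and merge the two halves with an FFT-based convolution of indicator vectors. The bookkeeping that yields the $\tilde{O}\bigl(\sqrt{n}t\bigr)$ bound is omitted.

```python
import numpy as np

def subset_sums_up_to(nums, t):
    """All subset sums of `nums` that are at most t, by divide and conquer with
    FFT-based merging. Illustrative sketch only; the bookkeeping behind the
    O~(sqrt(n) t) bound is omitted."""
    def merge(a, b):
        # Sumset of two sets of attainable sums, capped at t, via convolution
        # of their 0/1 indicator vectors.
        fa = np.zeros(t + 1)
        fb = np.zeros(t + 1)
        fa[list(a)] = 1.0
        fb[list(b)] = 1.0
        conv = np.fft.irfft(np.fft.rfft(fa, 2 * t + 2) * np.fft.rfft(fb, 2 * t + 2), 2 * t + 2)
        return {s for s in range(t + 1) if conv[s] > 0.5}

    def solve(lo, hi):
        if hi - lo == 1:
            return {0, nums[lo]} if nums[lo] <= t else {0}
        mid = (lo + hi) // 2
        return merge(solve(lo, mid), solve(mid, hi))

    return solve(0, len(nums)) if nums else {0}

# Decision version: is there a subset summing to exactly t?
print(13 in subset_sums_up_to([3, 5, 8, 11], 13))  # True (5 + 8)
```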

Minimum violation vertex maps and their applications to cut problems
(with Ken-ichi Kawarabayashi)
2018, Submitted.

A Polynomial Time Algorithm to Minimize Total Travel Time in $k$-Depot Storage/Retrieval System
(with Amir Gharehgozli and Wenda Zhang)
2018, Submitted.

A near-linear time algorithm for computing the optimal landing times of a fixed sequence of planes
(with Bin Cao)
2018, Submitted.

Conference Publications

Hypergraph $k$-Cut in Randomized Polynomial Time
(with Karthekeyan Chandrasekaran and Xilin Yu)
SODA 2018.

In the hypergraph $k$-cut problem, the input is a hypergraph, and the goal is to find a smallest subset of hyperedges whose removal ensures that the remaining hypergraph has at least $k$ connected components. This problem is known to be at least as hard as the densest $k$-subgraph problem when $k$ is part of the input (Chekuri-Li, 2015). We present a randomized polynomial time algorithm to solve the hypergraph $k$-cut problem for constant $k$.

Our algorithm solves the more general hedge $k$-cut problem when the subgraph induced by every hedge has a constant number of connected components. In the hedge $k$-cut problem, the input is a hedgegraph specified by a vertex set and a set of disjoint hedges, where each hedge is a subset of edges defined over the vertices. The goal is to find a smallest subset of hedges whose removal ensures that the number of connected components in the remaining underlying (multi-)graph is at least $k$.

Our algorithm is based on random contractions akin to Karger's min cut algorithm. Our main technical contribution is a distribution over the hedges (hyperedges) so that random contraction of hedges (hyperedges) chosen from the distribution succeeds in returning an optimum solution with large probability.
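
For context, here is a minimal sketch of the classical graph-case contraction algorithm (Karger's algorithm) that the hedge/hyperedge contraction generalizes. The contraction distribution for hedges, which is the paper's main technical contribution, is not reproduced here.

```python
import random

def karger_global_min_cut(edges, trials=None):
    """Karger's random-contraction algorithm for the global min cut of an
    unweighted multigraph, given as a list of (u, v) edges. This is the classical
    graph case only, not the hedge/hyperedge contraction from the paper."""
    vertices = {v for e in edges for v in e}
    n = len(vertices)
    if trials is None:
        trials = n * n  # ~n^2 repetitions give constant success probability

    best = float("inf")
    for _ in range(trials):
        parent = {v: v for v in vertices}

        def find(v):
            while parent[v] != v:
                parent[v] = parent[parent[v]]  # path halving
                v = parent[v]
            return v

        # Processing edges in a uniformly random order and skipping self-loops
        # is equivalent to repeatedly contracting a uniformly random edge.
        order = list(edges)
        random.shuffle(order)
        remaining = n
        for u, v in order:
            if remaining == 2:
                break
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv  # contract the edge
                remaining -= 1
        # Edges crossing the two final super-vertices form a cut.
        cut = sum(1 for u, v in edges if find(u) != find(v))
        best = min(best, cut)
    return best
```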

Note: Journal version under submission.

Global and fixed-terminal cuts in digraphs
(with Kristóf Bérczi, Karthekeyan Chandrasekaran, Tamás Király and Euiwoong Lee)
APPROX 2017.
Note: The results on global bicut appeared in journal form as Beating the 2-approximation factor for global bicut in Mathematical Programming.

The computational complexity of multicut-like problems may vary significantly depending on whether the terminals are fixed or not. In this work we present a comprehensive study of this phenomenon in two types of cut problems in directed graphs: double cut and bicut.
  1. The fixed-terminal edge-weighted double cut is known to be solvable efficiently. We show a tight approximability factor of $2$ for the fixed-terminal node-weighted double cut. We show that the global node-weighted double cut cannot be approximated to a factor smaller than $\frac{3}{2}$ under the Unique Games Conjecture (UGC).
  2. The fixed-terminal edge-weighted bicut is known to have a tight approximability factor of $2$. We show that the global edge-weighted bicut is approximable to a factor strictly better than $2$, and that the global node-weighted bicut cannot be approximated to a factor smaller than $\frac{3}{2}$ under UGC.
  3. In relation to these investigations, we also prove two results on undirected graphs which are of independent interest. First, we show NP-completeness and a tight inapproximability bound of $\frac{4}{3}$ for the node-weighted $3$-cut problem. Second, we show that for constant $k$, there exists an efficient algorithm to solve the minimum $\{s,t\}$-separating $k$-cut problem.
Our algorithmic techniques are combinatorial, LP-based, and based on enumerating approximate min-cuts. Our hardness results are based on combinatorial reductions and integrality gap instances.

A Faster Pseudopolynomial Time Algorithm for Subset Sum
(with Konstantinos Koiliaris)
SODA 2017.
Note: Not practical. Discussion/implementation.

Given a multiset $S$ of $n$ positive integers and a target integer $t$, the subset sum problem is to decide if there is a subset of $S$ that sums up to $t$. We present a new divide-and-conquer algorithm that computes all the realizable subset sums up to an integer $u$ in $\tilde{O}\left(\min\{n\sqrt{u},u^{4/3},\sigma\}\right)$, where $\sigma$ is the sum of all elements in $S$ and $\tilde{O}$ hides polylogarithmic factors. This result improves upon the standard dynamic programming algorithm that runs in $O(nu)$ time. To the best of our knowledge, the new algorithm is the fastest general algorithm for this problem. We also present a modified algorithm for cyclic groups, which computes all the realizable subset sums within the group in $\tilde{O}\left(\min\{n\sqrt{m},m^{5/4}\}\right)$ time, where $m$ is the order of the group.
Covering $\mathbb{Z}^*_{11}$ with segments of length $3$.
Note: Journal version under submission.

Computing minimum cuts in hypergraphs
(with Chandra Chekuri)
SODA 2017.
Note: The SODA camera ready version has a bug in the sparsification section that is fixed in the arXiv update.

We study algorithmic and structural aspects of connectivity in hypergraphs. Given a hypergraph $H=(V,E)$ with $n=|V|$, $m=|E|$ and $p=\sum_{e\in E}|e|$, the best known algorithm to compute a global minimum cut in $H$ runs in time $O(np)$ for the uncapacitated case and in $O(np+n^2\log n)$ time for the capacitated case. We show the following new results.
  1. Given an uncapacitated hypergraph $H$ and an integer $k$ we describe an algorithm that runs in $O(p)$ time to find a subhypergraph $H'$ with sum of degrees $O(kn)$ that preserves all edge-connectivities up to $k$ (a $k$-sparsifier). This generalizes the corresponding result of Nagamochi and Ibaraki from graphs to hypergraphs. Using this sparsification we obtain an $O(p+\lambda n^2)$ time algorithm for computing a global minimum cut of $H$ where $\lambda$ is the minimum cut value.
  2. We generalize Matula's argument for graphs to hypergraphs and obtain a $(2+\epsilon)$-approximation to the global minimum cut in a capacitated hypergraph in $O(\frac{1}{\epsilon}(p+n \log n)\log n)$ time.
  3. We show that a hypercactus representation of all the global minimum cuts of a capacitated hypergraph can be computed in $O(np+n^2\log n)$ time and $O(p)$ space.
We utilize vertex-ordering-based ideas to obtain our results. Unlike for graphs, we observe that there are several different orderings for hypergraphs, which yield different insights.
Example of hyperedge insertion operation on vertex $v$.
Note: Journal version under submission.
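
For reference, a minimal sketch of the classical graph-case vertex-ordering algorithm (maximum adjacency orderings, as in Nagamochi-Ibaraki and Stoer-Wagner) that the hypergraph orderings generalize; this is the textbook graph algorithm, not any of the hypergraph variants from the paper.

```python
def stoer_wagner_min_cut(weights):
    """Global min cut of a weighted undirected graph via maximum adjacency (MA)
    orderings; `weights[u][v]` is the (symmetric) weight of edge uv. This is the
    classical graph algorithm that the hypergraph vertex orderings generalize."""
    g = {u: dict(nbrs) for u, nbrs in weights.items()}  # mutable copy; vertices get merged
    best = float("inf")
    while len(g) > 1:
        # Build an MA ordering from an arbitrary start vertex.
        start = next(iter(g))
        order = [start]
        attach = {v: g[start].get(v, 0) for v in g if v != start}  # weight to the ordered set
        while attach:
            z = max(attach, key=attach.get)  # most tightly connected unordered vertex
            order.append(z)
            del attach[z]
            for v, w in g[z].items():
                if v in attach:
                    attach[v] += w
        s, t = order[-2], order[-1]
        best = min(best, sum(g[t].values()))  # cut of the phase: {t} versus the rest
        # Merge t into s and continue.
        for v, w in list(g[t].items()):
            if v != s:
                g[s][v] = g[s].get(v, 0) + w
                g[v][s] = g[v].get(s, 0) + w
            g[v].pop(t, None)
        del g[t]
    return best

# Example: a 4-cycle with unit weights has global min cut 2.
cycle = {0: {1: 1, 3: 1}, 1: {0: 1, 2: 1}, 2: {1: 1, 3: 1}, 3: {2: 1, 0: 1}}
print(stoer_wagner_min_cut(cycle))  # 2
```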

On Element-Connectivity Preserving Graph Simplification
(with Chandra Chekuri and Thapanapong Rukkanchanunt)
ESA 2015.

The notion of element-connectivity has found several important applications in network design and routing problems. We focus on a reduction step that preserves element-connectivity and which, when applied repeatedly, allows one to reduce the original graph to a simpler one. This pre-processing step is a crucial ingredient in several applications. In this paper we revisit this reduction step and provide a new proof via the use of setpairs. Our main contributions are algorithmic results for several basic problems on element-connectivity, including the problem of achieving the aforementioned graph simplification. We utilize the underlying submodularity properties of element-connectivity to derive faster algorithms.
The black vertices are the terminals. The left image shows $4$ element-disjoint $st$-paths. The right image shows that removing $4$ elements disconnects $s$ and $t$; hence $\kappa(s,t) = 4$.

Detecting Weakly Simple Polygons
(with Jeff Erickson and Hsien-Chih Chang)
SODA 2015.

A closed curve in the plane is weakly simple if it is the limit (in the Fréchet metric) of a sequence of simple closed curves. We describe an algorithm to determine whether a closed walk of length $n$ in a simple plane graph is weakly simple in $O(n \log n)$ time, improving an earlier $O(n^3)$-time algorithm of Cortese et al. As an immediate corollary, we obtain the first efficient algorithm to determine whether an arbitrary $n$-vertex polygon is weakly simple; our algorithm runs in $O(n^2 \log n)$ time. We also describe algorithms that detect weak simplicity in $O(n \log n)$ time for two interesting classes of polygons. Finally, we discuss subtle errors in several previously published definitions of weak simplicity.

Dedicated with thanks to our colleague Ferran Hurtado (1951–2014).

A polygon $(a, b, c, a, b, c, a, x, y, z, x, y, z, x)$ that is not weakly simple, even though its rotation number is $1$ and every pair of vertices splits the polygon into two paths that do not cross.

Journal Publications

The shortest kinship description problem
(with Qian Zhang)
Information Processing Letters, 138 (2018), pp. 61-66.

We consider a problem in descriptive kinship systems, namely finding the shortest sequence of terms that describes the kinship between a person and his/her relatives. The problem reduces to finding a minimum-weight path in an edge-labeled graph whose label sequence belongs to a given regular language. The running time of the algorithm is $O(n^3+s)$, where $n$ and $s$ are the input size and the output size of the algorithm, respectively.
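
The reduction can be illustrated with a generic product construction: pair the edge-labeled graph with a DFA for the regular language and run Dijkstra on the product. The sketch below is illustrative (its interface and names are not the paper's), and it assumes non-negative edge weights.

```python
import heapq
from itertools import count

def min_weight_word_path(graph, delta, q0, accept, source, targets):
    """Minimum weight of a path from `source` to a vertex in `targets` whose edge
    labels spell a word accepted by a DFA. Generic product-graph sketch, not the
    paper's specific construction; assumes non-negative edge weights.

    graph:  dict u -> list of (v, label, weight)
    delta:  dict (state, label) -> state   (partial DFA transition function)
    """
    tie = count()  # tie-breaker so heap entries never compare vertices/states
    dist = {(source, q0): 0}
    heap = [(0, next(tie), source, q0)]
    while heap:
        d, _, u, q = heapq.heappop(heap)
        if d > dist.get((u, q), float("inf")):
            continue  # stale heap entry
        if u in targets and q in accept:
            return d
        for v, label, w in graph.get(u, ()):
            q2 = delta.get((q, label))
            if q2 is None:
                continue  # the DFA has no transition on this label
            nd = d + w
            if nd < dist.get((v, q2), float("inf")):
                dist[(v, q2)] = nd
                heapq.heappush(heap, (nd, next(tie), v, q2))
    return None  # no path whose label sequence is in the language
```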

To the memory of Jiaqi Zhao (1994–2016).

Beating the 2-approximation factor for global bicut
(with Kristóf Bérczi, Karthekeyan Chandrasekaran, Tamás Király and Euiwoong Lee)
Mathematical Programming, (2018).
Note: The preliminary version Global and fixed-terminal cuts in digraphs appeared in APPROX 2017.

In the fixed-terminal bicut problem, the input is a directed graph with two specified nodes $s$ and $t$, and the goal is to find a smallest subset of edges whose removal ensures that $s$ cannot reach $t$ and $t$ cannot reach $s$. In the global bicut problem, the input is a directed graph and the goal is to find a smallest subset of edges whose removal ensures that there exist two nodes $s$ and $t$ such that $s$ cannot reach $t$ and $t$ cannot reach $s$. Fixed-terminal bicut and global bicut are natural extensions of $\{s,t\}$-min cut and global min-cut, respectively, from undirected graphs to directed graphs. Fixed-terminal bicut is NP-hard, admits a simple $2$-approximation, and does not admit a $(2-\epsilon)$-approximation for any constant $\epsilon>0$ assuming the Unique Games Conjecture. In this work, we show that global bicut admits a $(2-1/448)$-approximation, thus improving on the approximability of the global variant in comparison to the fixed-terminal variant.

Reconstructing edge-disjoint paths faster
Operations Research Letters, 44 (2) (2016), pp. 174-176.

For a simple undirected graph with $n$ vertices and $m$ edges, we consider a data structure that, given a query consisting of a pair of vertices $u$, $v$ and an integer $k\geq 1$, returns $k$ edge-disjoint $uv$-paths. The data structure takes $\tilde{O}(n^{3.375})$ time to build, uses $O(mn^{1.5}\log n)$ space, and each query takes $O(kn)$ time, which is optimal and beats the previous query time of $O(kn\alpha(n))$.

Champion spiders in the game of Graph Nim
(with Neil J. Calkin and Janine E. Janoski and Allison Nelson and Sydney Ryan)
Congr. Numer., 218:5-19, 2013.

In the game of Graph Nim, players take turns removing one or more edges incident to a chosen vertex in a graph. The player who removes the last edge in the graph wins. A spider graph is a champion if its Sprague-Grundy number equals the number of edges in the graph. We investigate the Sprague-Grundy numbers of various spider graphs as the number of paths or the length of the paths increases.
A spider graph.
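
Sprague-Grundy values of small Graph Nim positions can be brute-forced directly from the definition; the sketch below (simple graphs only, exponential time) is purely illustrative.

```python
from functools import lru_cache
from itertools import combinations

@lru_cache(maxsize=None)
def grundy(edges):
    """Sprague-Grundy value of a Graph Nim position. `edges` is a frozenset of
    frozenset({u, v}) edges of a simple graph; a move removes a nonempty subset
    of the edges incident to one chosen vertex. Brute force, small graphs only."""
    if not edges:
        return 0
    reachable = set()
    for v in {x for e in edges for x in e}:
        incident = [e for e in edges if v in e]
        for r in range(1, len(incident) + 1):
            for removed in combinations(incident, r):
                reachable.add(grundy(edges - frozenset(removed)))
    g = 0  # mex: smallest non-negative integer not attained by any move
    while g in reachable:
        g += 1
    return g

# Example: a star with 3 edges. Every smaller position is reachable in one move
# at the center, so its Grundy value equals its edge count (a champion).
star3 = frozenset(frozenset(e) for e in [(0, 1), (0, 2), (0, 3)])
print(grundy(star3), "edges:", len(star3))
```
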
Manuscripts

A note on approximate strengths of edges in a hypergraph
(with Chandra Chekuri)
2017

Let $H=(V,E)$ be an edge-weighted hypergraph of rank $r$. Kogan and Krauthgamer extended Benczúr and Karger's random sampling scheme for cut sparsification from graphs to hypergraphs. The sampling requires an algorithm for computing the approximate strengths of edges. In this note we extend the algorithm for graphs to hypergraphs and describe a near-linear time algorithm to compute approximate strengths of edges; we build on a sparsification result for hypergraphs from our recent work. Combined with prior results we obtain faster algorithms for finding $(1+\epsilon)$-approximate mincuts when the rank of the hypergraph is small.

Marking Streets to Improve Parking Density
(with Steven Skiena)
2015

Street parking spots for automobiles are a scarce commodity in most urban environments. The heterogeneity of car sizes makes it inefficient to rigidly define fixed-sized spots. Instead, unmarked streets in cities like New York leave placement decisions to individual drivers, who have no direct incentive to maximize street utilization. In this paper, we explore the effectiveness of two different behavioral interventions designed to encourage better parking, namely (1) educational campaigns to encourage parkers to "kiss the bumper" and reduce the distance between themselves and their neighbors, or (2) painting appropriately-spaced markings on the street and urging drivers to "hit the line". Through analysis and simulation, we establish that the greatest densities are achieved when lines are painted to create spots roughly twice the length of average-sized cars. Kiss-the-bumper campaigns are in principle more effective than hit-the-line for equal degrees of compliance, although we believe that the visual cues of painted lines induce better parking behavior.
Representative street landscapes for random/Rényi (top), kiss-the-bumper (center), and hit-the-line (bottom) for a street of length $l = 20$ with the optimal line spacing of $k = 2$, for $\alpha$ values of $0$, $0.5$, and $0.5$, respectively.
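
The random/Rényi baseline is easy to simulate; below is a minimal Monte Carlo sketch (unit-length cars, random sequential placement; the parameters are illustrative, not those used in the paper).

```python
import random

def renyi_parking(gap, car=1.0):
    """Number of cars parked when cars of length `car` arrive one by one and park
    at uniformly random feasible positions in an initially empty gap of length
    `gap` (random sequential placement, the random/Renyi baseline)."""
    if gap < car:
        return 0
    x = random.uniform(0, gap - car)  # left endpoint of the newly parked car
    return 1 + renyi_parking(x, car) + renyi_parking(gap - x - car, car)

def expected_density(street_length=20.0, trials=10000, car=1.0):
    """Monte Carlo estimate of the expected fraction of the street occupied."""
    cars = sum(renyi_parking(street_length, car) for _ in range(trials))
    return cars * car / (trials * street_length)

print(expected_density())  # tends to Renyi's constant (~0.7476) as the street grows
```
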
Thesis

Cuts and Connectivity in Graphs and Hypergraphs
2018

In this thesis, we consider cut and connectivity problems on graphs, digraphs, hypergraphs and hedgegraphs. The main results are the following:
  • We introduce a faster algorithm for finding the reduced graph in element-connectivity computations. We also show its application to node separation.
  • We present several results on hypergraph cuts, including (a) a near-linear time algorithm for finding a $(2+\epsilon)$-approximate min-cut, (b) an algorithm to find a representation of all min-cuts in the same time as finding a single min-cut, (c) a sparse subgraph that preserves connectivity for hypergraphs, and (d) a near-linear time hypergraph cut sparsifier.
  • We design the first randomized polynomial time algorithm for the hypergraph $k$-cut problem, whose complexity had been open for over 20 years. The algorithm generalizes to hedgegraphs with constant span.
  • We address the complexity gap between global and fixed-terminal cut problems in digraphs by presenting a $(2-\frac{1}{448})$-approximation algorithm for the global bicut problem.

Co-advised by Karthik Chandrasekaran and Chandra Chekuri.