# Chao Xu | 许(xǔ)超(chāo)

I'm a research scientist at Yahoo! Research in New York. My research interests are algorithms, combinatorial optimization and computational geometry. I'm also interested in more applied problems with nice theoretical components. Here is my CV(last updated Oct 2018) and old research statement.

I obtained my PhD in Computer Science from University of Illinois at Urbana-Champaign in May 2018. My advisors were Karthik Chandrasekaran and Chandra Chekuri. I've previously held a summer visiting position at National Institute of Informatics, hosted by Ken-ichi Kawarabayashi, and New York University, hosted by Boris Aronov. I finished my undergrad in mathematics at Stony Brook University in 2013, and spent Spring 2012 in Budapest Semesters in Mathematics.

I maintain a github account and a blog. I frequent cstheory.

You can contact me through email chao.xu@verizonmedia.com.

Conference Publications
LP Relaxation and Tree Packing for Minimum $k$-cuts

SOSA .

Karger used spanning tree packings to derive a near linear-time randomized algorithm for the global minimum cut problem as well as a bound on the number of approximate minimum cuts. This is a different approach from his well-known random contraction algorithm. Thorup developed a fast deterministic algorithm for the minimum $k$-cut problem via greedy recursive tree packings. In this paper we revisit properties of an LP relaxation for $k$-cut proposed by Naor and Rabani, and analyzed by Chekuri, Guha and Naor. We show that the dual of the LP yields a tree packing, that when combined with an upper bound on the integrality gap for the LP, easily and transparently extends Karger's analysis for mincut to the $k$-cut problem. In addition to the simplicity of the algorithm and its analysis, this allows us to improve the running time of Thorup's algorithm by a factor of $n$. We also improve the bound on the number of $\alpha$-approximate $k$-cuts. Second, we give a simple proof that the integrality gap of the LP is $2(1-1/n)$. Third, we show that an optimum solution to the LP relaxation, for all values of $k$, is fully determined by the principal sequence of partitions of the input graph. This allows us to relate the LP relaxation to the Lagrangean relaxation approach of Barahona and Ravi and Sinha; it also shows that the idealized recursive tree packing considered by Thorup gives an optimum dual solution to the LP. This work arose from an effort to understand and simplify the results of Thorup.

Hypergraph $k$-Cut in Randomized Polynomial Time

SODA .

In the hypergraph $k$-cut problem, the input is a hypergraph, and the goal is to find a smallest subset of hyperedges whose removal ensures that the remaining hypergraph has at least $k$ connected components. This problem is known to be at least as hard as the densest $k$-subgraph problem when k is part of the input (Chekuri-Li, 2015). We present a randomized polynomial time algorithm to solve the hypergraph $k$-cut problem for constant $k$. Our algorithm solves the more general hedge $k$-cut problem when the subgraph induced by every hedge has a constant number of connected components. In the hedge $k$-cut problem, the input is a hedgegraph specified by a vertex set and a disjoint set of hedges, where each hedge is a subset of edges defined over the vertices. The goal is to find a smallest subset of hedges whose removal ensures that the number of connected components in the remaining underlying (multi-)graph is at least $k$. Our algorithm is based on random contractions akin to Karger's min cut algorithm. Our main technical contribution is a distribution over the hedges (hyperedges) so that random contraction of hedges (hyperedges) chosen from the distribution succeeds in returning an optimum solution with large probability.

Global and fixed-terminal cuts in digraphs

APPROX .

The computational complexity of multicut-like problems may vary significantly depending on whether the terminals are fixed or not. In this work we present a comprehensive study of this phenomenon in two types of cut problems in directed graphs: double cut and bicut.

• The fixed-terminal edge-weighted double cut is known to be solvable efficiently. We show a tight approximability factor of $2$ for the fixed-terminal node-weighted double cut. We show that the global node-weighted double cut cannot be approximated to a factor smaller than $\frac{3}{2}$ under the Unique Games Conjecture (UGC).
• The fixed-terminal edge-weighted bicut is known to have a tight approximability factor of $2$. We show that the global edge-weighted bicut is approximable to a factor strictly better than $2$, and that the global node-weighted bicut cannot be approximated to a factor smaller than $\frac{3}{2}$ under UGC.
• In relation to these investigations, we also prove two results on undirected graphs which are of independent interest. First, we show NP-completeness and a tight inapproximability bound of $\frac{4}{3}$ for the node-weighted $3$-cut problem. Second, we show that for constant $k$, there exists an efficient algorithm to solve the minimum $\{s,t\}$-separating $k$-cut problem.

Our techniques for the algorithms are combinatorial, based on LPs and based on enumeration of approximate min-cuts. Our hardness results are based on combinatorial reductions and integrality gap instances.

A Faster Pseudopolynomial Time Algorithm for Subset Sum

SODA .

Given a multiset $S$ of $n$ positive integers and a target integer $t$, the subset sum problem is to decide if there is a subset of $S$ that sums up to $t$. We present a new divide-and-conquer algorithm that computes all the realizable subset sums up to an integer $u$ in $\tilde{O}\left(\min\{n\sqrt{u},u^{4/3},\sigma\}\right)$, where $\sigma$ is the sum of all elements in $S$ and $\tilde{O}$ hides polylogarithmic factors. This result improves upon the standard dynamic programming algorithm that runs in $O(nu)$ time. To the best of our knowledge, the new algorithm is the fastest general algorithm for this problem. We also present a modified algorithm for cyclic groups, which computes all the realizable subset sums within the group in $\tilde{O}\left(\min\{n\sqrt{m},m^{5/4}\}\right)$ time, where m is the order of the group.

Computing minimum cuts in hypergraphs

SODA .

We study algorithmic and structural aspects of connectivity in hypergraphs. Given a hypergraph $H=(V,E)$ with $n=|V|$, $m=|E|$ and $p=\sum_{e\in E}|e|$ the best known algorithm to compute a global minimum cut in $H$ runs in time $O(np)$ for the uncapacitated case and in $O(np+n^2\log n)$ time for the capacitated case. We show the following new results.

1. Given an uncapacitated hypergraph $H$ and an integer $k$ we describe an algorithm that runs in $O(p)$ time to find a subhypergraph $H'$ with sum of degrees $O(kn)$ that preserves all edge-connectivities up to $k$ (a $k$-sparsifier). This generalizes the corresponding result of Nagamochi and Ibaraki from graphs to hypergraphs. Using this sparsification we obtain an $O(p+\lambda n^2)$ time algorithm for computing a global minimum cut of $H$ where $\lambda$ is the minimum cut value.
2. We generalize Matula's argument for graphs to hypergraphs and obtain a $(2+\e)$-approximation to the global minimum cut in a capacitated hypergraph in $O(\frac{1}{\e}(p+n \log n)\log n)$ time.
3. We show that a hypercactus representation of all the global minimum cuts of a capacitated hypergraph can be computed in $O(np+n^2\log n)$ time and $O(p)$ space. We utilize vertex ordering based ideas to obtain our results. Unlike graphs we observe that there are several different orderings for hypergraphs which yield different insights.
On Element-Connectivity Preserving Graph Simplification

ESA .

The notion of element-connectivity has found several important applications in network design and routing problems. We focus on a reduction step that preserves the element-connectivity, which when applied repeatedly allows one to reduce the original graph to a simpler one. This pre-processing step is a crucial ingredient in several applications. In this paper we revisit this reduction step and provide a new proof via the use of setpairs. Our main contribution is algorithmic results for several basic problems on element-connectivity including the problem of achieving the aforementioned graph simplification. We utilize the underlying submodularity properties of element-connectivity to derive faster algorithms.

Detecting Weakly Simple Polygons

SODA .

A closed curve in the plane is weakly simple if it is the limit (in the Fréchet metric) of a sequence of simple closed curves. We describe an algorithm to determine whether a closed walk of length n in a simple plane graph is weakly simple in $O(n \log n)$ time, improving an earlier $O(n^3)$-time algorithm of Cortese et al.. As an immediate corollary, we obtain the first efficient algorithm to determine whether an arbitrary n-vertex polygon is weakly simple; our algorithm runs in $O(n^2 \log n)$ time. We also describe algorithms that detect weak simplicity in $O(n \log n)$ time for two interesting classes of polygons. Finally, we discuss subtle errors in several previously published definitions of weak simplicity.

Journal Publications
description airplay
Minimum cuts and sparsification in hypergraphs

SICOMP .

We study algorithmic and structural aspects of connectivity in hypergraphs. Given a hypergraph $H=(V,E)$ with $n = |V|$, $m = |E|$ and $p = \sum_{e \in E} |e|$ the fastest known algorithm to compute a global minimum cut in $H$ runs in $O(np)$ time for the uncapacitated case, and in $O(np + n^2 \log n)$ time for the capacitated case. We show the following new results.

• Given an uncapacitated hypergraph $H$ and an integer $k$ we describe an algorithm that runs in $O(p)$ time to find a (trimmed) subhypergraph $H'$ with sum of degrees $O(kn)$ that preserves all edge-connectivities up to $k$ (a $k$-sparse certificate). This generalizes the corresponding result of Nagamochi and Ibaraki from graphs to hypergraphs. Using this sparsification we obtain an $O(p + \lambda n^2)$ time algorithm for computing a global minimum cut of $H$ where $\lambda$ is the minimum cut value.
• We show that a hypercactus representation of all the global minimum cuts of a capacitated hypergraph can be computed in $O(np + n^2 \log n)$ time and $O(p)$ space matching the asymptotic time to find a single minimum cut.
• We obtain a $(2+\e)$-approximation to the global minimum cut of a capacitated hypergraph in $O(\frac{1}{\e} (p \log n + n \log^2 n))$ time, and for uncapacitated hypergraphs in $O(p/\e)$ time. We achieve this by generalizing Matula's algorithm for graphs to hypergraphs.
• We describe an algorithm to compute approximate strengths of all the edges of a hypergraph in $O(p \log^2 n \log p)$ time. This gives a near linear time algorithm for finding a $(1+\e)$-cut sparsifier based on the work of Kogan and Krauthgamer. As a byproduct we obtain faster algorithms for various cut and flow problems in hypergraphs of small rank.

Our results build upon properties of vertex orderings that were inspired by the maximum adjacency ordering for graphs due to Nagamochi and Ibaraki. Unlike graphs we observe that there are several orderings for hypergraphs and these yield different insights.

description airplay
The shortest kinship description problem

IPL .

We consider a problem in descriptive kinship systems, namely finding the shortest sequence of terms that describes the kinship between a person and his/her relatives. The problem reduces to finding the minimum weight path in a labeled graph where the label of the path comes from a regular language. The running time of the algorithm is $O(n^3+s)$, where $n$ and $s$ are the input size and the output size of the algorithm, respectively.

description airplay
Beating the 2-approximation factor for global bicut

MP .

In the fixed-terminal bicut problem, the input is a directed graph with two specified nodes $s$ and $t$ and the goal is to find a smallest subset of edges whose removal ensures that $s$ cannot reach $t$ and $t$ cannot reach $s$. In the global bicut problem, the input is a directed graph and the goal is to find a smallest subset of edges whose removal ensures that there exist two nodes $s$ and $t$ such that $s$ cannot reach $t$ and $t$ cannot reach $s$. Fixed-terminal bicut and global bicut are natural extensions of $\{s,t\}$-min cut and global min-cut respectively, from undirected graphs to directed graphs. Fixed-terminal bicut is NP-hard, admits a simple $2$-approximation, and does not admit a $(2-\e)$-approximation for any constant $\e>0$ assuming the unique games conjecture. In this work, we show that global bicut admits a $(2-1/448)$-approximation, thus improving on the approximability of the global variant in comparison to the fixed-terminal variant.

description airplay
Reconstructing edge-disjoint paths faster

ORL .

For a simple undirected graph with $n$ vertices and $m$ edges, we consider a data structure that given a query of a pair of vertices $u$, $v$ and an integer $k\geq 1$, it returns $k$ edge-disjoint $uv$-paths. The data structure takes $\tilde{O}(n^{3.375})$ time to build, using $O(mn^{1.5}\log n)$ space, and each query takes $O(kn)$ time, which is optimal and beats the previous query time of $O(kn\alpha(n))$.

description airplay
Champion spiders in the game of Graph Nim

Congr. Numer. .

In the game of Graph Nim, players take turns removing one or more edges incident to a chosen vertex in a graph. The player that removes the last edge in the graph wins. A spider graph is a champion if it has a Sprague-Grundy number equal to the number of edges in the graph. We investigate the the Sprague-Grundy numbers of various spider graphs when the number of paths or length of paths increase.

Manuscripts
Some manuscripts are available upon request.
expand_more description airplay
Minimum violation vertex maps and their applications to cut problems

, Submitted.

expand_more description airplay
A near-linear time algorithm for computing the optimal landing times of a fixed sequence of planes

, Submitted.

expand_more description airplay
High multiplicity asymmetric traveling salesman problem with feedback vertex set and its application to storage/retrieval system

.

description airplay
An algorithm for the metric multiple depots capacitated vehicle routing problem with restocking and capacity two

2018, Submitted.

The capacitated vehicle routing problem (CVRP) is one of the most well known NP-hard combinatorial optimization problem. Single depot CVRP with general metric is NP-hard even for fixed capacity $Q=3$, while polynomial time solvable for fixed capacity $Q\leq 2$. We consider the variant of CVRP where restocking is available. We show that if there is a constant number of depots, then the problem can be solved in polynomial time when $Q=2$. More generally, when there are $p$ depots without a starting vehicle, there is a polynomial time algorithm for constant $p$.

description airplay
Subset Sum Made Simple

.

Subset Sum is a classical optimization problem taught to undergraduates as an example of an NP-hard problem, which is amenable to dynamic programming, yielding polynomial running time if the input numbers are relatively small. Formally, given a set $S$ of $n$ positive integers and a target integer $t$, the Subset Sum problem is to decide if there is a subset of $S$ that sums up to $t$. Dynamic programming yields an algorithm with running time $O(nt)$. Recently, the authors [Koiliaris & Xu SODA '17] improved the running time to $\tilde{O}\bigl(\sqrt{n}t\bigr)$, and it was further improved to $\tilde{O}\bigl(n+t\bigr)$ by a somewhat involved randomized algorithm by Bringmann [Bringmann SODA '17], where $\tilde{O}$ hides polylogarithmic factors. Here, we present a new and significantly simpler algorithm with running time $\tilde{O}\bigl(\sqrt{n}t\bigr)$. While not the fastest, we believe the new algorithm and analysis are simple enough to be presented in an algorithms class, as a striking example of a divide-and-conquer algorithm that uses FFT to a problem that seems (at first) unrelated. In particular, the algorithm and its analysis can be described in full detail in two pages (see pages 3-5).

description airplay
Marking Streets to Improve Parking Density

.

Street parking spots for automobiles are a scarce commodity in most urban environments. The heterogeneity of car sizes makes it inefficient to rigidly define fixed-sized spots. Instead, unmarked streets in cities like New York leave placement decisions to individual drivers, who have no direct incentive to maximize street utilization. In this paper, we explore the effectiveness of two different behavioral interventions designed to encourage better parking, namely (1) educational campaigns to encourage parkers to "kiss the bumper" and reduce the distance between themselves and their neighbors, or (2) painting appropriately-spaced markings on the street and urging drivers to "hit the line". Through analysis and simulation, we establish that the greatest densities are achieved when lines are painted to create spots roughly twice the length of average-sized cars. Kiss-the-bumper campaigns are in principle more effective than hit-the-line for equal degrees of compliance, although we believe that the visual cues of painted lines induce better parking behavior.

Thesis
description airplay
Cuts and Connectivity in Graphs and Hypergraphs

.

In this thesis, we consider cut and connectivity problems on graphs, digraphs, hypergraphs and hedgegraphs. The main results are the following:

• We introduce a faster algorithm for finding the reduced graph in element-connectivity computations. We also show its application to node separation.
• We present several results on hypergraph cuts, including (a) a near linear time algorithm for finding a (2 + ε)-approximate min-cut, (b) an algorithm to find a representation of all min-cuts in the same time as finding a single min-cut, (c) a sparse subgraph that preserves connectivity for hypergraphs and (d) a near linear-time hypergraph cut sparsifier.
• We design the first randomized polynomial time algorithm for the hypergraph $k$-cut problem whose complexity has been open for over 20 years. The algorithm generalizes to hedgegraphs with constant span.
• We address the complexity gap between global vs. fixed-terminal cuts problems in digraphs by presenting a $2-\frac{1}{448}$ approximation algorithm for the global bicut problem.