I'm a research scientist at Yahoo! Research in New York. My research interests are algorithms, combinatorial optimization and computational geometry. Here are my CV and an old research statement.
In May 2018, I finished my PhD in Computer Science at the University of Illinois at Urbana-Champaign. My advisors were Karthik Chandrasekaran and Chandra Chekuri. I did my undergrad in mathematics at Stony Brook University.
I maintain a GitHub account and a blog. I frequent cstheory.
You can contact me by email at chao.xu@oath.com.
Recent Manuscripts  
* Some manuscripts are available upon request.
Subset Sum Made Simple
Subset Sum is a classical optimization problem taught to undergraduates as an
example of an NP-hard problem, which is amenable to dynamic programming,
yielding polynomial running time if the input numbers are relatively small.
Formally, given a set $S$ of $n$ positive integers and a target integer $t$,
the Subset Sum problem is to decide if there is a subset of $S$ that sums up to
$t$. Dynamic programming yields an algorithm with running time $O(nt)$.
Recently, the authors [Koiliaris & Xu SODA '17] improved the running time to
$\tilde{O}\bigl(\sqrt{n}t\bigr)$, and it was further improved to
$\tilde{O}\bigl(n+t\bigr)$ by a somewhat involved randomized algorithm by
Bringmann [Bringmann SODA '17], where $\tilde{O}$ hides polylogarithmic factors.
Here, we present a new and significantly simpler algorithm with running time $\tilde{O}\bigl(\sqrt{n}t\bigr)$. While not the fastest, we believe the new algorithm and analysis are simple enough to be presented in an algorithms class, as a striking example of a divide-and-conquer algorithm that applies FFT to a problem that seems (at first) unrelated. In particular, the algorithm and its analysis can be described in full detail in two pages (see pages 3–5). 
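For reference, the classical $O(nt)$ dynamic program mentioned in the abstract can be sketched as follows. This is only the textbook baseline, not the algorithm of the manuscript; the function names are hypothetical, and a Python integer is used as a bitset so that bit $i$ records whether sum $i$ is attainable.

```python
def subset_sums_bitset(nums, t):
    """Return an int whose bit i is set iff some subset of nums sums to i (0 <= i <= t)."""
    reachable = 1                    # bit 0: the empty subset sums to 0
    mask = (1 << (t + 1)) - 1        # discard sums larger than t
    for x in nums:
        # Each element is either skipped (old bits) or taken (bits shifted by x).
        reachable |= (reachable << x) & mask
    return reachable


def subset_sum(nums, t):
    """Decide whether some subset of the positive integers in nums sums to t."""
    return bool((subset_sums_bitset(nums, t) >> t) & 1)
```

For instance, `subset_sum([3, 34, 4, 12, 5, 2], 9)` is true because $4 + 5 = 9$.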

Minimum violation vertex maps and their applications to cut problems


A Polynomial Time Algorithm to Minimize Total Travel Time in $k$-Depot Storage/Retrieval System


A near-linear time algorithm for computing the optimal landing times of a fixed sequence of planes


Conference Publications  
LP Relaxation and Tree Packing for Minimum $k$-cuts
Karger used spanning tree packings to derive a
near-linear-time randomized algorithm for the global minimum cut problem
as well as a bound on the number of approximate minimum cuts. This
is a different approach from his well-known random contraction
algorithm. Thorup developed a fast
deterministic algorithm for the minimum $k$-cut
problem via greedy recursive tree packings.
In this paper we revisit properties of an LP relaxation for $k$-cut proposed by Naor and Rabani, and analyzed by Chekuri, Guha and Naor. First, we show that the dual of the LP yields a tree packing that, when combined with an upper bound on the integrality gap for the LP, easily and transparently extends Karger's analysis for min-cut to the $k$-cut problem. In addition to the simplicity of the algorithm and its analysis, this allows us to improve the running time of Thorup's algorithm by a factor of $n$. We also improve the bound on the number of $\alpha$-approximate $k$-cuts. Second, we give a simple proof that the integrality gap of the LP is $2(1-1/n)$. Third, we show that an optimum solution to the LP relaxation, for all values of $k$, is fully determined by the principal sequence of partitions of the input graph. This allows us to relate the LP relaxation to the Lagrangean relaxation approach of Barahona and Ravi and Sinha; it also shows that the idealized recursive tree packing considered by Thorup gives an optimum dual solution to the LP. This work arose from an effort to understand and simplify the results of Thorup. 

Hypergraph $k$-Cut in Randomized Polynomial Time
In the hypergraph $k$-cut problem, the input is a hypergraph, and the goal is to find a smallest subset of hyperedges whose removal ensures that the remaining hypergraph has at least $k$ connected components. This problem is known to be at least as hard as the densest $k$-subgraph problem when $k$ is part of the input (Chekuri-Li, 2015). We present a randomized polynomial time algorithm to solve the hypergraph $k$-cut problem for constant $k$. Our algorithm solves the more general hedge $k$-cut problem when the subgraph induced by every hedge has a constant number of connected components. In the hedge $k$-cut problem, the input is a hedgegraph specified by a vertex set and a disjoint set of hedges, where each hedge is a subset of edges defined over the vertices. The goal is to find a smallest subset of hedges whose removal ensures that the number of connected components in the remaining underlying (multi)graph is at least $k$. Our algorithm is based on random contractions akin to Karger's min-cut algorithm. Our main technical contribution is a distribution over the hedges (hyperedges) so that random contraction of hedges (hyperedges) chosen from the distribution succeeds in returning an optimum solution with large probability. 
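As background for the random contraction idea, here is a minimal sketch of Karger's classical contraction algorithm for the global minimum cut of an ordinary unweighted multigraph. It is not the hedge or hyperedge distribution from the paper; the function names, the number of trials and the seed are illustrative assumptions. Contracting edges in a uniformly random order (skipping edges that have become self-loops) is equivalent to repeatedly contracting a uniformly random remaining edge.

```python
import random


def contract_once(n, edges, rng):
    """One contraction run on a multigraph with vertices 0..n-1; returns the size of the cut found."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    order = list(edges)
    rng.shuffle(order)               # random edge order = random contraction sequence
    for u, v in order:
        if components == 2:
            break
        ru, rv = find(u), find(v)
        if ru != rv:                 # skip self-loops of the contracted graph
            parent[ru] = rv
            components -= 1
    # Edges whose endpoints ended up in different super-vertices form the cut.
    return sum(1 for u, v in edges if find(u) != find(v))


def karger_min_cut(n, edges, trials=200, seed=0):
    """Best cut over several independent runs; correct with high probability, not with certainty."""
    rng = random.Random(seed)
    return min(contract_once(n, edges, rng) for _ in range(trials))
```

A single run finds a fixed minimum cut with probability at least $2/(n(n-1))$, which is why the sketch repeats the contraction many times and keeps the best cut.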

Global and fixed-terminal cuts in digraphs
The computational complexity of multicut-like problems may vary significantly depending on whether the terminals are fixed or not. In this work we present a comprehensive study of this phenomenon in two types of cut problems in directed graphs: double cut and bicut.


A Faster Pseudopolynomial Time Algorithm for Subset Sum
Given a multiset $S$ of $n$ positive integers and a target integer $t$, the subset sum problem is to decide if there is a subset of $S$ that sums up to $t$. We present a new divide-and-conquer algorithm that computes all the realizable subset sums up to an integer $u$ in $\tilde{O}\left(\min\{n\sqrt{u},u^{4/3},\sigma\}\right)$, where $\sigma$ is the sum of all elements in $S$ and $\tilde{O}$ hides polylogarithmic factors. This result improves upon the standard dynamic programming algorithm that runs in $O(nu)$ time. To the best of our knowledge, the new algorithm is the fastest general algorithm for this problem. We also present a modified algorithm for cyclic groups, which computes all the realizable subset sums within the group in $\tilde{O}\left(\min\{n\sqrt{m},m^{5/4}\}\right)$ time, where $m$ is the order of the group.
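The divide-and-conquer skeleton behind "all realizable subset sums up to $u$" can be illustrated as below. This is a simplified sketch, not the paper's algorithm: it splits the input in half by position and merges the two sets of sums with a naive quadratic sumset, whereas an efficient version would compute the capped sumset with an FFT-based convolution and choose the split more carefully. All names are hypothetical.

```python
def capped_sumset(a, b, u):
    """All values x + y with x in a, y in b, and x + y <= u (naive; FFT would replace this step)."""
    return {x + y for x in a for y in b if x + y <= u}


def all_subset_sums(nums, u):
    """Set of all subset sums of the positive integers in nums that are at most u."""
    if not nums:
        return {0}
    if len(nums) == 1:
        return {0, nums[0]} if nums[0] <= u else {0}
    mid = len(nums) // 2
    left = all_subset_sums(nums[:mid], u)    # sums using only the first half
    right = all_subset_sums(nums[mid:], u)   # sums using only the second half
    # Every subset splits into a part from each half, so the answer is the capped sumset.
    return capped_sumset(left, right, u)
```

For example, `sorted(all_subset_sums([1, 2, 5], 6))` gives `[0, 1, 2, 3, 5, 6]`.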


Computing minimum cuts in hypergraphs
We study algorithmic and structural aspects of connectivity in hypergraphs. Given a hypergraph $H=(V,E)$ with $n=|V|$, $m=|E|$ and $p=\sum_{e\in E}|e|$, the best known algorithm to compute a global minimum cut in $H$ runs in time $O(np)$ for the uncapacitated case and in $O(np+n^2\log n)$ time for the capacitated case. We show the following new results.


On Element-Connectivity Preserving Graph Simplification
The notion of element-connectivity has found several important applications in network design and routing problems. We focus on a reduction step that preserves the element-connectivity, which when applied repeatedly allows one to reduce the original graph to a simpler one. This preprocessing step is a crucial ingredient in several applications. In this paper we revisit this reduction step and provide a new proof via the use of setpairs. Our main contributions are algorithmic results for several basic problems on element-connectivity, including the problem of achieving the aforementioned graph simplification. We utilize the underlying submodularity properties of element-connectivity to derive faster algorithms.


Detecting Weakly Simple Polygons
A closed curve in the plane is weakly simple if it is the limit (in the Fréchet metric) of a sequence of simple closed curves. We describe an algorithm to determine whether a closed walk of length $n$ in a simple plane graph is weakly simple in $O(n \log n)$ time, improving an earlier $O(n^3)$-time algorithm of Cortese et al. As an immediate corollary, we obtain the first efficient algorithm to determine whether an arbitrary $n$-vertex polygon is weakly simple; our algorithm runs in $O(n^2 \log n)$ time. We also describe algorithms that detect weak simplicity in $O(n \log n)$ time for two interesting classes of polygons. Finally, we discuss subtle errors in several previously published definitions of weak simplicity.
Dedicated with thanks to our colleague Ferran Hurtado (1951–2014). 

Journal Publications  
Minimum cuts and sparsification in hypergraphs
We study algorithmic and structural aspects of connectivity in
hypergraphs. Given a hypergraph $H=(V,E)$ with $n = |V|$, $m = |E|$
and $p = \sum_{e \in E} |e|$, the fastest known algorithm to compute a
global minimum cut in $H$ runs in $O(np)$ time for the uncapacitated
case, and in $O(np + n^2 \log n)$ time for the capacitated case. We show
the following new results.


The shortest kinship description problem
We consider a problem in descriptive kinship systems, namely finding the shortest sequence of terms that describes the kinship between a person and his/her relatives. The problem reduces to finding the minimum weight path in a labeled graph where the label of the path comes from a regular language. The running time of the algorithm is $O(n^3+s)$, where $n$ and $s$ are the input size and the output size of the algorithm, respectively.
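The reduction to a minimum weight path whose label lies in a regular language can be illustrated with the standard product construction: run Dijkstra over pairs (graph vertex, automaton state). This is a minimal sketch under stated assumptions, not the paper's $O(n^3+s)$ algorithm: the language is assumed to be given as a deterministic finite automaton, edge weights are assumed nonnegative, and every name below is hypothetical.

```python
import heapq


def min_weight_accepted_path(edges, dfa_start, dfa_accept, dfa_delta, source, target):
    """Minimum weight of a source-to-target walk whose label string the DFA accepts, or None.

    edges: iterable of (u, v, label, weight) with nonnegative weights.
    dfa_delta: dict mapping (state, label) to the next state; missing keys mean rejection.
    """
    adj = {}
    for u, v, label, w in edges:
        adj.setdefault(u, []).append((v, label, w))

    dist = {(source, dfa_start): 0}
    heap = [(0, source, dfa_start)]
    while heap:
        d, u, q = heapq.heappop(heap)
        if d > dist.get((u, q), float("inf")):
            continue                              # stale heap entry
        if u == target and q in dfa_accept:
            return d                              # first accepted arrival is optimal
        for v, label, w in adj.get(u, []):
            q2 = dfa_delta.get((q, label))
            if q2 is None:
                continue                          # the automaton rejects this label here
            nd = d + w
            if nd < dist.get((v, q2), float("inf")):
                dist[(v, q2)] = nd
                heapq.heappush(heap, (nd, v, q2))
    return None
```

The product graph has one node per (vertex, state) pair, so the search cost grows with the size of the automaton as well as the size of the graph.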
To the memory of Jiaqi Zhao (1994–2016). 

Beating the 2-approximation factor for global bicut
In the fixed-terminal bicut problem, the input is a directed graph with two specified nodes $s$ and $t$ and the goal is to find a smallest subset of edges whose removal ensures that $s$ cannot reach $t$ and $t$ cannot reach $s$. In the global bicut problem, the input is a directed graph and the goal is to find a smallest subset of edges whose removal ensures that there exist two nodes $s$ and $t$ such that $s$ cannot reach $t$ and $t$ cannot reach $s$. Fixed-terminal bicut and global bicut are natural extensions of $\{s,t\}$-min cut and global min-cut respectively, from undirected graphs to directed graphs. Fixed-terminal bicut is NP-hard, admits a simple $2$-approximation, and does not admit a $(2-\epsilon)$-approximation for any constant $\epsilon>0$ assuming the unique games conjecture. In this work, we show that global bicut admits a $(2-1/448)$-approximation, thus improving on the approximability of the global variant in comparison to the fixed-terminal variant.


Reconstructing edge-disjoint paths faster
For a simple undirected graph with $n$ vertices and $m$ edges, we consider a data structure that, given a query consisting of a pair of vertices $u$, $v$ and an integer $k\geq 1$, returns $k$ edge-disjoint $uv$-paths. The data structure takes $\tilde{O}(n^{3.375})$ time to build, using $O(mn^{1.5}\log n)$ space, and each query takes $O(kn)$ time, which is optimal and beats the previous query time of $O(kn\alpha(n))$.


Champion spiders in the game of Graph Nim
In the game of Graph Nim, players take turns removing one or more edges incident
to a chosen vertex in a graph. The player that removes the last edge in the graph
wins. A spider graph is a champion if it has a Sprague-Grundy number equal to the
number of edges in the graph. We investigate the Sprague-Grundy numbers of
various spider graphs when the number of paths or length of paths increases.
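For very small graphs, the Sprague-Grundy number defined by these rules can be computed by brute force: the value of a position is the minimum excludant (mex) of the values of all positions reachable in one move. The sketch below assumes simple graphs given as lists of vertex pairs and takes exponential time; it illustrates the definition only and is not a method from the paper. The function names are hypothetical.

```python
from functools import lru_cache
from itertools import combinations


def grundy(edges):
    """Sprague-Grundy number of Graph Nim on a simple graph given as a list of vertex pairs."""

    @lru_cache(maxsize=None)
    def g(state):
        if not state:
            return 0                              # no edges left: the player to move loses
        reachable = set()
        vertices = {v for e in state for v in e}
        for v in vertices:
            incident = [e for e in state if v in e]
            # A move removes any nonempty set of edges incident to the chosen vertex.
            for r in range(1, len(incident) + 1):
                for removed in combinations(incident, r):
                    reachable.add(g(state - frozenset(removed)))
        m = 0
        while m in reachable:                     # mex of the reachable values
            m += 1
        return m

    return g(frozenset(frozenset(e) for e in edges))


def is_champion(edges):
    """A graph is a champion if its Sprague-Grundy number equals its number of edges."""
    return grundy(edges) == len(edges)
```

For example, a star with three edges (a spider whose legs all have length 1) behaves like a Nim heap of size 3, so `grundy([(0, 1), (0, 2), (0, 3)])` is 3 and the star is a champion.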


Manuscripts  
Marking Streets to Improve Parking Density
Street parking spots for automobiles are a scarce commodity in most urban environments. The heterogeneity of car sizes makes it inefficient to rigidly define fixed-sized spots. Instead, unmarked streets in cities like New York leave placement decisions to individual drivers, who have no direct incentive to maximize street utilization.
In this paper, we explore the effectiveness of two different behavioral interventions designed to encourage better parking, namely (1) educational campaigns to encourage parkers to "kiss the bumper" and reduce the distance between themselves and their neighbors, or (2) painting appropriately spaced markings on the street and urging drivers to "hit the line". Through analysis and simulation, we establish that the greatest densities are achieved when lines are painted to create spots roughly twice the length of average-sized cars. Kiss-the-bumper campaigns are in principle more effective than hit-the-line for equal degrees of compliance, although we believe that the visual cues of painted lines induce better parking behavior.


Thesis  
Cuts and Connectivity in Graphs and Hypergraphs
In this thesis, we consider cut and connectivity problems on graphs, digraphs, hypergraphs and hedgegraphs.
The main results are the following:
Co-advised by Karthik Chandrasekaran and Chandra Chekuri. 