# Induced subgraph with constant diameter

Let T be a rooted tree on vertices V. T_v to be the subtree of T rooted at v. Let \mathbb{G}_T^k contains all graphs G on V with the property that for every v, the diameter of G[V(T_v)] is at most k.

Let G_T^k to be the graph in \mathbb{G}_T^k with minimum number of edges(break ties arbitrarily), what can we say about the number of edges in G_T^2?

If T has n vertices, then G_T^2 have O(n \log n) edges.

Let |G| to be the number of vertices in |G|.

We construct a graph G\in \mathbb{G}_T^2 recursively. For each T_v, we construct a G_v on the same vertex set such that it has the property there exist a vertex that connects to all other vertex in G_v. Call such vertex a golden vertex of G_v. G_v would be a union of all G_u, where u is a child of T_v with some extra edges. Let u to be the child of v such that G_u has the maximum number of vertices (break ties arbitrarily), and u' is the golden vertex of G_u, then we add edges from u' to all other vertices in G_v. Let G=G_r, where r is the root of T.

The distance between any two vertex is at most 2 in G[V(T_v))]=G_v for all v, because any two vertex is connected by a golden vertex in the induced subgraph.

The number of edges is at most O(n\log n). Label v with |T_v|. The number of edges added at the step to build G_v is |T_v|-|T_u|, where G_u contains the golden vertex, and it also have the most number of vertices among all G_c where c is a child of v in T. Sum of all labels over the tree T except the heaviest child is O(n\log n) using analysis akin to heavy-light decomposition.

This bound is in fact best possible by considering a complete binary tree and the following lemma:

If G=(V,E) is a graph with diameter 2, then for any cut \delta(U), we have |\delta(U)|\geq min(|U|,|V\setminus U|).

wlog, let |U|\leq |V\setminus U|. If all vertex in U has an neighbor in V\setminus U, then |\delta(U)|\geq |U| and we are done. Otherwise, there is an vertex u\in U has no neighbor in V\setminus U, then for any v\not\in U, there is an u'\in U, and there is a path uu'v. This implies there are at least |V| edges in \delta(U).

Is this the best possible for general G_T^k too? Not at all. To show this, we introduce an algorithmic problem.

Preprocess a rooted tree T where the elements in the tree T has weights from a semigroup (S,+), such that we can answer the following query in k semigroup operations.

**Query Input**: x and y where x is an ancestor of y.

**Query Output**: The sum of weights of the unique simple path between x and y.

Alon and Schieber showed this problem can be solved by precompute O(n\lambda(k,n)) path sums [1]. Where \lambda(k,n) is related to the inverse Ackermann function, and \lambda(4,n)=O(n\log^*n).

If T has n vertices, then G_T^k have O(n \lambda(k,n)) edges.

A simple reduction from our tree problem by input our rooted tree T into Alon and Schieber's algorithm. For each path sum they preprocess, we create an edge in our G. The diameter of all G_v is at most 2k, since for any 2 vertices x and y, we find it's lowest common ancestor w in T, then there is a path of length k from x to w and another of length at most k from w to y.

Their paper argued an lower bound of preprocess \Omega(n\lambda_k(n)) path sums for a single rooted path. In our model, we can add O(n) edges is sufficient. An edge from the leaf to every other vertex, making the diameter only 2. This is because we have a lot more freedom in our diameter problem.

In their problem, their path sum is actually divide the path into k subpaths, our path of length 2 would imply there is an overlap in the sum.

One might to think the reason we can do it in O(n) edges comes directly from in the undirected model we can traverse back from the leaf and allow overlaps. Maybe the following problem might give us an lower bound.

Preprocess a rooted tree T where the elements in the tree T has weights from a semigroup (S,+), such that we can answer the following query in k semigroup operations.

**Query Input**: x and y where x is an ancestor of y and y is a leaf.

**Query Output**: The sum of weights of any walk lies strictly in between xy and that start at x and ends at y.

Unfortunately, this does not give us an lower bound. Following the proof in Alon's and Schieber's paper, we can show any algorithm would need to precompute at least O(n\lambda_k(n)) paths in order to output the solution using k semigroup operations on this particular tree: T is a rooted tree consist of one single path of length n, and for each vertex, we add one single dangling leaf.

Again, for this graph, we can add at most n edges and force diameter to be at most 2. Take the lowest leaf and add edge to all other vertices. Here would be an formulation that could result an lower bound.

Preprocess a rooted tree T where the elements in the tree T has weights from a semigroup (S,+), such that we can answer the following query in k semigroup operations.

**Query Input**: x and y where x is an ancestor of y.

**Query Output**: The sum of weights of any walk from x to y that lies strictly inside the subtree rooted at x.

Finally, here is a conjecture. Instead of using tree, we use Laminar family to simplify the wording of the conjecture.

For every constant k, there exist a Laminar family X over ground set V of n vertices, such that for any graph G=(V,E) such that G[A] has diameter at most k for all A\in X, then |E|=\Omega(n\lambda_k(n)).

If diameter is 3, then we can show there is a \Omega(n\log \log n) lower bound by consider the Laminar family such that a set of size n gets partitioned into \sqrt{n} children each with \sqrt{n} vertices, and it is applied recursively.

The idea is at any level, say has n elements and \sqrt{n} subsets. A vertex is called interior if it does not connect to any vertex outside the subset, and exterior otherwise. A subset is shy if it contains a interior vertex. We show there is at least 1/5 n edges in between the subsets. If there are at least 1/2n exterior vertices, then we are done. Otherwise, there are at lease 1/2n interior vertices. We call the subsets with at least one interior vertex shy. At least \sqrt{n}/2 of those subsets are shy. Consider any two distinct shy subset and pick a interior vertex from each. In order to have a distance 3 path between two interior vertices, the path has to contain two exterior vertex, one in each shy subset. But this shows if we contract all the shy subsets to one vertex, we get a clique. This means there has to be at least (\sqrt{n}/2-1)^2>1/5n edges in between the subsets (for large enough n).

# References

[1] N. Alon, B. Schieber, **Optimal preprocessing for answering on-line product queries**, Tel Aviv University, 1987.