The Art Gallery Guardian

Selection in a sorted matrix


A matrix is sorted if every row and column are non-increasing.

A common interview problem asks for the kkth smallest number in a sorted matrix, and usually people give an O(klogk)O(k\log k) algorithm. Wlog we can assume all the numbers in the matrix are distinct, as we can always break the ties by factoring in the position of the number in the matrix.

There is a O(k)O(k) time solution. In fact, O(min(k,m))O(\min(k, m)) if it’s a n×mn\times m matrix and nmn\leq m.

I’m frustrated that there isn’t a good description of such an algorithm. The most common reference people provide is [1]. However there are a few downsides of that paper. It only works when n=mn=m and it is still a bit complicated. So here I will give a short description to a modified version of the algorithm.

One can see the idea closely resemble what happens in fractional cascading, we basically squeeze the unknown values between known values, so we don’t have to look at most of the unknown values.

First, we assume the matrix we care about is a n×mn\times m matrix AA. Both nn and mm are power of 22 and nmn\leq m.

Let AoA_o be the matrix we get by removing all even index columns from AA, and add the last column.

In [1], they used Ao,oA_{o,o}, which is defined as only retain positions with odd coordinates, but I found it nicer to consider things by only stripping one coordinate. In particular, it gave a much nicer proof for the following result.

Definition1

r(a,A)r(a,A) is the number of elements in matrix AA smaller or equal to aa.

Theorem2

Let AA be a sorted n×mn\times m matrix, then 2(r(a,Ao)n)r(a,A)2r(a,Ao)2(r(a,A_o) - n) \leq r(a,A)\leq 2r(a,A_o).

Proof

For any fixed ii, let f(i)f(i) be the largest jj, such that Ai,jaA_{i,j}\geq a. r(a,A)=i=1nf(i)r(a,A)=\sum_{i=1}^n f(i), r(a,Ao)=i=1nf(i)/2r(a,A)/2+nr(a,A_o)=\sum_{i=1}^n \lceil f(i)/2 \rceil \leq r(a,A)/2 +n. On the other hand i=1nf(i)/2r(a,Ao)\sum_{i=1}^n f(i)/2 \leq r(a,A_o).

This means if we want to find an element of rank kk in AA, we can first find element of rank k/2+nk/2+n and k/2k/2 in AoA_o, and we know the solution would be in between. The remaining operation takes O(m)O(m) time:

  1. Use a flood fill starting from the position of k/2k/2th element in AoA_o and find at most 2n2n positions where the element of rank kk could reside. Namely it select elements in AA that is in between the k/2k/2th element and k/2+nk/2+nth in AoA_o. We can do this because all these elements are connected in a component.
  2. While doing the flood fill, it can also find the rank of the k/2k/2th element in AoA_o in the matrix AA for no extra cost.
  3. A linear time selection algorithm on all the elements resulted from the flood fill.

This would give us a recursive algorithm if instead of just finding kkth number, it finds the k1k_1th and k2k_2th number at the same time. As long as k1k2=O(m)|k_1-k_2|=O(m), we can find them both only with a O(m)O(m) extra time. k1k_1 and k2k_2 will be the upper and lower bounds respectively. Some basic algebra shows k1k2=O(m)|k_1-k_2|=O(m) inside each recursive step if we start with k1=k2=kk_1=k_2=k.

Let T(n,m)T(n,m) be the time used when the matrix is of size n×mn\times m, and nmn\leq m. Certainly T(1,m)=1T(1,m)=1, and the rest follow the recursive relation T(n,m)=cm+T(n,m/2)T(n,m) = cm + T(n,m/2) for some constant cc.

If n>mn>m, we can rotate the matrix implicitly to get the same running time.

Solving it gives us the desired running time O(m)O(m). This is also a O(k)O(k) time algorithm because we only need to consider the k×kk\times k submatrix in the up-left position.

A Haskell implementation here, it requires a linear time rank selection algorithm selectRank.

Note a slightly faster algorithm also exists when nn and mm are very different. It was shown selecting the kkth element can be done in O(nlogm/n)O(n\log m/n) time [2].

References

[1] A. Mirzaian, E. Arjomandi, Selection in x + y and matrices with sorted rows and columns, Information Processing Letters. 20 (1985) 13–17.

[2] G. Frederickson, D. Johnson, Generalized selection and ranking: Sorted matrices, SIAM Journal on Computing. 13 (1984) 14–30 10.1137/0213002.

Posted by Chao Xu on .
Tags: Algorithm.