The Art Gallery Guardian

A Reviewer Assignment Problem


Consider there are some reviewers and some papers. Each reviewer can review exactly one paper, and each reviewer is qualified to review some subset of papers. We are interested in maximizing the number of papers reviewed by at least kk reviewer, then under that constraint, maximize the paper reviewed by k+1k+1 reviewer, etc. This make sure we are being fair in evaluating papers. It would try to avoid the case where most paper getting small number of reviews and a few papers getting unreasonable number of reviews.

Formally, we are given a bipartite graph G=(A,B,E)G=(A,B,E) of nn vertices and mm edges. A subset of edges MM is called a semi-matching, if degM(v)=1\deg_M(v)=1 for all vAv\in A. For a subset of edges MM, let gM(i)g_M(i) to be the number of vertices in BB with degree at least ii. We want to find a semi-matching MM, such that (gM(k),gM(k+1),,gM(n))(g_M(k),g_M(k+1),\ldots,g_M(n)) is lexicographically maximum.

When k=1k=1, if MM minimizes the sum of the function vBf(degM(v))\sum_{v\in B} f(\deg_M(v)) for any strictly convex increasing function ff, then (gM(1),gM(2),,gM(n))(g_M(1),g_M(2),\ldots,g_M(n)) is lexicographically maximum. The problem therefore can be reduced to min-cost flow can be applied here directly, and obtain a polynomial time algorithm [1].

When k=3k=3, the problem is NP-hard, since it would imply we have to maximized gM(3)g_M(3), and this is already NP-hard because exact cover by 33-sets. That is, given a collection of sets of size 33 each. Decide if there exists a subcollection that forms a partition of the universe.

The only unresolved case is k=2k=2. Interestingly, we can show it is also polynomial time solvable. First, one can see that maximizing (gM(2),gM(3),,gM(n))(g_M(2),g_M(3),\ldots,g_M(n)) is equivalent to minimize v,degM(v)2f(degM(v))\sum_{v, \deg_M(v)\geq 2} f(\deg_M(v)) for some strictly convex increasing function vv, and MM range through all semi-matchings so each vertex in BB has degree exactly 22 (Assuming it exists).

Apollonio and Sebő shown the following problem is polynomial time solvable [2].

Problem1

Given a graph G=(V,E)G=(V,E), a integer kk, convex functions fv:NRf_v:\N \to \R for each vVv\in V, and an edge cost function c:ERc:E\to \R. One can find the following in polynomial time. min{vVfv(degM(v))+eMc(e)|ME,M=k}\displaystyle \min \left\{ \sum_{v\in V} f_v(\deg_M(v)) + \sum_{e\in M} c(e) \middle| M\subseteq E, |M|=k \right\}

It's not hard to generalize it a bit further by requiring MM to respect some upper and lower bound on the vertices. Indeed, we can let fv:NR{}f_v:\N\to \R\cup \set{\infty}, and set fv(x)=f_v(x)=\infty if xx is not between the upper and lower bounds.

Now, we are going to reduce the problem. The reduction is similar to the one for 22-or-00 matching.

For each vertex vBv\in B, split into two vertices v1v_1 and v2v_2. Define Bi={v1vB}B_i=\set{v_1|v\in B}. The new input graph consists of vertices B1B2AB_1\cup B_2 \cup A. v1v_1 and v2v_2 connects to the same vertices as vv. We add an edge between v1v_1 and v2v_2, with very large cost CC. Say C=mn2+1C=mn^2+1. v1v_1 has both an upper and lower bound of 11. v2v_2 has a lower bound of 11. For each vertex in AA, add an upper and lower bound of 11. We have a strict convex function fv2(x)=x2f_{v_2}(x)=x^2 on each vertex v2v_2.

Let r=Ar=|A|, p=Bp=|B|. We solve Problem 1 repeatedly for each kk from rr to r+pr+p.

Say there exists an optimal solution to the original problem with exactly tt vertices in BB with degree smaller than 22. Find the optimal solution to the new problem with k=r+tk=r+t. Let it be MM'. We obtain MM from MM' by identify pairs of vertices v1v_1 and v2v_2. MM would be the solution to the original problem.

1 Extensions

I first heard of the problem from [3], where they focused on k=1k=1 case, but the reviewer can review more than one paper. This case can be handled easily. We can add degree upper and lower bounds to all vertices, and only look at subgraphs in MM that satisfies the upper and lower bounds. That is, we can also make sure no reviewers review too many papers too. Under that constraint, find (gM(k),,gM(n))(g_M(k),\ldots,g_M(n)) lexicographically. This is possible but more tricky, as we have to do some reduction from capacitated bb-matching to bb-matching.

There is a little more generalization. Assume for each paper, we have a lower bound of reviews dvd_v. That is, it has to be reviewed by at least dvd_v person to be useful. So translating to the graph case, we can impose the constraint that degM(v)=0\deg_M(v)=0 or degM(v)dv\deg_M(v)\geq d_v. One can see maximizing (gM(2),gM(3),,gM(n))(g_M(2),g_M(3),\ldots,g_M(n)) is equivalent to maximizing (gM(1),gM(2),,gM(n))(g_M(1),g_M(2),\ldots,g_M(n)) where dv=2d_v=2 for all vertices. Again, one can modify the reduction to handle the case when dvd_v is either 11 or 22.

References

[1] N.J. Harvey, R.E. Ladner, L. Lovász, T. Tamir, Semi-matchings for bipartite graphs and load balancing, Journal of Algorithms. 59 (2006) 53–78 10.1016/j.jalgor.2005.01.003.

[2] N. Apollonio, A. Sebő, Minconvex factors of prescribed size in graphs, SIAM Journal on Discrete Mathematics. 23 (2009) 1297–1310 10.1137/060651136.

[3] A. Yeşilçimen, E.A. Yildirim, An alternative polynomial-sized formulation and an optimization based heuristic for the reviewer assignment problem, European Journal of Operational Research. (2019) 10.1016/j.ejor.2019.01.035.

Posted by Chao Xu on .
Tags: algorithm, matching.