# «Gefördert durch die Deutsche Forschungsgemeinschaft im Rahmen des Graduiertenkollegs ’Methods for Discrete Structures’ (GRK 1408) Präsident der ...»

Proof. Let p = χ(H) be the chromatic number of H. For an instance G1,..., Gt of OR(H-Factor), we can assume w.l.o.g. that the Gi are p-partite graphs with n vertices

in each part. We construct a graph G that has an H-factor if and only if some Gi has an H-factor. For this, we invoke the Packing Lemma, Lemma 3.3, with d = 2, and we obtain a p-partite graph P that contains t cliques K1,..., Kt on p vertices each. We identify the vertex set of Gi with V (Ki ) × [n] injectively in such a way that vertices in the same colour class have the same ﬁrst coordinate. We deﬁne an intermediate p-partite graph G on the vertex set V (P ) × [n] as G = G1 ∪ · · · ∪ Gt. To obtain G from G, we √ 1+o(1) add p coordination gadgets of Lemma 3.6 with m = t and d = p. For each colour class C ⊂ V (G ), we add a coordination gadget where the Ui ⊂ C are those vertices that project to the same vertex in P. Finally, we replace each p-hyperedge by the gadget in Lemma 3.10, which ﬁnishes the construction of G.

For the completeness of the reduction, assume Gi has an H-factor M. To construct an H-factor of G, we start by using M to cover the vertices V (Gi ) in G. The completeness of the coordination gadgets guarantees that we ﬁnd a perfect matching in the d-uniform hypergraph G − V (Gi ) that uses only hyperedges of the coordination gadgets. By Lemma 3.10, this gives rise to an H-factor of G.

For the soundness, assume we have an H-factor M of G. Lemma 3.10 guarantees that the edge gadgets can be seen as p-hyperedges in the intermediate graph G. Soundness of the coordination gadgets guarantees that M leaves exactly one group free per part.

Now let H be a copy of H that is contained in G but not in any of the gadgets. Since H has chromatic number p, H intersects all p parts and has an edge between any two distinct parts. By construction of G, this implies that the projection of H onto P is a clique. By the packing lemma, this clique is one of the Ki ’s. Therefore, each H of the H-factor M that is not in one of the gadgets is contained in Gi, which implies that Gi has an H-factor. √ 1+o(1) The claim follows since G is a graph on t poly(s) vertices that has an H-factor if and only if some Gi has an H-factor.

Together with Lemma 1.1, the above implies Theorem 1.7, our kernel lower bounds for H-Factor.

Kernels for Graph Matching Problems The sunﬂower kernelization in Theorem 1.4 immediately transfers to H-Matching for any ﬁxed graph H and yields kernels with O(k d ) edges. For graphs H, Moser [Mos09] shows that H-matching has kernels with O(k d−1 ) vertices where d = |V (H)|, but this gives only the weaker bound O(k 2d−2 ) on the number of edges. Here we show that for some speciﬁc H, we can obtain kernels that are better than the O(k d ) bound implied by Theorem 1.4. As a very simple example, we show this ﬁrst for K1,d -Matching, the problem of packing vertex-disjoint stars with d leaves.

Observation 3.1. K1,d -Matching has kernels with O(k 2 ) edges.

Proof. Let (G, k) be an instance of K1,d -Matching. If G has a vertex v of degree at least dk + 1, let e be an edge incident to v. We claim that we can safely remove e. If G − e has a K1,d -matching of size k, then this also holds for G. For the other direction,

3. Communicating Instances of Hard Problems let M be a K1,d -matching of size k in G. If M does not contain e, it is also a matching of G − e. Otherwise M contains e. Let M be obtained from M by removing the star that contains e. Now v is not contained in M. Since M induces at most d(k − 1) vertices, at least d + 1 neighbors of v are not contained in M. Even if we remove e, we can therefore augment M with a vertex-disjoint star that is centered at v and has d leaves.

This yields a star matching of size k in G − e.

For the kernelization, we repeatedly delete edges incident to high-degree vertices. Then every vertex has degree at most dk. Now we greedily compute a maximal star matching M and answer ’yes’ if M has size k. Otherwise, we claim that the graph has most O(k 2 ) edges: Since M induces at most dk vertices, the degree bound implies that at most (dk)2 edges are incident to M. The vertices of G outside of M have at most d − 1 neighbors outside of M because they would otherwise have been added to M. Thus there are at most (d − 1) · (dk)2 edges not incident to M. Thus G has at most d3 · k 2 edges.

By Theorem 1.7, it is unlikely that star matching problems have kernels with O(k 2− ) edges, so the above kernels are likely to be asymptotically optimal.

3.6. Other Applications To illustrate the use of our oracle communication model we describe two applications in the original framework of Bodlaender et al. [BDFH09].

For several NP-hard parameterized problems L there exists a ≤p -reduction from m OR(L) to L that maps t instances of size s each to a single instance of L of size poly(s) · t1+o(1) and parameter k = poly(s). For example, for problems like Sat and Clique, such reductions follow from the disjoint union construction mentioned in the introduction. For certain other problems such reductions are more involved but still exist (see [CFM07, BTY09, DLS09, FFL+ 09, KW10, KW09, KMW10] for examples).

Whenever such reductions exist, Lemma 3.1 implies that L does not have an oracle communication protocol of cost poly(k) unless coNP ⊆ NP/poly. In particular, such problems do not have kernels of polynomial size unless coNP ⊆ NP/poly.

Turing kernelizations Fernau et al. [FFL+ 09] exhibit a parameterized problem that has no standard kernel of polynomial size unless coNP ⊆ NP/poly, but does have a “Turing kernelization” of size O(k 3 ) in the following sense: The problem has a self-reduction which, on inputs of size s and parameter k, makes at most s queries, all of which are of size O(k 3 ). Using oracle communication protocols that implement general reductions rather than mapping reductions, Lemma 3.1 allows us to rule out the following for that problem, assuming coNP ⊆ NP/poly: Reductions that, on inputs of size s and parameter k, make at most s1− queries for some positive real and only query instances of bitlength bounded by a polynomial in k. In particular, this shows that the number of queries in the Turing kernel of Fernau et al. [FFL+ 09] is likely to be tight – reducing it from s to s1− for some positive real would collapse the polynomial-time hierarchy.

Density of NP-hard languages Buhrman and Hitchcock [BH08b] showed that a language S that contains no more strings of any length n cannot be hard for NP under reductions that make n1− than 2n o(1) queries for some positive real unless coNP ⊆ NP/poly. The proof in [BH08b] is a modiﬁcation of the proof in [FS08]. As an illustration of the power of our oracle communication protocols, we show that this result immediately follows from Lemma 3.1 using an oracle that actively tries to extract enough information from the ﬁrst player to decide the membership to S of any query that the ﬁrst player wants to make.

Suppose such an NP-hard language S does exist and consider the reduction from Sat to S that makes n1− queries. Since the reduction runs in polynomial time, the size of the queries is bounded by m = poly(n). Consider the lexicographic ordering of all strings of length up to m. The set S breaks up this ordering into at most 2 · |S ∩ {0, 1}≤m | + 1 intervals on which the membership to S is constant. In order for the oracle to decide the membership to S of a query, it suﬃces for the oracle to ﬁgure out which interval the query falls in. It can do so by running a binary search with the help of the ﬁrst player, who knows the exact query. The binary search only takes log(2 · |S ∩ {0, 1}≤m | + 1) bits of communication from the ﬁrst player to the oracle. Overall, this leads to a communication protocol for Sat of cost O(n1− · log(2 · |S ∩ {0, 1}≤m | + 1)) = O(n1− +o(1) ). Combining this protocol with the ≤p -reduction from OR(Sat) to Sat mentioned above, we obtain m an oracle communication protocol for OR(Sat) of cost O(poly(s) · t1− +o(1) ) on inputs consisting of t instances of size s each. As the latter quantity is O(t log t) whenever t is a suﬃciently large polynomial in s, Lemma 3.1 implies that coNP ⊆ NP/poly.

Notes The complementary witness lemma in §3.1, the AP-free set based proof in §3.2, the corollaries for satisﬁability in §3.3 and for covering problems §3.4, and the discussion in §3.6 are joint work with Dieter van Melkebeek and appeared in [DvM10]. The elementary proof in §3.2 and the results about packing problems in §3.5 is unpublished joint work with Dániel Marx.

4. Packing Edge-disjoint Cliques In this chapter we establish the Packing Lemma, Lemma 3.3, a graph-theoretical result that is useful for proving the hardness of oracle communication. It is a critical ingredient in the original proof of Theorem 1.2, the hardness of vertex cover, that appeared in [DvM10] and is spelled out in §3.2. Even though we later found a proof of Theorem 1.2 that does not use the Packing Lemma, it remains an important tool in the area. For example, we used it to prove Theorem 1.7, the hardness of cost O(n2− ) protocols for H-packing problems.

Our Construction We ﬁrst develop the construction for the case d = 2, i.e., for standard graphs, and then show how to generalize it to d-uniform hypergraphs for arbitrary d ≥ 2. We also discuss the relationship of our construction to earlier ones. We need to construct a graph P on few vertices such that (i) the edges of P partition into t cliques K1,..., Kt on p vertices each, and (ii) P contains no other cliques on p vertices.

We ﬁrst focus on realizing condition (i) and then see how to modify the construction to also realize (ii).

We construct P as an p-partite graph and think of the p partitions as the columns of a two-dimensional array of vertices, say of size q by p. Each of the Ki ’s then contains exactly one vertex from each of the p columns. Condition (i) expresses that P is a packing of the Ki ’s. The trivial packing consists of the disjoint union and requires q = t rows, resulting in p · t vertices in total. The trivial packing is wasteful because it leaves many of the potential edges unused. In an ideal packing each of the q 2 potential edges between two columns of the array are assigned to some Ki. This would only require a √ √ number of rows q = t and therefore p · t vertices. We can realize such a tight packing by picking the vertex of Ki in column j as the value of j under a hash function hi from a minimum 2-universal family. If q is a prime at least p, we can identify the rows as well as the columns with elements of Fq and use the family of linear functions over Fq.

More precisely, we construct P on the vertex set V (P ) = [p] × Fq as the union of the t cliques Ki on the vertex sets V (Ki ) = {(j, hi (j)) | j ∈ [p]}, where hi is a linear function over Fq uniquely associated with Ki. See Figure 4.1a. Note that there are q 2 distinct linear functions hi over Fq, so we can accommodate that many cliques Ki. Moreover, since two points deﬁne a line, every edge of P is contained in exactly one of the Ki ’s. For

4. Packing Edge-disjoint Cliques

** Figure 4.1.**

: (a) The placement of one of the Ki ’s. (b) Triangle on three consecutive abscissae.

√ arbitrary values of p and t, we can pick q to be the ﬁrst prime q ≥ max(p, t), resulting √ in a packing with O(p · max(p, t)) vertices.

Note that this P is in fact a complete p-partite graph and therefore fails to satisfy condition (ii) miserably – every clique of size p that has one vertex from each column is present in P, which is many more than just the Ki ’s. In order to remedy that problem, let us analyze the cliques of size p in P more closely.

Let K denote a clique of size p in P. Each of the p columns of P has to contain exactly one vertex of K, i.e., there exists a function h : [p] → Fq such that V (K) = {(j, h(j)) | j ∈ [p]}. We would like to ensure that K coincides with one of the cliques Ki, or equivalently, that the function h coincides with one of the linear functions hi.

Consider three consecutive columns, j, j +1, and j +2, and the triangle that K induces between them – see Figure 4.1b, where each edge is labeled by the linear function hi deﬁning the clique Ki to which the edge belongs. We claim that the highest-order coeﬃcients of those linear functions have to form an arithmetic progression. This follows by considering the two paths in Figure 4.1b that go from the vertex in column j to the one in column j + 2. The direct path on top involves an increase in y-value of 2a2, whereas the indirect path on the bottom involves an increase in y of a3 followed by an increase of a1. Since both paths end up at the same point, we have that

** 2a2 = a1 + a3, (4.1)**

or equivalently, that a3 − a2 = a2 − a1, or yet equivalently, that the sequence a1, a2, a3 forms an arithmetic progression. If we restrict the highest-order coeﬃcients of the linear functions to come from a subset A ⊆ Fq that contains no nontrivial arithmetic progressions of length three, the arithmetic progression a1, a2, a3 has to be trivial, i.e., a1 = a2 = a3. The latter implies that the three lines in Figure 4.1b coincide. As this implication holds for all choices of three consecutive columns, we conclude that all vertices of K lie on a single line deﬁned by one of the hi ’s, as we wanted.

Of course, the additional restriction on the highest-order coeﬃcients means that we need to choose q larger. However, we only need to increase q slightly thanks to the existence of eﬃciently constructible subsets A ⊆ Fq of high density that contain no nontrivial arithmetic progressions of length three. For our purposes the following classical result from additive combinatorics suﬃces.

** Lemma 4.1 (AP3 -Free Sets [SS42]).**

For every positive integer q there exists a subset A ⊆ Zq of size at least q 1−o(1) which contains no nontrivial arithmetic progressions of length three. Furthermore, such a set A can be determined in time polynomial in q.

For completeness we provide a proof of Lemma 4.1 in the Appendix. The resulting √ 1+o(1) graph P has p · q vertices where q = O max p, t.

This ﬁnishes the construction of the packing lemma for the case of standard graphs.