Improved lower bounds for multicolour diagonal Ramsey numbers

Big news in combinatorics today: David Conlon and Asaf Ferber have posted a 4-page preprint on arXiv that gives exponential improvements in the lower bounds on multicolour diagonal Ramsey numbers, when the number of colours is at least 3 (also see the post by Gil Kalai). The best known lower bounds so far were completely probabilistic, combined with an inductive argument of Leffman that gave better bounds for \geq 4 colours. The new construction on the other hand has a deterministic part to it, which in fact uses finite geometry. In this post I’ll describe their construction, and also show how to improve their bounds further using a simple observation.

The multicolour Ramsey number R(t; \ell) is the minimum n such that any colouring of the edges of K_n using \ell colours gives rise to a monochromatic K_t, which is known to exist by Ramsey’s theorem. Ignoring the lower order terms (which is what the big recent breakthrough of Ashwin Sah improved for \ell = 2), the best upper bound we have for all \ell \geq 3 is R(t; \ell) \leq \ell^{\ell t}, and improving the base of the exponent from \ell^\ell to anything smaller seems out of reach by our current methods. As far as the lower bounds are concerned, the best lower bounds for \ell = 3, 4 are R(t; 3) \geq (\sqrt{3})^t and R(t;4) \geq 2^t, as shown by Erdős in 1940’s. For larger \ell, one can improve the lower bound of R(t;\ell) \geq \sqrt{\ell}^t by using the following observation of Leffman:

R(t; \ell_1 + \ell_2) - 1 \geq (R(t;\ell_1) - 1)(R(t;\ell_2) - 1).

Conlon and Ferber improve all of these bounds as a corollary to the following result:

Theorem 1. For any prime p, R(t; p + 1) > 2^{t/2} p^{t/3 + o(t)}.

This, along with Leffman’s observation, gives an exponential improvement for the known lower bound on R(t;\ell) for all \ell \geq 3.

As the appearance of the prime p in this result might suggest, it uses finite fields. In fact, it uses the basic theory of polar spaces/quadratic forms over finite fields. By giving more precise estimate on a certain count in their proof, I can show the following:

Theorem A. For any prime p, R(t; p + 1) > 2^{t/2} p^{3t/8 + o(t)}.

Note that this is an exponential improvement. Let’s see how we can prove this.

Let p be a prime, and let V be the set of all isotropic vectors in \mathbb{F}_p^t with respect to the symmetric bilinear form B(x, y) = x_1 y_1 + \cdots + x_t y_t. In other words, V = \{(x_1, \dots, x_t) \in \mathbb{F}_p^t : x_1^2 + \cdots + x_t^2 = 0\}. It is well known, and in fact not so hard to show, that |V| \sim p^{t - 1}. Though all we need for the bounds is that |V| = p^{t - O(1)}, which is what Conlon and Ferber show in their paper (more precisely, they show |V| \geq p^{t-2}). Consider the following edge colouring of the complete graph on V: the edge \{x, y\} is coloured with i \in \mathbb{Z} if B(x, y) = i  \pmod{p} with i \neq 0, and if B(x, y) = 0 then it is coloured from the set \{p, p + 1\} uniformly at random and independently for all such edges. Thus we have a (p + 1)-colouring of the complete graph on |V| vertices, where the colours are just the integers 1, \dots, p + 1.

In each colour 1 \leq i \leq p - 1 we can show algebraically that the graph on these edges has no cliques of size larger than t, assuming t is not divisible p, which we will assume (for the exact argument see the paper of Conlon and Ferber). Therefore, we are aiming for a lower bound on R(t+1; \ell), but asymptotically it would be the same for R(t; \ell). Now with the random colouring in the last two colours, Conlon and Ferber show that for n = 2^{t/2}p^{t/3 + o(t)}, there exists a subset of V of size n such that the complete graph on these vertices, with the same edge-colouring as before, has no monochromatic clique of size t in colours p or p + 1. This is done by choosing a random subset where each vertex is chosen with a certain probability which is a function of n. Therefore, we have a random colouring of certain pairs of vectors, and a random subset of all the isotropic vectors.

A crucial step in the estimation used in the proof, that leads to this particular choice of n, is an upper bound on the number of cliques of size t in the graph defined on V where two vectors are adjacent if they are orthogonal to each other, that is, x \sim y if B(x, y) = 0. Note that in our edge colouring, the edges of this graph were the ones that got colours from \{p, p + 1\}. Let’s denote this number by N_t. Each such clique is a collection x_1, \dots, x_t of vectors in \mathbb{F}_p^t such that B(x_i, x_j) = 0 for all 1 \leq i \leq j \leq t. Conlon and Ferber estimate the number N_{t, r} of such cliques that have rank r, that is, the dimension of the space spanned by x_1, \dots, x_t is r, by N_{t, r} \leq p^{2tr - 3r^2/2 + r/2}, and then just use the upper bound N_t = \sum_{r = 1}^{t} N_{t, r} \leq t \max_{r = 1}^t N_{t, r}.

Until this point it’s all fine, but what we can improve upon is the argument that follows. They say that N_{t, r} is maximized for r = 2t/3 + 1/6 (which is not at all hard to see) and thus conclude that N_t \leq p^{2t^2/3 + o(t^2)}. But they miss a crucial linear algebraic fact at this point. The rank of a clique can never be larger than t/2. This is an easy fact that follows from (i) \dim S + \dim S^\perp = t for any subspace S, and (ii) S \subseteq S^\perp for a totally isotropic subspace, which is exactly what the span of the vectors forming the clique is going to be. In terms of the theory of polar spaces (or quadratic forms), it says that the rank of a polar space is always at most half of the dimension of the underlying vector space (something all the finite geometers must be familiar with). From this we see that N_t \leq t N_{t, t/2} \leq p^{5t^2/8 + o(t^2)}. After this, if you do the calculation of Conlon-Ferber then you’ll see that we can take n = p^{3t^2/8  + o(t^2)}, thus proving Theorem A. \square

Remark: If you are familiar with the theory of polar spaces, then you can easily estimate this number N_t, as each such clique must be contained in a generator (maximal totally isotropic subspaces) and you can count the total number of generators (see for example Theorem 4.11 in the book by Simeon Ball). Depending on the type of the polar space, you can even get an exact count but here we are just interested in the asymptotic results.

What’s next?

I think it’s really exciting that finite geometry, or in fact anything not completely probabilistic, can be used in the case of diagonal Ramsey numbers to improve the known bounds. This was recently shown to be true for the two-colour off-diagonal case by Mubayi and Verstraete, which was extended to multicolours by He and Wigderson, as I describe here and here. Therefore, finite geometry is paying a crucial role in developing Ramsey theory. I think a deeper understanding of these constructions, and new ideas from finite geometry (perhaps group theoretic ones) can lead to even further progress.

Posted in Combinatorics, Extremal Combinatorics, Finite Geometry, Incidence Geometry, Ramsey Theory | Tagged , , , , , , | 19 Comments

The dual version of Ryser’s conjecture

I talked about our new results related to Ryser’s conjecture in a previous post (also see an even earlier post). The conjecture, and its variants, have some interesting equivalent formulations in terms of edge colourings of graphs. While I was vaguely aware of that, I never looked into it in detail. Thanks to the new paper of DeBiasio, Kamel, McCourt and Sheats, I have now decided to study it properly (and share what I understand). In this post I would like to explain and prove this so-called duality.

Consider the following two statements (and try to prove them without reading forward):

(1) Given n points in \mathbb{R}^2, let k be the maximum number of points that you can choose from these such that no two points share an x or a y coordinate. Then these n points can be covered by k axis parallel lines, i.e., horizontal or vertical lines.

(2) Let G be a graph of independence number \alpha whose edges are coloured red or blue. Then the vertices of G can be covered by \alpha monochromatic sub-trees (different trees can have different colours but all the edges of one tree must have the same colour).

While both of these statements do have the common feature of being covering problems, it is not immediately clear that they are in fact both implied by the classical theorem of Kőnig. Ryser’s conjecture is a generalisation of this, which in terms of (1) can be seen as a generalisation to \mathbb{R}^n and in terms of (2) can be seen as a generalisation to r colours. We will focus on (2) here, as the connection to (1) is sort of immediate and I don’t know if the geometric point of view has any clear advantage.

Just as a recap, here is what Ryser conjectured. Given an r-partite r-uniform hypergraph H, we have \tau(H) \leq (r - 1) \nu(H), where \tau is the vertex cover number of a hypergraph and \nu is the matching number.

For a graph G, let tc_r(G) denote the smallest number of monochromatic connected subgraphs whose vertices cover V(G), over all r-edge colourings of G. Here a monochromatic connected subgraph can also be a single vertex. You can in fact take these subgraphs to be the connected components in the graph G_i formed by taking the edges coloured by the i-th colour. If the grah G_i is disconnected, then you get single vertex monochromatic subgraphs. Gyárfás observed that Ryser’s conjecture is equivalent to tc_r(G) \leq (r - 1) \alpha(G), for all graphs G. We will soon see why this is so.

Firstly, to get some familiarity with tc_r(G), observe that the well known fact that for any graph G, either G is a connected graph or its complement G^c is a connected graph, can be stated as tc_2(K_n) \leq 1 for all n. This is because we can look the edges of G and G^c giving us the 2-edge colouring of K_n, where n = |V(G)|. Also note that tc_r(G) \leq n_i, where n_i is the number of connected components in the graph G_i, given an r-edge colouring of G. The notation tc stands for tree cover, which comes from the fact that any connected graph has a spanning tree, and thus instead of covering with monochromatic connected subgraphs we can equivalently cover with monochromatic trees.

Let’s prove that these two conjectures are equivalent. So, say Ryser’s conjecture is true, and let G be a graph with each of its edge given a colour from \{1, \dots, r\}. Denote the subgraph formed by edges whose colour is i by G_i. Construct a hypergraph H as follows. The vertices of H are the connected components of G_i, for i = 1, \dots, r. These vertices are clearly partitioned into r sets, corresponding to the colour i. For each vertex v of G, define a hyperedge of H as the set of all monochromatic connected components that contain v. Each hyperedge has size r since for each i the vertex v is contained in a unique component of G_i. Thus, we have an r-partite r-uniform hypergrah H. See the images below for some concrete example.

Now a matching in H corresponds to a set of vertices in G such that no two of these vertices are together in any monochromatic connected component. In particular, there is no edge between in any two of these vertices. Therefore, \nu(H) \leq \alpha(G). A vertex cover of H corresponds to a collection of monochromatic components that cover all the vertices of G (as they correspond to the hyperedges of H). Therefore, tc_r(G) \leq \tau(H). Since we assumed that Ryser’s conjecture is true, we also have \tau(H) \leq (r - 1) \nu(H). Combining all of these inequalities, we get tc_r(G) \leq (r - 1) \alpha(G).

Now assume that the conjecture of Gyárfás is true, and let H be an r-partite r-uniform hypergraph. Construct a graph G, where each hyperedge is a vertex of G and two vertices are adjacent if the corresonding hyperedges intersect non-trivially. We colour an edge uv of G by the colour i if the hyperedges corresponding to u and v meet each other in the i-th part of the r-partition. If they meet in more than one parts, then pick a colour arbitrarily. Again, for concreteness, we can see that in the first image above starting from the hypergraph on the right hand side we reach the graph on the left hand side. In the second case, however, we get the following:

It follows directly from the definition that independent sets in the graph are in bijective correspondence with matchings in the hypergraph, and thus \alpha(G) = \nu(H). A monochromatic connected component in G, which is in fact a clique, corresponds to the vertex of the hypergraph where all of the hyperedges meet. Therefore, a cover of the vertices of the graph using monochromatic connected components gives rise to a vertex cover of H of the same size, which implies that \tau(H) \leq tc_r(G). Since tc_r(G) \leq (r - 1) \alpha(G) = (r - 1)\nu(H), we get \tau(H) \leq (r - 1) \nu(H), thus proving the equivalence of these conjectures.

This duality can be applied to other variants of Ryser as well. For example, in my work with Shagnik, Patrick and Tibor, we studied the min vertex cover size for the case when the hypergraph is t-intersecting, i.e., any two hyperedges share at least t vertices. This can be seen as determining min tc_r(K_n) with the added condition that in the edge colouring any two vertices are contained in at least t different monochromatic components. Another variant that we studied was that of k-wise t-intersecting hypergraphs. This can be seen as determining the tree cover number of the k-uniform complete hypergraphs, where every set of k vertices are contained in at least t different monochromatic components. In this formulation Király had already solved the problem for k \geq 3, t =1, which we were unaware of, and our work resolves the general case of arbitrary t \geq 1. This complete knowledge of the extremal function for k \geq 3, and very limited knowledge for k = 2 is quite interesting.

One particular advantage of the edge colouring formulation is that we have a richer structure there that can be exploited to study interesting variants that have no natural formulation in terms of hypergraphs. For this, and more, check out the paper of DeBiasio, Kamel, McCourt and Sheats. On the other hand, the hypergraphs formulation somehow feels more natural, and in fact many proofs (for example the r=3 case of Ryser) are best stated in terms of hypergraphs. So in conclusion, it’s a good idea to be aware of both.

Posted in Combinatorics, Extremal Combinatorics, Uncategorized | Tagged , , , , , , , , , , | Leave a comment

Heisenberg groups, irreducible cubics and minimal Ramsey

As I mentioned in a previous post, we recently improved the upper bound on a Ramsey parameter, in collaboration with John Bamberg and Thomas Lesgourgues. My favourite thing about this work is how it ends up using the properties of certain Heisenberg groups associated to generalized quadrangles, and cubic equations, to say something interesting about a problem in graph Ramsey theory. It makes me really happy to see such harmony in mathematics. In this post I will attempt to describe the main construction in our paper without getting into too many details.

As discussed earlier, we want to determine the smallest possible minimum degree of a graph G that has the property that every r-colouring of the edges of G contains a monochromatic clique of size k, and no proper subgraph of G has this property. This smallest degree is denoted by s_r(K_k), and the best known upper bounds on it in various ranges are described in the earlier post. All of these bounds use the equality between s_r(K_{k+1}) and the following packing parameter:

P_r(k) = the minimum n for which there exist K_{k+1}-free pairwise edge disjoint graph G_1, \dots, G_r on a common vertex set V of size n such that for any r-colouring of V there exists a G_i that contains a K_k all of whose vertices are coloured in the i-th colour.

It is known that s_2(K_{k+1}) = P_2(k) = k^2, as proved by Burr, Erdős and Lovász in 1976, but for r > 2 our knowledge of this function is quite limited. (As an exercise, try to find such edge disjoint G_1 and G_2 on a common set of k^2 vertices.)

Since we are just looking at upper bounds for now, one can show P_r(k) \leq n by finding a packing of G_i‘s where each G_i is K_{k+1} free but every collection of n/r vertices in G_i induces a copy of K_k. This works because in any r-colouring of V there must exist a colour i which occurs at least n/r times, and then in the graph G_i we have a K_k all of whose vertices are coloured in the i-th colour. This has been the main approach so far in showing upper bound on s_r(K_{k+1}) for r \geq 3, k \geq 2, and we will be doing the same.

Before finding such a collection of edge disjoint G_i‘s, the first step is to see if we can find even a single graph G on n vertices which is K_{k+1}-free and has the property that any set of n/r vertices in G induce a K_k. For k = 3, we are looking for triangle-free graphs with independence number < n/r. This should remind you of the Ramsey number R(3, t). In fact, such a G on n vertices exists if and only if n < R(3, \lceil n/r \rceil). This connection can in fact be used to prove that s_r(K_3) = \Theta(r^2 \log r) (see Guo and Warnke) by finding not just one triangle-free graph but a packing of triangle-free graphs.

More generally the ideas from the determination of the so-called Erdős-Rogers function were used to prove that for any fixed k, and r large enough, P_r(k) \leq C (\ln r)^{8k^2} r^2 (see Fox, Grinshpun, Liebenau, Person and Szabó). This is pretty satisfactory as far as the realm of k fixed and r \rightarrow \infty is concerned, but in other ranges this is a terrible bound compared to the more general result of FGLPS that shows P_r(k) \leq 8k^6 r^3 for all k \geq 2, r \geq 3. Proofs of all these results rely on finite geometric ideas, combined with the probabilistic method, and that’s exactly the path we take as well.

Here is a general idea to construct these graphs. Consider a partial linear space (a.k.a. a linear hypergraph) that is “triangle-free” in the sense that no three lines pairwise meet each other in three distinct points. Then because of this triangle-freeness the only cliques (of size at least 3) in the graph defined on the point-set with two points adjacent if they are collinear, are those contained inside a line. Therefore, we can try to destroy all of the cliques of size \geq k + 1, while maintaining some structure so that any set of n/r points contains a K_k. This idea, along with a probabilistic argument, can be used to prove the following lemma (which is Lemma 3.1 in our paper) about packings of partial linear spaces:

Lemma 1. Say there exist triangle-free partial linear spaces (P, \mathcal{L}_1), \dots, (P, \mathcal{L}_r), each of order (s, t), such that (P, \cup_{i = 1}^r \mathcal{L}_i) is also a partial linear space. If s \geq 2rk \ln k and t \geq 2k(1 + \ln r), then P_r(k) \leq |P|.

The order (s, t) denotes that every line has s + 1 points on it and every point has t + 1 lines through it. The condition that (P, \cup_{i = 1}^r \mathcal{L}_i) is a partial linear space, is really important, and we overlooked it in our previous version of the paper which led to an error. I would like to thank Simona Boyadzhiyska for pointing this out to me.

The best sources of such triangle-free partial linear spaces, and in fact the only sources I know of, are generalized quadrangles (GQ). One can take any “regular” subhypergraph of a GQ to get these spaces. The main difficulty though is to find a packing of such spaces, as required in Lemma 3.1. In Fox et al. they managed to find such a packing with s = t = q - 1, an arbitrary prime power q, using what is known as the T_2(O) generalized quadrangle (even though they don’t call it that in the paper). Since |P| = q^3 in their packing, we get the upper bound of P_r(k) \leq 8k^6 r^3 by picking a prime q such that rk^2 < q \leq 2rk^2. We have managed to find such a packing with s = q^2 - 1 an t = q - 1, for an arbitrary odd prime power q, that gives us the following improvement:

Theorem 2. P_r(k) \leq C k^5 r^{5/2}.

The construction relies on a group theoretic model of generalized quadrangles, introduced by Bill Kantor in 1980. Consider the Heisenberg group E defined on the set \mathbb{F}_{q^2} \times \mathbb{F}_{q^2} \times \mathbb{F}_{q} by the following group operation (x, z, y) \circ (x', z', y') = (x + x', z + z', y + y' + z^qx' + zx'^q). Note that (a, b) \rightarrow a^q b + ab^q is a bilinear map on \mathbb{F}_{q^2} when identified as the vector space \mathbb{F}_q^2.

Our points will be the elements of E, and thus we have a point set of size q^5. It is known that E contains subgroups A_i, for i \in \mathbb{F}_q, such that (1) |A_i| = q^2; (2) A_i \cap A_j = \{(0, 0, 0\}) for any i \neq j and (3) for any three distinct i, j, k we have (A_i A_j) \cap A_k = \{(0, 0, 0)\}. More explicitly, we have A_i = \{(a, ai, a^{q+1}i) : a \in \mathbb{F}_{q^2}\}. The partial linear space obtained by taking the set of all cosets of A_i' as lines, is of the order (q^2 - 1, q - 1) and the third condition on these subgroups ensures that it is triangle-free. (This is a subgeometry of the Hermitian generalized quadrangle H(3, q^2).)

We obtain a packing of partial linear spaces isomorphic to this one, on the same point set E, by taking a group of automorphisms of E. Each such automorphism \tau defines a new set of subgroups A^\tau_i, i \in \mathbb{F}_q, which have the same properties. What we want from our automorphisms though, because of the condition in Lemma 1 regarding (P, \cup \mathcal{L}_i) being a partial linear space, is that every subgroup in \{A_i : i \in \mathbb{F}_q\} must intersect ever subgroup in \{A_i^\tau : i \in \mathbb{F}_q\} trivially. This turns out to be quite tricky, and we had to be very careful in our choice of automorphisms. In particular, none of the inner automorphisms of E would work. In the end end we managed to find q^2 outer automorphisms of E that give us the required packing of size q^2. Very interestingly, to prove that our packing is good, we had to rely on the following two results on irreducible cubics:

Proposition 1: (Kim-Kim-Yie 2009) There is a bijective correspondence between the set of irreducible polynomials of the form x^3 - cx^2 + c^q x - 1 in \mathbb{F}_{q^2}[x] and the set of irreducible polynomials of the form x^3 + ax^2 + bx + c in \mathbb{F}_q[x] where -b is a non-square in \mathbb{F}_q.

Proposition 2: For every finite field \mathbb{F}_q, and every \lambda \in \mathbb{F}_q, there exists an irreducible cubic x^3 + ax^2 + bx + c where b = \lambda.

This can be shown via the Hansen-Mullen Irreducibility Conjecture (which was proved by Wan for all but finitely many cases, and those small cases were handled by Ham and Mullen). See this recent paper of Tuxanidy and Wang, and the references therein, for more on this conjecture and its generalizations where we look for irreducible polynomials with more than one of its coefficients prescribed.

Posted in Combinatorics, Extremal Combinatorics, Finite Geometry, Incidence Geometry, Ramsey Theory | Tagged , , , , , , , , | 1 Comment

Generalized polygons in extremal combinatorics

Jacques Tits invented generalized polygons to give a geometrical interpretation of the exceptional groups of Lie type. The prototype of these incidence geometries already appears in his 1956 paper, while they are axiomatically defined in his influential 1959 paper on triality. The finite versions of these objects have played an important role in the study of finite simple groups and in the general the theory of permutation groups. In the 70’s and 80’s, generalized polygons also started being studied under the theory of distance regular graphs (and more broadly association schemes). The theory of these geometries has been developed for its own sake as well (see this and this), and there are several long standing open problems about them that have attracted the attention of many mathematicians. In this post I want to discuss the deep connection that generalized polygons have had with extremal combinatorics, specifically with Ramsey theory and Turán type theorems.

Finite projective planes, which are in fact a special case of generalized polygons known as generalized triangles, have always been deeply connected to extremal combinatorics. They give us the optimal examples for several extremal problems on both graphs and hypergraphs. For example, they were behind the construction of Esther Klein, mentioned in the 1938 paper of Paul Erdős, which in fact determines the maximum number of edges in a C_4-free graph on n vertices asymptotically (it’s fun to look at the original number theoretic problem which motivated this construction and also the fact that there is no mention of the term ‘projective planes’ even though the concept is there). They are also present in the early constructions for some special cases of the classical Zarankiewicz problem. These early results form the foundation of extremal graph theory.

Finite generalized quadrangles (GQ) have played a similar role, but I have noticed that many times they don’t get mentioned explicitly by the extremal combinatorics community. There appears to be a gap in knowledge, which is probably because of the slightly more involved constructions of classical GQ’s. The so-called Benson graph of girth 8, that give us the asymptotic result for the Turán number of C_6,  is simply the incidence graph of a GQ. While Benson doesn’t say that explicitly in his 1966 paper, it’d be very surprising if he actually din’t know it as his PhD thesis was called Generalized Quadrangles and (B,N) Pairs. Similarly, the so-called Wenger graph from 1991, is just an induced subgraph of the incidence graph of a well known GQ, even though there is no mention of generalized quadrangles in Wenger’s paper. See my earlier blog post on Wenger graph for more details on this. Once you see that, it should be clear how Kopparty’s graph is related to the same GQ. Thus,  the best known construction of optimal pseudorandom triangle-free graphs can be obtained via generalized quadrangles! This was also shown by Conlon, who started with the collinearity graph of a generalized quadrangle and gave a random construction from it that resulted in almost optimal triangle-free pseudorandom graphs. The basic idea behind these two constructions is in fact the same, just that Kopparty uses an algebraic method to achieve the end goal whereas Conlon uses a probabilistic one (which is why he ends up with an almost optimal graph).

These optimal triangle-free pseudorandom graphs give us the best known constructive lower bounds on the Ramsey number R(3, t). As these can be obtained via generalized quadrangles, we see an application of GQ’s in Ramsey theory. Interestingly, GQ’s can also be used to give us the best known constructive lower bounds on the Ramsey number R(4, t), as shown by Kostochka, Pudlák and Rödl. They have even been used in some hypergraph Ramsey problems to obtain tight lower bounds that beat the probabilistic ones (see Kostochka-Mubayi-Verstraëte). There are some other problems in Ramsey theory where GQ’s have shown up as well, see for example this and this.

In my recent work with John Bamberg and Thomas Lesgourgues on a minimal Ramsey problem we have also used generalized quadrangles. We noticed that a particular group theoretic model of GQ’s, due to Kantor, was perfectly suitable for this problem. I am certain that this idea will be useful in other problems in extremal combinatorics problems as well. I also believe that new interesting applications of generalized hexagons and octagons, as in the recent work of Mubayi and Verstraete (also described here), are just waiting to be found. One of the necessary steps for this is to learn more about these beautiful mathematical objects, and for that I would like to end with a few references.

Posted in Combinatorics, Extremal Combinatorics, Finite Geometry, Incidence Geometry, Ramsey Theory, Uncategorized | Tagged , , , , , , , , , , | Leave a comment

Minimal Ramsey problems

Thanks to Anita Liebenau, I have recently been introduced to some very interesting questions in Ramsey theory and I have been working on them for the past few months in collaboration with various people. In my recent joint work with John Bamberg and Thomas Lesgourgues (Anita’s PhD student), that has just appeared on the arXiv, we have  proved an interesting new result in one of these problems. Before I discuss that, let’s look at some background in this post.

The well known fact that in any party of six people either at least three of them are mutual strangers or at least three of them are mutual acquaintances (see this or this), which serves as a nice introduction to Ramsey theory, can be written in graph theoretical terms as follows: any two colouring of the edges of the complete graph K_6 contains a monochromatic K_3. Let’s write this statement as K_6 \longrightarrow (K_3)_2, which we read as “K_6 is 2-Ramsey for K_3”.

We can generalise this notion to any two finite graphs G and H; let’s write G \longrightarrow (H)_r if any r-colouring of the edges of G contains a monochromatic copy of H, and hence G \longarrownot\longrightarrow (H)_r is a shorthand for the statement that here exists an r-colouring of the edges without a monochromatic H. Observe that if G \longrightarrow (H)_r then for any graph G' containing G as a subgraph we have G' \longrightarrow (H)_r, or equivalently, if G \longarrownot\longrightarrow (H)_r then for any subgraph G' of G we have G' \longarrownot\longrightarrow (H)_r. For example, here is a red/blue colouring of K_6 minus an edge that has no monochromatic triangles, thus showing that K_6 - e \longarrownot\longrightarrow (K_3)_2.

This example in fact shows that the graph K_6 is minimal with respect to being 2-Ramsey for K_3, since K_6 is 2-Ramsey for K_3 but no proper subgraph of K_6 has this property.

In general, we say that a graph G is r-Ramsey minimal for another graph H, if G \longrightarrow (H)_r and for any proper subgraph G' of G we have G' \longarrownot\longrightarrow (H)_r. Let’s denote the set of all graphs G that are r-Ramsey minimal for a graph H by \mathcal{M}_r(H). Most of the fundament problems in graph Ramsey theory can be seen as trying to understand this set for various graphs H. The classical result of Ramsey can be rephrased as:

Theorem 1. (Ramsey, 1930) The set \mathcal{M}_r(H) is non-empty for any finite graph H.

The well studied r-colour Ramsey number of a graph H, denoted by R_r(H) is simply the smallest number of vertices among all graphs in \mathcal{M}_r(H). Determining, or even estimating, the number R_2(K_k), which is denoted by R(k), has been one of the central longstanding open problem in combinatorics. The work done towards resolving this has led to significant development in combinatorics.  It was shown by Erdős and Szekeres  in 1935 that R(k) \leq 2^{2k}, and in 1947 Erdős proved the lower bound of R(k) \geq 2^{k/2} using a probabilistic argument that led to the creation of what we now call the probabilistic method in combinatorics. But, despite decades of research we still haven’t been able to improve the base of the exponents in these upper and lower bounds! For the most recent development on the upper bounds see this paper by Ashwin Sah.

Other parameters like the number of edges, chromatic number and maximum degree of graphs in \mathcal{M}_r(H), have also been studied extensively. The question that I have been looking into is determining the smallest minimum degree in \mathcal{M}_r(H), which is denoted by s_r(H). In particular, the new result concerns s_r(k) = s_r(K_k).

For any graph H, the graph K_n with n = R_r(H) is r-Ramsey for H, but it might not be minimal. Still, we can look at a subgraph of K_n which is minimal to conclude that s_r(H) \leq \delta(K_n) = n - 1 = R_r(H) - 1. We can see that this gives us the bound s_2(k) \leq 2^{2k} - 1, using the upper bound on R(k) I mentioned above. In fact, we can’t hope to get any subexponential upper bound on s_2(k) using this easy argument as we also saw that R(k) \geq 2^{k/2}. Quite surprisingly, we do know s_2(k) exactly (which is something that rarely happens in Ramsey theory), and it is far far away from an exponential function!

In 1976, Burr, Erdős and Lovász proved that s_2(k) = (k - 1)^2. Thus, they showed that there exists a graph in \mathcal{M}_r(K_k) with minimum degree equal to (k - 1)^2, and there can’t be any graph in this set with a smaller minimum degree. Determining s_r(k) for r > 2 is still widely open. It’s an instructive exercise to try to come up with a graph in \mathcal{M}_2(K_3) with minimum degree 4. The proof of the upper bound by Burr, Erdős and Lovász uses the existence of certain (very large) graphs, known as signal senders, that are (i) not 2-Ramsey for K_k and (ii) in any such monochromatic H-free colouring of these graphs two specific edges of the graph receive a specific colour pattern. These signal senders are then used to construct 2-Ramsey minimal graphs with the required minimum degree. The same idea has been very fruitful in determining s_r(H) for other graphs H, and some related problems (see this, this, this and this).

Signal senders are also an important ingredient in the work of Fox, Grinshpun, Liebenau, Person and Szabó,  where they reduce the problem of determining s_r(k) to another extremal problem on graph packings. This reduction helped them determine s_r(k) up-to polylogarithmic factors assuming k \geq 3 is a constant and r \rightarrow \infty. The constants C_k in this bound were terribly large, and in particular not polynomial in k. Therefore, they proved the general upper bound of s_r(k) \leq 8 (k - 1)^6 r^3. Pushing their idea further, and using a nice finite geometric and group theoretic ingredient that I haven’t seen used anywhere in extremal combinatorics so far,  we have managed to improve this general upper bound.

Theorem 2. (Bamberg, Bishnoi, Lesgourgues) There exists an absolute constant C such that for all r \geq 2 and k \geq 3, we have s_r(k) \leq C (k - 1)^5 r^{5/2}.

I will discuss the details of this result in the next post, and explain the finite geometric aspect of it which I find super interesting. I am really hopeful that the ideas involved will be useful in obtaining new results in other Ramsey problems.




Posted in Combinatorics, Extremal Combinatorics, Finite Geometry, Incidence Geometry, Ramsey Theory | Tagged , , , , , , | 2 Comments

Extending Ryser’s conjecture

In an earlier post, I talked about Ryser’s conjecture on r-partite r-uniform hypergraphs, that has stayed open for all r \geq 4 despite a considerable effort by several mathematicians over a period of 50 years. A bit more effort has been spent on the special case of intersecting hypergraphs, that is, hypergraphs in which every two edges share at least one common vertex. Here the wishful thinking is that this will shed some light on the original conjecture (which one might expect, but there are some surprises), or mathematicians are just following Pólya’s principle: “If you can’t solve a problem, then there is an easier problem you can solve: find it.”. Sadly, we can’t even solve this easier problem; the conjecture is open for intersecting hypergraphs for all r \geq 6. Therefore, it is quite natural to look for simpler problems by adding more restrictions. One of the restrictions is to consider strictly 1-intersecting, a.k.a., linear hypergraphs, that is, every two distinct edges share exactly one common vertex. This seems quite strong, but even in this situation we only know the conjecture to be true for r \leq 9.

Here are two alternate conditions one could impose that are stronger than the hypergraph being intersecting:

  1. (k-wise intersection) any collection of k edges contains a common vertex.
  2. (t-intersection) any two distinct edges have at least t vertices in common.

Note that in (1.) we can have repetitions in the collection of edges, and thus in particular k-wise intersection implies k'-wise intersection for any k' < k.

In this post I want to share some good news regarding these two new settings. In a joint work with my former colleagues, Shagnik Das, Patrick Morris and Tibor Szabó we have completely solved (1.) for all k \geq 3. Interestingly, we used some new results that we proved for (2.) in doing that. To describe our results, let us first define the following function which is central to Ryser’s conjecture:

\mathrm{Ryser}(r, t) = \max \{\tau(H) : H \text{ is a } t\text{-intersecting } r\text{-uniform } r\text{-partite hypergraph}\},

where \tau(H) is the vertex cover number of H, that is, the smallest number of vertices that cover all edges.

Ryser’s conjecture (for intersecting hypergraphs) states that \mathrm{Ryser}(r, 1) \leq r - 1 (note that every edges is a vertex cover, and hence \mathrm{Ryser}(r,1) \leq r is trivially true). If true, this bound will be sharp for infinitely many values of r.

The problem of determining \mathrm{Ryser}(r, t) was introduced a few years back independently by BustamanteStein, and KirályTóthmérész, who made an analogous conjecture for this function that says \mathrm{Ryser}(r, t) \leq r - t. While Ryser’s conjecture is a special case of this one, it is in fact implied by Ryser’s conjecture, that is, \mathrm{Ryser}(r,1) \leq r - 1 ~ \forall r \implies \mathrm{Ryser}(r,t) \leq r - t ~\forall r > t \geq 1. Therefore, one might hope to make progress over this new conjecture for larger values of t, even though t = 1 seems out of reach at the moment. This is what was done, as Bustamante and Stein proved the conjecture for t \geq (r - 2)/2 whereas Király and Tóthmérész proved it for t \geq (r + 1)/4. Moreover, Bustamante and Stein observed that their conjecture is not sharp as \mathrm{Ryser}(5, 2) = 2 instead of being equal to 5 - 2 = 3. This, was the starting point of our work. In fact, I have picture from the exact moment where we started working on proving that the conjecture might not be sharp for other values:


Tibor and Patrick thinking hard about \mathrm{Ryser}(r,t)

This is from the 2018 research group workshop organised by Tibor where I had proposed this problem and the three of us worked on it for two days. We managed to prove that for large enough t, the correct value of \mathrm{Ryser}(r,t) is in fact half of what was conjectured:

Theorem 1. \mathrm{Ryser}(r,t) = \lfloor (r - t)/2 \rfloor + 1, for all t + 1 \leq r \leq 3t - 1.

Once we came back to Berlin from our workshop, Shagnik joined our team:


This is him during the workshop when he was helping us cook Goulash, not realising that helping us on Ryser’s conjecture could also be fun!

He proved that Theorem 1 in fact implies the following result on the function

\mathrm{Ryser}(r, t, k) = \max \{\tau(H) : H \text{ is a } k\text{-wise } t\text{-intersecting } r\text{-uniform } r\text{-partite hypergraph}\}.

Theorem 2. For any k \geq 3 and t \geq 1, \mathrm{Ryser}(r,t,k) = \lfloor (r - t)/k \rfloor + 1.

It is quite interesting that the sharp bound that we prove in Theorem 1 is precisely what was needed for Theorem 2.

In our paper, we prove various other things, which includes showing the bound \mathrm{Ryser}(r,t) \leq r - t for a larger range of values of t, and the same bound for strictly t-intersecting hypergraphs for a much larger range of values of t. We also pose some nice open problems, and make the following bold conjecture (that only a proper subset of us believe):

Conjecture. \mathrm{Ryser}(r,t) = \lfloor (r - t)/2 \rfloor + 1 for all 2 \leq t \leq r - 1.

I can definitely say that despite our beliefs all of us would be happy to see a counterexample to this Conjecture.

We also introduce another variation of the problem where instead of a normal vertex cover, one looks at s-covers, that is, every edge must share at least s vertices with the covering set. Here we make some modest progress and hope that it will inspire some future work.

Posted in Combinatorics, Extremal Combinatorics | Tagged , , , , , , , | 1 Comment

Bounds on Ramsey numbers from finite geometry

In an earlier post I talked about the work of Mubayi and Verstraete on determining the off-diagonal Ramsey numbers via certain optimal pseudorandom graphs, which are not yet known to exist except for the case of triangles. Beyond this conditional result, they also improved the lower bounds on certain cycle-complete Ramsey numbers using generalized polygons. In some cases they even beat the exponent provided by the general F-free process studied by Bohman and Keevash, which

“appears to be the first instance of a graph F containing cycles for which random graphs do not supply the right exponent for r(F, k).”

(I have changed the t from their paper to k for my own convenience). In this post I will talk about the basics of their construction, hoping that these ideas can be used in future to obtain similar results on Ramsey numbers.

One of the fundamental objects that are studied in finite geometry are partial linear spaces, a.k.a., linear hypergraphs. These are point-line geometries with the property that through any two points there is at most one line, or equivalently, these are hypergraphs in which every two hyperedges share at most one common vertex. A partial linear space has order (s, t) if ever line contains exactly s + 1 points and through every point there are exactly t + 1 lines (in hypergraph terminology we have an (s + 1)-uniform (t + 1)-regular hypergraph). For every partial linear space we can define the collinearity graph on the points by making to points adjacent if they lie on a common line (in hypergraph terminology this is sometimes referred to as the skeleton of the hypergraph). Note that the lines of the partial linear space correspond to cliques of size s + 1 in the collinearity graph, and any two such cliques meet in at most a vertex. From now on denote the number of points in the partial linear space by n and the number of lines by m. In several classes of such spaces these numbers are a function of s, t. For example, in the case of a finite projective plane we have s = t and n = m = (s - 1)^2 + (s - 1) + 1. Note that we always have n(t + 1) = m(s + 1).

For a graph F, the Ramsey number r(F, k) can be defined as the largest number of vertices for which there exists a graph that has no subgraphs isomorphic to F, and has independence number < k (the existence of such a maximum is a special case of Ramsey’s theorem). We would like to know the growth of this function as k \rightarrow \infty for various kinds of fixed size graphs F. When F is the complete graph, then this is the classical off-diagonal Ramsey number that we saw in the earlier post.

The basic idea behind the construction that I want to discuss in this post is as follows:

start with a partial linear space for which all copies of F in the collinearity graph are contained inside the cliques corresponding to the lines and then destroy all of these copies of F while keeping the independence number under control.

Let’s start with the independence number. If I is a set of pairwise non-collinear points, then by an easy double count of pairs (p, \ell) where p is a point of I and \ell is a line through p, we get |I|(t + 1) \leq m. Therefore, the independence number is always at most m/(t + 1).

If the graph F contains odd cycles (which is the case we will focus on), then one easy way of getting rid of all the copies of F is to convert each clique of size s + 1, which is where all copies of F lie by our assumption, into a bipartite graph on the same set of vertices by destroying some edges. To be more precise, for each line \ell we take two disjoint subsets A_\ell, B_\ell of the points on \ell, with A_\ell \cup B_\ell = \ell and we define a graph on the set of all points where two points x and y are adjacent if (i) they lie in a common line \ell and (ii) x \in A_\ell, y \in B_\ell or x \in B_\ell, y \in A_\ell. This graph is well defined because we have a partial linear space, that is, no two points can lie in two different lines.

While what we do above ensures that there are no copies of F in our collinearity graph, destroying edges would typically increase the independence number of the graph. What we can show though is that there does exist a choice of bipartition on each line so that the independence number cannot increase by much. We will show this by an easy probabilistic argument.

Here is our probability space. For each line \ell, we choose a bipartition (A_\ell, B_\ell) of the set of s+1 points on \ell uniformly at random (from the set of all bipartitions). Then for any line \ell, the probability that a point x of \ell lies in A_\ell (or B_\ell) is equal to 1/2. We do this independently for each line.

Let I be a subset of points, of size k, and for each line \ell let k_\ell be the number of points of I contained in \ell. The probability that I is an independent set in the graph G obtained by removing all edges within A_\ell and B_\ell in the collinearity graph, for each line \ell, is equal to the probability that for every line \ell the set I \cap \ell is either a subset of A_\ell or B_\ell. Since all these are independent events, with an event corresponding to \ell having probability (1/2)^{k_\ell} + (1/2)^{k_\ell} = 2^{1 - k_\ell}, the probability that I is an independent set is equal to \prod_{\ell} 2^{1 - k_\ell} = 2^{m - \sum k_\ell} = 2^{m - (t + 1)k}. The probability that there exists an independent set of size k is then at most \binom{n}{k} 2^{m - (t + 1)k}, by the union bound. Therefore, to show the complementary event that there exists a bipartition of each line for which the graph defined above has no independent sets of size k, it suffices to show that this probability is < 1. By the inequality \binom{n}{k} \leq n^k = 2^{k \log n}, we see that it suffices to show

2^{k \log n + m - (t + 1)k} < 1.

This is true if and only if k > m/(t + 1 - \log n). Thus, for, say k = \lceil m/(t - \log n) \rceil, we get R(F, k) \geq n as there exists an F-free graph on n vertices with independence number < k.

This shows that to get a good lower bound on R(F, k), what we want from our partial linear space of order (s, t) on n points and m lines is \log n << t and n >> m, beyond the condition we started with of all copies of F being contained in the (s + 1)-cliques.

If you are still with me right now, then here is a concrete example. We know that any triangle in the collinearity graph of a generalized quadrangle (or see this) must be contained in the clique corresponding to a line. We also know that a generalized quadrangle (GQ for short) of order (s, t) has n = (s + 1)(st + 1) points and m = (t + 1)(st + 1) lines. For n >> m, we would thus want s >> t and for t >> \log n we would want t to not be a constant. Considering this, the best choice that we have with us here is s = t^2, and t \rightarrow \infty, as we know that in any GQ one always has s \leq t^2. Such GQ’s are known to exist whenever t is a prime power (see Chapter 5 of my lecture notes). From them we get the lower bound R(3, k) \geq (1 + o(1))k^{5/3}, which is not that great considering we know R(3, k) = \Theta(k^2/\log k). But there is some good news, as Mubayi and Verstraete have shown. If we consider generalized hexagons of order (q^3, q), then we get

R(C_5, k) \geq (1 + o(1))k^{11/8},

and from generalized octagons of order (q^2, q) we get

R(C_7, k) \geq (1 + o(1)) k^{11/9}.

Both of these bounds are better than what one would get from purely probabilistic arguments.

I hope that by using other finite geometries, or by improving the argument above, we would be able to improve bounds on other Ramsey numbers as well.

Posted in Combinatorics, Extremal Combinatorics, Finite Geometry, Incidence Geometry, Ramsey Theory | Tagged , , , , , , , , | 3 Comments

Kopparty’s graph

Alon’s construction of optimal pseudorandom graphs from 1994 is useful for obtaining several interesting combinatorial results in various areas of mathematics, some of which are highlighted in this survey of Noga from last year (also see my previous post). In this post I would like to discuss an alternate construction, due to Swastik Kopparty, that has the same properties and thus can be used instead to prove all of these interesting results. It seems like this construction is not so well known (I learned about it from a conversation with Noga), even though it is a bit simpler to describe and in some sense more general. The construction appears in Kopparty’s lecture notes on Cayley Graphs.

What we are looking for is an infinite sequence of triangle-free d-regular graphs on n vertices with d = \Theta(n^{2/3}) and the second largest eigenvalue equal to \Theta(n^{1/3}).

Let p \neq 3 be a prime (it will become clear later why we are excluding 3), and let \mathbb{F}_{q} be a finite field of characteristic p, that is q = p^h for some integer h. For a subset S of the vector space V = \mathbb{F}_q^3 we define the Cayley graph G = Cay(V, S) as the graph on V with u \sim v if u - v \in S. This graph is undirected if for every s \in S we have -s \in S and loopless if 0 \not \in S, both of which we will assume. Kopparty’s construction, much like Alon’s construction, is a Cayley graph of this form for some suitably chosen S. The property of S that ensures that G is triangle-free is that s_1 + s_2 + s_3 \neq 0 for any s_1, s_2, s_3 \in S (not necessarily distinct). This explains why we assume p \neq 3, as otherwise s + s + s = 0 for every s \in S. Consider the following set:

S = \{(xy, xy^2, xy^3) : x \in T, y \in \mathbb{F}_q^{\times}\}

where T is a subset of \mathbb{F}_q with the property that no three elements of T sum to 0, that is, the same property that we wanted from S. Any such T implies triangle-freeness of G: if x_1(y_1, y_1^2, y_1^3) + x_2(y_2, y_2^2, y_2^3) + x_3(y_3, y_3^2, y_3^3) = 0, then either y_1 = y_2 = y_3 and we have a contradiction because of the property of T or we get a non-trivial linear combination of the rows of a Vandermonde matrix equal to 0, which is also a contradiction (note that 0 \not \in T and y_i \neq 0 for all i).

The specific choice of T will soon become clear, but for now let us focus on the degree and eigenvalues of G. The degree is equal to |S| = |T|(q - 1), while the number of vertices is q^3 (thus we would later require |T| = \Theta(q)). The eigenvalues of Cayley graphs, at least for the case of finite abelian groups, are easy to describe. I recommend chapter 5 of the lecture notes of Luca Trevisan for the basic theory. For a = (a_1, a_2, a_3) \in V, the eigenvalue \lambda_a corresponding to the eigenvector \chi_a : V \rightarrow \mathbb{C} defined by \chi_a(x_1, x_2, x_3) = \omega^{\mathrm{Tr}(a_1 x_1 + a_2 x_2 + a_3 x_3)} satisfies \lambda = \sum_{x \in S} \chi_a(x), where \mathrm{Tr} is the absolute trace function defined by \mathrm{Tr}(\alpha) = \alpha + \alpha^p + \cdots + \alpha^{p^{h - 1}}, and \omega = e^{2 \pi i/p} is a primitive p-th root of unity.

The eigenvalue corresponding to (0, 0, 0) is d = |S|, which is the largest eigenvalue. So let a = (a_1, a_2, a_3) \in V \setminus \{0\}. We have

\lambda_a =  \sum_{y \in \mathbb{F}_q^\times} \sum_{x \in T} \omega^{\mathrm{Tr}(x f(y))},

where f(y) = a_1 y + a_2y^2 + a_3 y^3. We now make the choice of T = \{u \in \mathbb{F}_q : \mathrm{Tr}(u) \in \{-1, 1\}\}. Note that |T| = 2q/p, unless p = 2 when |T| = q/p, T = -T and no three elements of T can sum to 0, as p \neq 3. From the following lemma, whose proof is a fun exercise in algebra, it follows that several of the terms in the expression for \lambda_a are 0.

Lemma. Let z \in \mathbb{F}_q \setminus \mathbb{F}_p. Then for every k \in \mathbb{F}_p we have \sum_{x \in \mathrm{Tr}^{-1}(k)} \omega^{\mathrm{Tr}(x z)} = 0.

Thanks to this Lemma we can write

\lambda_a = \sum_{y \in f^{-1}(\mathbb{F}_p) \setminus \{0\}} \sum_{x \in T} \omega^{\mathrm{Tr}(xf(y))}.

For any fixed k \in \mathbb{F}_p, the number of y for which f(y) = a_1 y + a_2 y^2 + a_3 y^3 = k, is at most 3 since it is an equation of degree at most 3 (and at least $1$) in one variable. Therefore, the number of summands is at most 3p|T|, which implies that |\lambda_a| \leq 3p|T|.

To summarise, we have a triangle-free graph on q^3 vertices which is d-regular with d = 2q(q - 1)/p and has all eigenvalues other than d at most 6 q in absolute value. For fixed p and h \rightarrow \infty, we get the infinite family that we were looking for.

Remark 1: Other choices of T will work as well, for example T = \{u \in \mathbb{F}_q : p/3 < \mathrm{Tr}(u) < 2p/3\}.

Remark 2: This can be generalised to an odd-cycle free construction, just like Alon’s graph can be generalised. Any generalisation to even-cycle free constructions, or to K_s-free, with s \geq 4, would be very interesting!

Posted in Combinatorics, Extremal Combinatorics, Spectral Graph Theory | Tagged , , , , , | 3 Comments

Ramsey numbers from pseudorandom graphs

One of the foundational results in modern combinatorics is Ramsey’s theorem which states that for every positive integers s, t there exists a constant n_0 such that for all n \geq n_0, every 2-coloring of edges of the complete graph K_n has a monochromatic copy of either K_s or K_t. The smallest n_0 for which the statement holds true is then known as the Ramsey number R(s, t); that is, for all n \geq R(s, t) the statement above is true and there exists a 2-coloring of the edges of the complete graph on R(s, t) - 1 vertices that avoids monochromatic K_s and K_t. We know the exact values of R(s, t) only for some small values of s and t. A famous example is R(3,3) = 6, while despite decades of research we only known that 43 \leq R(5,5) \leq 48.

Determining the Ramsey numbers for larger values has been widely open since 1940’s. It has become one of the central problems of mathematics to determine the asymptotics of R(t, t) with t \rightarrow \infty and R(s, t) with s fixed and t \rightarrow \infty. An easy upper bound follows from a nice inductive prove due to Erdős and Szekeres (1947), and it shows that R(s, t) \leq \binom{s + t - 2}{s - 1}. In particular, this implies R(t,t) = O(4^t/\sqrt{t}) and R(s, t) = O(t^{s - 1}) (with the constant depending on s). The best upper bounds that we know are not that far away from these. David Conlon proved in 2016 that there exists a constant c for which

R(t, t) \leq \frac{4^t}{t^{-c \log t/ \log \log t}} .

Ajtai, Komlós and Szemerédi proved in 1980 that for every fixed s there exists a constant c_s such that

R(s, t) \leq c_s \frac{t^{s - 1}}{(\log t)^{s - 2}}

For lower bounds, Erdős proved using a probabilistic argument that

R(t, t) \geq (1 - o(1)) \frac{t}{\sqrt{2} e} \sqrt{2}^t

and the only improvement (since 1947) in that lower bound has been by a factor of 2.

In case of R(s, t), with s fixed, we know the following lower bound by Bohman and Keevash from 2010 that is again proved probabilistically by carefully analysing a natural random K_s-free process:

R(s, t) \geq c'_s \frac{t^{(s + 1)/2}}{(\log t)^{(s + 1)/2 - 1/(s - 2)}}.

Ignoring the log factors in the bottom, we have

c'_s t^{(s + 1)/2 - o(1)} \leq R(s, t) \leq c_s t^{s - 1 - o(1)} .

Any improvement in the powers of t here would be ground-breaking. What I want to talk about in this post is a recent work of Dhruv Mubayi and Jacques Verstraete that offers us a new way of possibly improving the lower bound. Briefly, they show that if the main conjecture on pseudorandom graphs that I mentioned in an earlier post is true, then R(s, t) \geq c''_s t^{s - 1 - o(1)} for some constant c''_s. Therefore, the existence of the optimal K_s-free pseudorandom graphs would give us the correct polynomial in R(s, t).

Here is how their proof goes. Firstly, observe that by taking one of the colour classes to define the edges of a graph, we need to construct an n vertex graph G that has no K_s as subgraphs and has independence number < t to show R(s, t) \geq n + 1. The main result that they prove is as follows:

Theorem 1. Let If there exists a K_s-free (n, d, \lambda)-graph, and n is large enough, then for t = \lceil 2n \log^2 n/d \rceil, we have

R(s, t) \geq c_s \frac{n}{\lambda} \log^2 n .

for some constant c_s.

The proof of Theorem 1 is not too hard if you are comfortable with the probabilistic method and you know that you should use Theorem 2.1 from Alon-Rödl. It can be summarised as: “pick each vertex uniformly at random with probability \log^2 n/(2e^2 \lambda) and in the induced subgraph remove one vertex from each independent set of size t”.

If the conjecture regarding K_s-free pseudorandom graphs is true, then for every integer s \geq 3 there exists an infinite sequence of (n, d, \lambda)-graphs with d = \Omega(n^{1-1/(2s - 3)}) and \lambda = O(\sqrt{d}). Consider such a graph G for n large enough. From t = \lceil 2n \log^2 n/d \rceil and n = O(d^{(2s - 3)/(2s - 4)}) we get d = \Omega(t^{2s - 4}/(\log t)^{2(2s - 4)}). Since \lambda = O(\sqrt{d}), from Theorem 1 we have R(s, t) \geq c_s \frac{n}{\lambda} \log^2n = c_s t \frac{d}{2 \lambda} = \Omega(t \sqrt{d}), and from the previous expression for d in terms of t we get

R(s, t) = \Omega \left(\frac{t^{s - 1}}{\log^{2s - 4} t} \right).

So, all we need to do is construct these K_s-free pseudorandom graphs. The best construction that we have so far (discussed here) has edge density \Theta(n^{-1/(s - 1))}. Sadly, the lower bound that this construction gives us on R(s, t) is c_s t^{s/2}/(\log t)^{s - 2}, which is worse than the probabilistic bound of Bohman and Keevash. In general, if we can construction K_s-free optimally pseudorandom graphs with density n^{-1/f(s)}, we will get the bound

R(s, t) \geq c_s \frac{t^{(f(s) + 1)/2}}{\log^{f(s) - 1} t}

Therefore, even a construction with density n^{-1/(s + 1)} (or in fact n^{-1/(s+\epsilon)}) would already break the Bohman-Keevash lower bound! That should be doable, right? I leave it as an open problem to the readers.

UPDATE 15th Oct 2019: Xiaoyu He has shown that the existence of these optimal K_s-free graphs for all s would also determine the multicolour Ramsey numbers R(s_1, \dots, s_k, t) with s_i fixed for all i and t \rightarrow \infty, up-to a poly-logarithmic factor.

Posted in Combinatorics, Extremal Combinatorics | Tagged , , , , , , | 4 Comments

Spectral proofs of theorems on the boolean hypercube

In a recent breakthrough Hao Huang proved the sensitivity conjecture, that had remained open for past 30 years despite some serious effort from various computer scientists and mathematicians. The proof can be described in a single tweet (if you have the basic background), and it has appeared on the ELI5 page of reddit. The main ideas behind the proof are from spectral graph theory, and one of the starting points (see for example this comment) is the following bound on the independence number of a graph due to Cvetković:

Lemma 1 (Inertia Bound). Let G be a graph on N vertices and let \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_N be its eigenvalues. Let N^+ be the number of non-negative eigenvalues of G, and N^- the number of non-positive eigenvalues (counted with multiplicity). Then \alpha(G) \leq \min \{N^+, N^-\}.

The proof of this Lemma is a nice application of a classical result in mathematics known as Cauchy’s interlacing theorem (I have talked about eigenvalue interlacing earlier in this blog), and I leave it to the reader to fill in the details*. This spectral upper bound on the independence number of a graph is relatively less known than Hoffman/Delsarte bound which says \alpha(G) \leq (-\lambda_N/(\lambda_1 - \lambda_N)) N. Since many combinatorial problems can be rephrased as finding a nice upper bound on the independence number of some graph, such spectral results can be quite useful, when one has enough structure to say something about the eigenvalues. Until recently, I didn’t see any applications of the inertia bound in extremal combinatorics, whereas Hoffman’s bound is widely used; for example, in proving Erdős-Ko-Rado theorems. Interestingly, the classical Erdős-Ko-Rado theorem can also be proved using the Lemma 1 (as I learned from Hao Huang’s talk at our combinatorics seminar).

One interesting observation made by Huang regarding Lemma 1 (which was already known) is that instead of using the eigenvalues of the adjacency matrix of the graph, we could use any pseudo-adjacency matrix and still arrive at the same conclusion. A pseudo-adjacency matrix of a graph G on N vertices is any N \times N real symmetrix matrix A with the property that A_{ij} = 0 whenever the i-th vertex is non-adjacent to the j-th vertex (in an adjacency matrix we also require A_{ij} = 1 for every adjacent pair). If N^+, N^- are the number non-negative/positive eigenvalues of A, then from the proof of Lemma 1 using interlacing we again get \alpha(G) \leq \min \{N^+, N^-\}.

The flexibility of using the pseudo-adjacency matrix is crucial to Huang’s breakthrough result. In fact, before the proof of the sensitivity conjecture, Huang (in collaboration with Oleksiy Klurman and Cosmin Pohoata) used the same method to give a spectral proof of another theorem on the hypercube, known as Kleitman’s theorem, which is what I want to discuss in this blog post. In particular, I want to give a slightly simplified proof due to Aditya Potukuchi and Amey Bhangale, that my friend Aditya shared with me a few days after Huang’s sensitivity paper appeared on arXiv as an application of his methods, without realising that a spectral proof of this theorem had already appeared in an earlier paper of Huang. Here is the theorem that we will prove:

Theorem 2 (Kleitman). Let S be a set of vectors in \{0, 1\}^n such that the hamming distance between any two vectors in S is at most d. Then

\displaystyle |S| \leq \begin{cases} \sum_{i = 0}^t \binom{n}{i}, \text{ if } d = 2t \\ 2(\sum_{i = 0}^t \binom{n - 1}{i}), \text{ if } d = 2t + 1\end{cases}.

(This can be seen as the discrete analog of the isodiametric theorems in real geometry where one wants to find the largest volume of a set of points with a given diameter.)

To use Lemma 1 above, we will naturally consider the graph G on Q_n = \{0, 1\}^n where two vectors are adjacent if the hamming distance between them is at least d + 1, because then any independent set in this graph corresponds to such a set S in the theorem. What are the eigenvalues of the graph G?

The boolean hypercube, Q_n, and the graphs defined on it using hamming distance are well studied objects. One way of computing the eigenvalues of G is to look at it as the Cayley graph Cay(\mathbb{Z}_2^n, \mathbb{Z}_2^n \setminus B_d^n), where B_d^n is the Hamming ball of radius d, that is, all vectors whose Hamming weight is at most d. From the basic theory of Cayley graphs, it then follows that G has n + 1 distinct eigenvalues \lambda_0, \lambda_1, \dots, \lambda_n with

\displaystyle \lambda_i = \sum_{k = d + 1}^n \mathcal{K}_k^n(i)

appearing with multiplicity \binom{n}{i}, where \mathcal{K}_k^n(i) is the Kravchuk polynomial evaluated at i, defined as

\displaystyle \mathcal{K}_k^n(x) = \sum_{j = 0}^k (-1)^j \binom{x}{j} \binom{n - x}{k - j}.

Note that \mathcal{K}_k^n(x) is a polynomial of degree k. For ease of notation, we will skip the superscript from now on and just write \mathcal{K}_k instead. These polynomials have several nice properties, and we will use the following symmetry (or should it be called anti-symmetry?):

\displaystyle \mathcal{K}_k(i) = (-1)^i \mathcal{K}_{n - k}(i)

which lets us see the evaluation of a degree k polynomial at i as the evaluation of a degree n - k polynomial at i, up-to a factor of (-1)^i. This fact will come in handy later.

Now if you apply Lemma 1 to G using these eigenvalues, you will realise (after some calculation) that it does not give the required bound (what bounds do you get?). That’s where pseudo-adjacency matrices come into picture. We will define a weight function w:[n] \rightarrow \mathbb{R}, such that every vector with i non-zero values is given the weight w(i), and w(i) = 0 for all i \leq d. From this weight function we will get the pseudo-adjacency matrix A of G where A_{ij} = w(k) if the Hamming distance between the i-th and the j-th vectors is equal to k. Again, from the basic theory of Cayley graphs (in particular, calculating Fourier coefficients), or more directly (Lemma 2.3 here), we get that A has n + 1 distinct eigenvalues,

\displaystyle \lambda_i = \sum_{k = d + 1}^n w(k) \mathcal{K}_k(i)

with multiplicity \binom{n}{i}, with i = 0, 1, \dots, n. Now, we just have to choose the function w cleverly so that either N^+, or N^-, is at most what we need it to be. Here is when Huang, Klurman and Pohoata make the clever choice of w(k) = \binom{\lfloor (k - 1)/2 \rfloor}{t}, for the case of d = 2t. For the odd case, d = 2t + 1, there is a different choice, but from now on we will only stick to the even case of Theorem 2. From this choice, they get the following three properties,

  • (-1)^i \lambda_i > 0, for i = 0, \dots, t
  • \lambda_{n - i} = \lambda_{i + 1}, for i = 0, \dots, t - 1
  • \lambda_{t + 1} = \lambda_{t + 2} = \dots = \lambda_{n - t} = (-1)^{t + 1},

from which the result then follows by an easy analysis. The way they prove these properties is using generating functionology, and it is not immediately clear how the choice of w(k) came about. What Aditya and Amey do instead, is start with the properties that one wants from the eigenvalues, and then show the existence of the function w, without ever computing it explicitly!

Here is how this is done. Observe that using the symmetric property of Kravchuk polynomials, we can write

\displaystyle \lambda_i = \sum_{k = 2t + 1}^n w(k) \mathcal{K}_k(i) = (-1)^i \sum_{k = 2t + 1}^n w(k) \mathcal{K}_{n - k}(i)

In the second formulation, we can see \lambda_i as (-1)^i multiplied with a linear combination of polynomials of degrees n - 2t - 1, n - 2t - 2, \dots, 1, 0, evaluated at i. These polynomials are in fact linearly independent, and hence any polynomial f(x) of degree at most n - 2t - 1 can be written as a linear combination of them. Therefore, we can choose a nice f(x), for which the eigenvalues will be \lambda_i = (-1)^i f(i), and then get the weights w(k)‘s automatically.

Pick f(x) as the unique monic polynomial whose roots are t + i + 0.5, for i = 1 to n - 2t - 1. WLOG, assume that f(0) > 0 (if not, then we can take -f). Since f is a polynomial, the sign of its values oscillates between the roots. In particular, this shows that \lambda_i = (-1)^i f(i) has the same sign for i = t + 1, \dots, n - t, that is, \lambda_i > 0 if t is odd (since f(0) and f(t+1) have the same sign), and \lambda_i < 0 if t is even, for all of these i‘s. In the former case, we can carefully count the number of negative eigenvalues with multiplicities, and in the latter case we can count the number of positive eigenvalues with multiplicities, giving us Theorem 2.

Note that as far as the signs of \lambda_i‘s are concerned, this trick of picking the polynomial f explicitly, and then getting the weights, gives us the same three properties as we had earlier where Huang, Klurman and Pohoata made the explicit choice of the weight function. For the case of Theorem 2 when d = 2t + 1, we have to do something different, and I leave it to the reader to figure that case out.

For some open questions that can potentially be solved using these methods, check out Section 5 of the Huang-Klurman-Pohoata paper.

(*) Fedor Petrov has given a “low-level” proof that avoids using the interlacing theorem (congrats to Fedya for his new blog!).

For discussions on Huang’s proof of the sensitivity conjecture, see the following blog posts: Terry Tao, Gil Kalai, Scott Aaronson, Ken Reagan.

Posted in Combinatorics, Extremal Combinatorics, Spectral Graph Theory | Tagged , , , , , , , | 5 Comments