Bilinear forms and diagonal Ramsey numbers

The recent breakthrough of Conlon and Ferber has shown us that algebraic methods can be used in combination with probabilistic methods to improve bounds on multicolour diagonal Ramsey numbers. This was already shown for the off-diagonal Ramsey numbers by Mubayi and Verstraete last year, as discussed here. Sadly, for two colours we still don’t have any improvement over the classic probabilistic bound proved by Erdős in 1947 (except for some constant factors in front of the exponential function). More concretely, we haven’t been able to prove R(t, t) > c^t for any c > \sqrt{2} since 1947, despite considerable effort by various mathematicians. Therefore, it might sound reasonable to think that R(t, t) is equal to 2^{t/2 + o(t)}, but the exponential improvement over the (purely) probabilistic lower bounds on multicolour Ramsey numbers should cast some doubt on that.

In an end remark, Conlon and Ferber mention that the lower bound of 2^{t/2 + o(t)} can be proved in a new way, using the arguments of their paper, where we have a mix of algebra and probability. This sounds like a very promising approach for further improvements as well. In this post I want to explore that, and attempt to explain why we will need some really new ideas to improve the lower bound. In the process, we will see some basic theory of bilinear forms, which is crucial to the arguments. For a more in-depth exploration of the topic, see these notes of Keith Conrad and Chapter 3 of the book by Simeon Ball.

Given a vector space V over a field F, a bilinear form is a function b: V \times V \rightarrow F such that for any fixed u, v \in V, the maps y \mapsto b(u, y) and x \mapsto b(x, v) are linear functions from V to F. This is an abstraction of the dot product, which is so crucial in understanding the geometry of the real space \mathbb{R}^n. Inspired by that, we define a vector u to be orthogonal to a vector v if b(u, v) = 0. There is a certain lack of symmetry in the definition here, namely, b(u, v) = 0 doesn’t necessarily imply b(v, u) = 0, or in other words, we could have u orthogonal to v but v non-orthogonal to u. To fix that we introduce the notion of reflexivity and call a bilinear form reflexive if b(u, v) = 0 implies b(v, u) = 0 for all u, v \in V. Once we have a reflexive bilinear form we can talk about orthogonality of two vectors. From now on, b will denote a reflexive bilinear form.

While this orthogonality defined in terms of reflexive bilinear forms might seem like a very general notion, it ends up being restricted to only two cases. A fundamental result, attributed to John von Neumann, states that for any vector space V, a reflexive bilinear form b on V is of one of the following two types:

  1. Symmetric: b(u, v) = b(v, u) for all u, v \in V.
  2. Alternating: b(u, u) = 0 for all u.

For example, the dot product on \mathbb{R}^n defined by (x_1, \dots, x_n) \cdot (y_1, \dots, y_n) = x_1y_1 + x_2y_2 + \cdots + x_ny_n is a symmetric bilinear form, whereas the function b(x, y) = x_1y_2 - x_2y_1 + x_3y_4 - x_4y_3 + \cdots + x_{n-1} y_n - x_n y_{n - 1} (assuming n to be even) is alternating. These examples are in fact examples of what we call non-degenerate bilinear forms, i.e., forms with respect to which no vector is orthogonal to every vector of the vector space.

One interesting property of alternating forms is that they are skew-symmetric. Indeed, we have b(u + v, u + v) = 0 for any u, v, and expanding the left-hand side gives b(u, u) + b(u, v) + b(v, u) + b(v, v) = b(u, v) + b(v, u) = 0, and thus b(u,v)=-b(v,u). If the characteristic of the underlying field F is 2, then this shows that every alternating form is in fact symmetric. The following lemma gives us a relation in the other direction as well.

Lemma 1: If b is a symmetric bilinear form on a vector space V over a field F of characteristic 2, then V=S\oplus W where \dim S \leq 1 and the restriction of b on W is an alternating form.

For example, if we take b(x, y)=x_1y_1 + \cdots + x_n y_n over \mathbb{F}_2^n, then we can take W = \{u \in \mathbb{F}_2^n : b(u,u) = 0\}, which is equal to the hyperplane \{u\in \mathbb{F}_2^n:\sum u_i = 0\}, since a^2=a for any a \in \mathbb{F}_2. The form restricted to W is alternating by the definition of W.

One of the basic notions associated to a bilinear form is that of an isotropic subspace, which shows up in the construction of Conlon and Ferber as well, as discussed in the previous post. An isotropic subspace is a subspace S such that b(u, v) = 0 for any two, not necessarily distinct, u, v in S (such a subspace is usually called totally isotropic, but we’ll ignore that distinction here). For an arbitrary subset A of V, if we denote the set \{u \in V : b(u, v) = 0 ~\forall v \in A\} by A^\perp, then S being isotropic is the same as saying S \subseteq S^\perp. Note that A^\perp is always a vector subspace. The following is another fundamental result whose corollary is what we used in the previous post.

Lemma 2: Let b be a non-degenerate reflexive bilinear form over a finite dimensional vector space V. Then for any subspace S, we have \dim S + \dim S^\perp = \dim V.

Corollary 3: An isotropic subspace with respect to a non-degenerate (reflexive) bilinear form over an n-dimensional vector space has dimension at most n/2.

The maximum dimension of an isotropic subspace is known as the Witt index of the bilinear form. It can be shown that every non-degenerate alternating form on an n-dimensional vector space has Witt index exactly equal to n/2 (see Theorem 3.8 in Simeon Ball’s book). This also implies that there are no non-degenerate alternating forms over odd-dimensional vector spaces.

Let us now look at the graph that plays the main role in the construction of Conlon and Ferber. Let G be the orthogonality graph on the set of isotropic vectors in \mathbb{F}_2^t with respect to the bilinear form b(x, y) = x_1y_1 + \cdots + x_t y_t, with t even. Then as we saw earlier, the set S of isotropic vectors forms a (t-1)-dimensional vector subspace on which b is an alternating form. Moreover, this alternating form is degenerate because (1, \dots, 1) is orthogonal to every other vector in S since t is even. As there is no other isotropic vector which is orthogonal to every other isotropic vector, the vector space S can be decomposed as S = \langle (1, \dots, 1) \rangle \oplus \mathbb{F}_2^{t-2} such that the form induced on \mathbb{F}_2^{t-2} is a non-degenerate alternating form. Therefore, we should be able to understand this graph starting from the non-degenerate alternating form. Let’s do that and start over.

Let t be an even integer, and b a non-degenerate alternating form on \mathbb{F}_2^{t-2}. Let G be the orthogonality graph on these vectors, that is, u \sim v if b(u, v) = 0. We can notice the following property.

Lemma 4: There are no independent sets of size t in G.
Proof. Let x_1, \dots, x_t be an independent set, and say \sum a_i x_i = 0. Then by applying b on this sum and each x_j (using that b(x_i, x_j) = 1 for i \neq j and b(x_j, x_j) = 0) we get that (a_1, \dots, a_t) is in the null space of J - I. But J - I has eigenvalues t-1 and -1, which are both non-zero (since t is even), and thus J - I is non-singular. Therefore, a_i = 0 for all i, and hence these vectors are linearly independent. This is a contradiction since the dimension of the vector space is strictly less than t. \square
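As a quick sanity check, here is a brute-force verification of Lemma 4 for t = 6, so that the ambient space is \mathbb{F}_2^{4}. The particular alternating form used below is my own choice (any non-degenerate one would do), and the snippet is only meant as an illustration.

from itertools import product, combinations

t = 6
vectors = list(product([0, 1], repeat=t - 2))

# a non-degenerate alternating form on F_2^4: b(x, y) = x_1 y_2 + x_2 y_1 + x_3 y_4 + x_4 y_3
def b(x, y):
    return (x[0]*y[1] + x[1]*y[0] + x[2]*y[3] + x[3]*y[2]) % 2

# u ~ v in the orthogonality graph G iff b(u, v) = 0, so an independent set of size t
# would be t vectors that are pairwise non-orthogonal
found = any(all(b(u, v) != 0 for u, v in combinations(S, 2))
            for S in combinations(vectors, t))
print("independent set of size t found:", found)   # prints False, as predicted by Lemma 4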

While there can’t be any large independent sets, G certainly has really large cliques. Any (totally) isotropic subspace of dimension (t - 2)/2 gives rise to a clique of size 2^{t/2 - 1}, and in fact these are the largest possible cliques. So, trivially G does not give any good lower bound on the Ramsey number R(t, t). Still, we can apply the basic idea of Conlon and Ferber (which was also the main idea of Mubayi and Verstraete) and take a random induced subgraph of G, hoping that it can destroy all large cliques while giving us a graph of reasonable size. (Note that any induced subgraph will also be free of any independent sets of size t.) For that, let’s first count the total number of cliques of size k in G.

Every clique in G spans an isotropic subspace of \mathbb{F}_2^{t-2}, which has a certain rank r. The number N_{k,r} of cliques of size k and rank r can be bounded from above by 2^{t-2} \cdot 2^{t - 3} \cdots 2^{t - 2 - (r - 1)} \cdot 2^{(k - r)r}, where the first r terms correspond to choosing r linearly independent members of the clique (each new member must be orthogonal to the previously chosen ones, which cuts the dimension by one at every step) and then we pick the remaining k - r elements from this r-dimensional subspace. By simplifying, we have

N_{k, r} \leq 2^{(t+ k - 3(r+1)/2)r}.

This number is increasing in r from r = 1 until r = (t + k)/3 - 1, and then decreases till r = 2(t + k)/3 -1, where it finally becomes 1 (unless 2(t + k)/3 - 1 > t - 1, which is a trivial case that we’ll ignore). But we know that r \leq t/2 - 1 from Corollary 3 (since the ambient space has dimension t - 2), and hence \max_{r} N_{k, r} = N_{k, t/2 - 1} for (t + k)/3 - 1 \geq t/2 - 1, i.e., for k \geq t/2, and \max_{r} N_{k, r} = N_{k, (t + k)/3 - 1} for k \leq t/2. While we are interested in N_{t} = \sum_{r = 1}^{t/2 - 1} N_{t,r}, we will soon need the estimates for smaller cliques as well.

Repeating the probabilistic argument of Conlon and Ferber, we see that if we sample each vertex independently with probability p = n2^{-t + O(1)}, then there exists a subset of vertices of size n inducing no clique of size t as long as we can ensure that n^{t} 2^{-t^2 + o(t^2)} N_t < 1/2. Since N_t \leq \frac{t}{2} \max_{r} N_{t,r} = \frac{t}{2} N_{t, t/2 - 1} \leq 2^{5t^2/8 + o(t^2)}, we can take n = 2^{3t/8 + o(t)}, which then gives us the lower bound R(t, t) > 2^{3t/8 + o(t)}. This is worse than the purely probabilistic bound of 2^{t/2}. Can we do anything better?
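For the record, here is the exponent bookkeeping behind that choice of n (keeping only the main terms, with all o(t^2) corrections in the exponents suppressed):

n^t \cdot 2^{-t^2} \cdot 2^{5t^2/8} < 1 \iff t \log_2 n < \frac{3t^2}{8} \iff n < 2^{3t/8},

which is where the exponent 3t/8 comes from.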

Here’s an idea that is extracted from the paper of Conlon and Ferber. We can look at cliques, and independent sets, of size smaller than t in this graph. Perhaps for some 0 < a < 1, we can get a good lower bound on R(at, at) by taking a random induced subgraph of G that does not contain any cliques or independent sets of size at.

For that, let’s count the number I_k of independent sets of size k in G. For k even, the proof of Lemma 4 tells us that any such set will also be a linearly independent set of vectors. Therefore, I_k is at most \prod_{i = 0}^{k-1} 2^{t - i - 2} or 2 \prod_{i = 0}^{k-2} 2^{t - i - 2}, depending on whether k is even or odd (roughly speaking, each successive vector has to satisfy one more linear condition coming from the previously chosen vectors, which halves the number of choices at each step). In any case, for k = at, we have I_{at} \leq 2^{(2a - a^2)t^2/2 + o(t^2)}.

Recall that we had N_{at} \leq 2^{(1+a)^2 t^2/6 + o(t^2)} for a \leq 1/2 and N_{at} \leq 2^{(4a+1)t^2/8 + o(t^2)} for a \geq 1/2. Therefore, we see that asymptotically N_{at} is always bigger than I_{at}. Thus, the probabilistic argument from before gives us a subset of n vertices on which the induced graph has no cliques of size at and no independent sets of size at as long as n^{at} 2^{-at^2 + o(t^2)} (N_{at} + I_{at}) < 1/2, and since N_{at} \geq I_{at} (for t large enough), it suffices to pick an n for which n^{at} 2^{-at^2 + o(t^2)} N_{at} < 1/2. We can achieve this by taking n = 2^{cat + o(t)} where c = (4a - 1)/(8a^2) for a > 1/2 and c = (4a - a^2 - 1)/(6a^2) for a \leq 1/2.

Sadly, both of these values of c are \leq 1/2, with equality only at a = 1/2. Therefore, we are always worse than the basic probabilistic bound except for the case of a = 1/2 where we can show R(t/2, t/2) > 2^{t/4 + o(t)}. This asymptotically matches the probabilistic bound but we arrive at it via a different (and much more complicated) construction. For improvements, using this easy method, it looks like we’ll have to consider some other graphs. This new approach definitely gives me hope that one can find graphs with a better control on the total number of cliques and independent sets, which can then be used to give an exponential improvement in the lower bound on two-colour diagonal Ramsey numbers.
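Here is a two-line numerical check of this claim, simply evaluating the two formulas for c given above:

for a in [0.1, 0.25, 0.4, 0.5, 0.6, 0.75, 0.9, 1.0]:
    c = (4*a - a*a - 1)/(6*a*a) if a <= 0.5 else (4*a - 1)/(8*a*a)
    print(f"a = {a:.2f}   c = {c:+.4f}")   # the maximum value 0.5 is attained only at a = 0.5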


Covering the binary hypercube

A finite grid is a set S = S_1 \times \cdots \times S_n \subseteq F^n, where F is a field and each S_i is a finite subset of F. The minimum number of hyperplanes required to cover S can easily be shown to be \min_i |S_i|, with the hyperplanes defined by x_j - \lambda = 0, for \lambda \in S_j, providing such a cover (assuming j is the index minimising |S_i|). Interestingly, if we forbid the hyperplanes to pass through one specific point of S, say \vec{a} = (a_1, \dots, a_n), and then want to cover everything else, then the minimum number of hyperplanes required is equal to \sum_{i = 1}^n (|S_i| - 1). This number can be pretty large compared to the previous one.
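To make this concrete, here is a tiny brute-force check of the \sum_{i = 1}^n (|S_i| - 1) bound in the special case F = \mathbb{F}_2 and S = \{0, 1\}^3, where the answer should be 3. The search below is only an illustration for this one small case.

from itertools import product, combinations

n = 3
points = [p for p in product([0, 1], repeat=n) if any(p)]          # S minus the origin
hyperplanes = [a for a in product([0, 1], repeat=n) if any(a)]     # a.x = 1 misses the origin
on = lambda a, p: sum(ai*pi for ai, pi in zip(a, p)) % 2 == 1

def min_cover():
    for size in range(1, len(hyperplanes) + 1):
        if any(all(any(on(a, p) for a in H) for p in points)
               for H in combinations(hyperplanes, size)):
            return size

print(min_cover())   # prints 3 = (2 - 1) + (2 - 1) + (2 - 1)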

This is the famous result of Alon and Füredi from 1993 which was proved in its full generality using the polynomial method. They were answering a question of Komjáth, which was for the special case of S = \{0, 1\}^n, but this same result had in fact been proved already for another special case in the late 70’s. Jamison, and Brouwer-Schrijver, proved this bound for F = \mathbb{F}_q and S = \mathbb{F}_q^n. They were motivated by a question in finite geometry about the blocking number of an affine space, asked by Jean Doyen, which is equivalent to the dual statement. Their proofs were also algebraic and in fact they are one of the first instances of what we now call the polynomial method (see this for a history). In finite geometry circles, this method was sometimes even referred to as the Jamison method.

In the 90’s, Bruen proved an interesting generalisation of the Jamison/Brouwer-Schrijver result, motivated by problems in finite geometry, where the (multi)set of hyperplanes covered every point except \vec{a} \in \mathbb{F}_q^n at least k times, while not covering \vec{a} at all. He proved the lower bound of (n + k - 1)(q-1) and asked about the sharpness of this bound. The sharpness was addressed in some later papers, for example by Simeon Ball and then Corrado Zanella. In 2009, Ball and Serra generalised this covering with multiplicities result to the case of arbitrary finite grids by proving a general lower bound on the degree of polynomials that vanish everywhere on the grid with a certain multiplicity except at one point. Recently, there has been a lot of interesting activity in this hyperplane covering problem with multiplicities, for the special case of binary hypercubes. This is what I want to focus on in this blog post. I also want to mention some interesting new results over the binary field \mathbb{F}_2 that I have proved in collaboration with Simona, Shagnik and Tamás, which nicely complements this recent work.

Let’s first write down the main extremal functions that we are interested in. Given a field F and positive integers n, k, s, with s \leq k - 1, define a (k, s)-cover of \{0, 1\}^n \subseteq F^n as a multiset \mathcal{H} of hyperplanes with the property that there are exactly s elements of \mathcal{H} passing through the origin and for every point \vec{p} in \{0, 1\}^n \setminus \{\vec{0}\} there are at least k elements in \mathcal{H} passing through \vec{p}. We then define:

h(n, k, s) = \min \{|\mathcal{H}| : \mathcal{H} \text{ is a } (k, s)\text{-cover of } \{0, 1\}^n\};

g(n, k) = \min_{s} h(n, k, s);

f(n, k) = h(n, k, 0).

Note that g(n, k) has the following natural reformulation: it is the smallest number of hyperplanes that cover every non-origin point at least k times but cover the origin at most k - 1 times. All of these functions are the same for k = 1 and equal to n by the result of Alon and Füredi. It is also an easy exercise to show that f(n, 2) = g(n, 2) = n + 1 for any field F. I have hidden the field F in the notation of these functions, and so I’ll make it clear what field we are talking about before making any statement.

Last year, Clifton and Huang studied the function f(n, k) when F = \mathbb{R} and proved the following results:

f(n, 3) = n + 3,

n + k + 1 \leq f(n, k) \leq n + \binom{k}{2}, for all k \geq 4 and n \geq 3, and

f(n, k) = (1 + 1/2 + \cdots + 1/n + o(1))k for fixed n and k \rightarrow \infty.

Moreover, they conjectured that we should have f(n, k) = n + \binom{k}{2} for every fixed k \geq 3 and n large enough, as they couldn’t find any better construction than the one where you take the n hyperplanes defined by x_i = 1 and k - t copies of the hyperplane \sum_{i = 1}^n x_i = t, for each 1 \leq t \leq k - 1. This seems to be a very challenging problem.
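For what it’s worth, here is a short sketch checking that this construction really is a (k, 0)-cover in the sense defined above; the parameters n = 5, k = 4 are arbitrary.

from itertools import product

n, k = 5, 4

def multiplicity(p):
    # a point of weight w lies on x_i = 1 exactly w times, and on the k - w copies
    # of the hyperplane sum(x) = w whenever 1 <= w <= k - 1
    w = sum(p)
    return w + (k - w if 1 <= w <= k - 1 else 0)

print(multiplicity((0,)*n))                                                    # 0: the origin is missed
print(all(multiplicity(p) >= k for p in product([0, 1], repeat=n) if any(p)))  # True: a (k, 0)-cover
print(n + k*(k - 1)//2)                                                        # number of hyperplanes used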

Very recently, Sauermann and Wigderson have managed to improve the Clifton-Huang lower bounds on f by proving a general result about polynomials vanishing with multiplicities on binary hypercubes. In particular, they prove the following results:

  1. h(n, k, k - 1) = n + 2k - 2 for any field F.
  2. g(n, k) = n + 2k - 3 for all n \geq 2k - 3 if we have \mathrm{char}~F = 0.

Note that since f(n, k) \geq g(n, k) by the definition of these functions, the second statement improves the lower bound of Clifton and Huang for n \geq 2k - 3 (as the field \mathbb{R} has characteristic 0). They also ask about extending their results to fields of positive characteristic and give some examples over finite fields that show how their proof fails to give the same bound on g(n, k) (or f(n, k)). In particular, over \mathbb{F}_2 they show that one can’t prove that f(n, 4) \geq n + 5 using their method.

Last year, while I was still in Berlin, I suggested studying these problems in our group workshop, and together with my colleagues we managed to obtain some interesting preliminary results for the case of \mathbb{F}_2. We now have a paper, which is soon going to be posted on the arXiv, where we prove the following results for F = \mathbb{F}_2.

Theorem 1. For n > 2^k, g(n, k) = n + 2k - 3. For k \geq 2^{n - 2}, g(n, k) = 2k - \lfloor k/2^{n-1} \rfloor.

Our proofs don’t involve any polynomials; we only use combinatorial arguments. One crucial ingredient is a coding theoretic interpretation of the function f(n, k), which implies the following:

Theorem 2. For all k \geq 2 and n \geq 1, f(n, k) \geq n + \lfloor (k -1)/2 \rfloor \log (2n/(k-1)).

(One can also show the upper bound of f(n, k) \leq n + (k - 1) \log(n/(k-1)).)

Another important ingredient is the following statement, which we prove via induction, but it in fact follows from the work of Sauermann and Wigderson as well:

Lemma 3. For n, k \geq 1, we have h(n, k, k - 1) = n + 2k - 2.

Moreover, because our methods are combinatorial, we can generalise to covers using lower-dimensional subspaces, which is something that’s not at all easy to do with the polynomial method. If we denote by g(n, k, d) the function where the cover consists of (n - d)-dimensional affine subspaces of \mathbb{F}_2^n, then we show the following:

Theorem 4. For n > 2^{k2^d -d - k + 1}, g(n, k, d) = k2^d + n - d - 2. For k \geq 2^{n - d - 1}, g(n, k, d) = k2^d - \lfloor k/2^{n-d} \rfloor.

This result in fact generalises the binary case of Jamison’s theorem, which says that for F = \mathbb{F}_q we have g(n, 1, d) = q^d - 1 + (n - d)(q - 1) (though there is no condition on n being large enough in Jamison’s theorem).

Sadly, our methods don’t give us anything new for the case of F = \mathbb{F}_q when q > 2. I believe that for larger fields one needs to improve upon the existing algebraic methods. Also, over \mathbb{F}_2, while we know that the threshold k \geq 2^{n - 2} in Theorem 1 is tight, there is a lot of room for improvement in the condition n > 2^k. We suspect that except for some small exceptions (that arise due to the Golay code), the correct threshold should be k + 1. More specifically, we believe that g(n, k) = n + 2k - 3 for n \geq \max\{k+1,13\}. We do show that g(k, k) \leq 3k - 4 for all k, and thus this threshold of k + 1, if true, would be tight.

I would love to see a proof (or a counterexample) of the following simpler statement:

g(n, k) = n + 2k - 3 for all n \geq poly(k).

It will also be very interesting to determine what happens between these extreme ranges of parameters, for example when n \sim k^{c} for some constant c < 1. We leave these as open problems, and we look forward to new developments on this topic which can help solve them.


Improved lower bounds for multicolour diagonal Ramsey numbers

Big news in combinatorics today: David Conlon and Asaf Ferber have posted a 4-page preprint on arXiv that gives exponential improvements in the lower bounds on multicolour diagonal Ramsey numbers, when the number of colours is at least 3 (also see the post by Gil Kalai). The best known lower bounds so far were completely probabilistic, combined with an inductive argument of Lefmann that gave better bounds for \geq 4 colours. The new construction on the other hand has a deterministic part to it, which in fact uses finite geometry. In this post I’ll describe their construction, and also show how to improve their bounds further using a simple observation.

The multicolour Ramsey number R(t; \ell) is the minimum n such that any colouring of the edges of K_n using \ell colours gives rise to a monochromatic K_t, which is known to exist by Ramsey’s theorem. Ignoring the lower order terms (which is what the big recent breakthrough of Ashwin Sah improved for \ell = 2), the best upper bound we have for all \ell \geq 3 is R(t; \ell) \leq \ell^{\ell t}, and improving the base of the exponent from \ell^\ell to anything smaller seems out of reach by our current methods. As far as the lower bounds are concerned, the best lower bounds for \ell = 3, 4 are R(t; 3) \geq (\sqrt{3})^t and R(t;4) \geq 2^t, as shown by Erdős in the 1940’s. For larger \ell, one can improve the lower bound of R(t;\ell) \geq \sqrt{\ell}^t by using the following observation of Lefmann:

R(t; \ell_1 + \ell_2) - 1 \geq (R(t;\ell_1) - 1)(R(t;\ell_2) - 1).
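To see how this beats the purely probabilistic bound, here is one small worked instance, just combining the bounds already mentioned above:

R(t; 5) - 1 \geq (R(t; 2) - 1)(R(t; 3) - 1) \geq (2^{t/2} - 1)((\sqrt{3})^t - 1) = 6^{t/2 + o(t)},

which is exponentially better than the bound \sqrt{5}^t = 5^{t/2}.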

Conlon and Ferber improve all of these bounds as a corollary to the following result:

Theorem 1. For any prime p, R(t; p + 1) > 2^{t/2} p^{t/3 + o(t)}.

This, along with Lefmann’s observation, gives an exponential improvement of the known lower bound on R(t;\ell) for all \ell \geq 3.

As the appearance of the prime p in this result might suggest, it uses finite fields. In fact, it uses the basic theory of polar spaces/quadratic forms over finite fields. By giving a more precise estimate of a certain count in their proof, I can show the following:

Theorem A. For any prime p, R(t; p + 1) > 2^{t/2} p^{3t/8 + o(t)}.

Note that this is an exponential improvement. Let’s see how we can prove this.

Let p be a prime, and let V be the set of all isotropic vectors in \mathbb{F}_p^t with respect to the symmetric bilinear form B(x, y) = x_1 y_1 + \cdots + x_t y_t. In other words, V = \{(x_1, \dots, x_t) \in \mathbb{F}_p^t : x_1^2 + \cdots + x_t^2 = 0\}. It is well known, and in fact not so hard to show, that |V| \sim p^{t - 1}. Though all we need for the bounds is that |V| = p^{t - O(1)}, which is what Conlon and Ferber show in their paper (more precisely, they show |V| \geq p^{t-2}). Consider the following edge colouring of the complete graph on V: the edge \{x, y\} is coloured with i \in \{1, \dots, p - 1\} if B(x, y) \equiv i \pmod{p}, and if B(x, y) = 0 then it is coloured from the set \{p, p + 1\} uniformly at random and independently for all such edges. Thus we have a (p + 1)-colouring of the complete graph on |V| vertices, where the colours are just the integers 1, \dots, p + 1.
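Before going further, here is a toy version of this colouring for very small parameters. It is only an illustration of the definition above (in particular, excluding the zero vector and the choice p = 3, t = 4 are mine), not of any computation from the paper.

import random
from itertools import product, combinations

p, t = 3, 4
B = lambda x, y: sum(xi*yi for xi, yi in zip(x, y)) % p

# the vertex set: (nonzero) isotropic vectors in F_p^t
V = [x for x in product(range(p), repeat=t) if any(x) and B(x, x) == 0]

def colour(x, y):
    i = B(x, y)
    # colour i in {1, ..., p-1} if B(x, y) = i != 0, otherwise a random colour from {p, p+1}
    return i if i != 0 else random.choice([p, p + 1])

colouring = {frozenset({x, y}): colour(x, y) for x, y in combinations(V, 2)}
print(len(V), "vertices,", len(colouring), "edges, colours used:", sorted(set(colouring.values())))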

In each colour 1 \leq i \leq p - 1 we can show algebraically that the graph on these edges has no cliques of size larger than t, assuming t is not divisible by p, which we will assume (for the exact argument see the paper of Conlon and Ferber). Therefore, we are aiming for a lower bound on R(t+1; \ell), but asymptotically it would be the same for R(t; \ell). Now with the random colouring in the last two colours, Conlon and Ferber show that for n = 2^{t/2}p^{t/3 + o(t)}, there exists a subset of V of size n such that the complete graph on these vertices, with the same edge-colouring as before, has no monochromatic clique of size t in colours p or p + 1. This is done by choosing a random subset where each vertex is chosen with a certain probability which is a function of n. Therefore, we have a random colouring of certain pairs of vectors, and a random subset of all the isotropic vectors.

A crucial step in the estimation used in the proof, that leads to this particular choice of n, is an upper bound on the number of cliques of size t in the graph defined on V where two vectors are adjacent if they are orthogonal to each other, that is, x \sim y if B(x, y) = 0. Note that in our edge colouring, the edges of this graph were the ones that got colours from \{p, p + 1\}. Let’s denote this number by N_t. Each such clique is a collection x_1, \dots, x_t of vectors in \mathbb{F}_p^t such that B(x_i, x_j) = 0 for all 1 \leq i \leq j \leq t. Conlon and Ferber estimate the number N_{t, r} of such cliques that have rank r, that is, the dimension of the space spanned by x_1, \dots, x_t is r, by N_{t, r} \leq p^{2tr - 3r^2/2 + r/2}, and then just use the upper bound N_t = \sum_{r = 1}^{t} N_{t, r} \leq t \max_{r = 1}^t N_{t, r}.

Until this point it’s all fine, but what we can improve upon is the argument that follows. They say that N_{t, r} is maximized for r = 2t/3 + 1/6 (which is not at all hard to see) and thus conclude that N_t \leq p^{2t^2/3 + o(t^2)}. But they miss a crucial linear algebraic fact at this point. The rank of a clique can never be larger than t/2. This is an easy fact that follows from (i) \dim S + \dim S^\perp = t for any subspace S, and (ii) S \subseteq S^\perp for a totally isotropic subspace, which is exactly what the span of the vectors forming the clique is going to be. In terms of the theory of polar spaces (or quadratic forms), it says that the rank of a polar space is always at most half of the dimension of the underlying vector space (something all the finite geometers must be familiar with). From this we see that N_t \leq t N_{t, t/2} \leq p^{5t^2/8 + o(t^2)}. After this, if you do the calculation of Conlon-Ferber then you’ll see that we can take n = 2^{t/2}p^{3t/8 + o(t)}, thus proving Theorem A. \square
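For completeness, here is roughly how the exponent bookkeeping goes; this is my reading of the Conlon–Ferber deletion argument, with all o(t^2) terms in the exponents suppressed. Sampling each vertex of V independently with probability n/|V| = np^{-t + o(t)}, the expected number of monochromatic copies of K_t in each of the colours p and p + 1 is at most

n^t p^{-t^2 + o(t^2)} \cdot N_t \cdot 2^{-\binom{t}{2}} \leq n^t p^{-3t^2/8 + o(t^2)} 2^{-t^2/2 + o(t^2)},

which stays below 1 for n = 2^{t/2}p^{3t/8 + o(t)}. Plugging in the weaker bound N_t \leq p^{2t^2/3 + o(t^2)} instead recovers the original choice n = 2^{t/2}p^{t/3 + o(t)}.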

Remark: If you are familiar with the theory of polar spaces, then you can easily estimate this number N_t, as each such clique must be contained in a generator (maximal totally isotropic subspaces) and you can count the total number of generators (see for example Theorem 4.11 in the book by Simeon Ball). Depending on the type of the polar space, you can even get an exact count but here we are just interested in the asymptotic results.

What’s next?

I think it’s really exciting that finite geometry, or in fact anything not completely probabilistic, can be used in the case of diagonal Ramsey numbers to improve the known bounds. This was recently shown to be true for the two-colour off-diagonal case by Mubayi and Verstraete, which was extended to multicolours by He and Wigderson, as I describe here and here. Therefore, finite geometry is playing a crucial role in developing Ramsey theory. I think a deeper understanding of these constructions, and new ideas from finite geometry (perhaps group theoretic ones) can lead to even further progress.

Update 28/09/20: Yuval Wigderson has managed to improve the bound further for \ell \geq 4 colours (see this).


The dual version of Ryser’s conjecture

I talked about our new results related to Ryser’s conjecture in a previous post (also see an even earlier post). The conjecture, and its variants, have some interesting equivalent formulations in terms of edge colourings of graphs. While I was vaguely aware of that, I never looked into it in detail. Thanks to the new paper of DeBiasio, Kamel, McCourt and Sheats, I have now decided to study it properly (and share what I understand). In this post I would like to explain and prove this so-called duality.

Consider the following two statements (and try to prove them without reading forward):

(1) Given n points in \mathbb{R}^2, let k be the maximum number of points that you can choose from these such that no two points share an x or a y coordinate. Then these n points can be covered by k axis parallel lines, i.e., horizontal or vertical lines.

(2) Let G be a graph of independence number \alpha whose edges are coloured red or blue. Then the vertices of G can be covered by \alpha monochromatic sub-trees (different trees can have different colours but all the edges of one tree must have the same colour).


While both of these statements do have the common feature of being covering problems, it is not immediately clear that they are in fact both implied by the classical theorem of Kőnig. Ryser’s conjecture is a generalisation of this, which in terms of (1) can be seen as a generalisation to \mathbb{R}^n and in terms of (2) can be seen as a generalisation to r colours. We will focus on (2) here, as the connection to (1) is sort of immediate and I don’t know if the geometric point of view has any clear advantage.

Just as a recap, here is what Ryser conjectured. Given an r-partite r-uniform hypergraph H, we have \tau(H) \leq (r - 1) \nu(H), where \tau is the vertex cover number of a hypergraph and \nu is the matching number.

For a graph G, let tc_r(G) denote the smallest number of monochromatic connected subgraphs whose vertices cover V(G), over all r-edge colourings of G. Here a monochromatic connected subgraph can also be a single vertex. You can in fact take these subgraphs to be the connected components in the graph G_i formed by taking the edges coloured by the i-th colour. If the graph G_i is disconnected, then you get single vertex monochromatic subgraphs. Gyárfás observed that Ryser’s conjecture is equivalent to tc_r(G) \leq (r - 1) \alpha(G), for all graphs G. We will soon see why this is so.

Firstly, to get some familiarity with tc_r(G), observe that the well known fact that for any graph G, either G is a connected graph or its complement G^c is a connected graph, can be stated as tc_2(K_n) \leq 1 for all n. This is because we can look at the edges of G and G^c as giving us a 2-edge colouring of K_n, where n = |V(G)|. Also note that tc_r(G) \leq n_i, where n_i is the number of connected components in the graph G_i, given an r-edge colouring of G. The notation tc stands for tree cover, which comes from the fact that any connected graph has a spanning tree, and thus instead of covering with monochromatic connected subgraphs we can equivalently cover with monochromatic trees.

Let’s prove that these two conjectures are equivalent. So, say Ryser’s conjecture is true, and let G be a graph with each of its edges given a colour from \{1, \dots, r\}. Denote the subgraph formed by edges whose colour is i by G_i. Construct a hypergraph H as follows. The vertices of H are the connected components of G_i, for i = 1, \dots, r. These vertices are clearly partitioned into r sets, corresponding to the colour i. For each vertex v of G, define a hyperedge of H as the set of all monochromatic connected components that contain v. Each hyperedge has size r since for each i the vertex v is contained in a unique component of G_i. Thus, we have an r-partite r-uniform hypergraph H. See the images below for some concrete examples.
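Here is a small sketch of this translation in code, on a made-up 2-coloured graph (a 4-cycle with two colours); it is only meant to make the construction above concrete.

from collections import defaultdict

vertices = [0, 1, 2, 3]
colour_of = {(0, 1): 1, (1, 2): 1, (2, 3): 2, (0, 3): 2}   # the edges of G with their colours
r = 2

def components(i):
    # connected components of G_i (isolated vertices count as components)
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v
    for (u, w), c in colour_of.items():
        if c == i:
            parent[find(u)] = find(w)
    comps = defaultdict(set)
    for v in vertices:
        comps[find(v)].add(v)
    return list(comps.values())

comps = {i: components(i) for i in range(1, r + 1)}
# one hyperedge of H for each vertex v of G: the r monochromatic components containing v
hyperedges = [[(i, tuple(sorted(C))) for i in range(1, r + 1) for C in comps[i] if v in C]
              for v in vertices]
for e in hyperedges:
    print(e)   # each hyperedge has exactly one vertex of H in each of the r parts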

Now a matching in H corresponds to a set of vertices in G such that no two of these vertices are together in any monochromatic connected component. In particular, there is no edge between any two of these vertices. Therefore, \nu(H) \leq \alpha(G). A vertex cover of H corresponds to a collection of monochromatic components that cover all the vertices of G (as they correspond to the hyperedges of H). Therefore, tc_r(G) \leq \tau(H). Since we assumed that Ryser’s conjecture is true, we also have \tau(H) \leq (r - 1) \nu(H). Combining all of these inequalities, we get tc_r(G) \leq (r - 1) \alpha(G).

Now assume that the conjecture of Gyárfás is true, and let H be an r-partite r-uniform hypergraph. Construct a graph G, where each hyperedge is a vertex of G and two vertices are adjacent if the corresponding hyperedges intersect non-trivially. We colour an edge uv of G by the colour i if the hyperedges corresponding to u and v meet each other in the i-th part of the r-partition. If they meet in more than one part, then pick a colour arbitrarily. Again, for concreteness, we can see that in the first image above starting from the hypergraph on the right hand side we reach the graph on the left hand side. In the second case, however, we get the following:

It follows directly from the definition that independent sets in the graph are in bijective correspondence with matchings in the hypergraph, and thus \alpha(G) = \nu(H). A monochromatic connected component in G, which is in fact a clique, corresponds to a vertex of the hypergraph that is contained in all of the corresponding hyperedges. Therefore, a cover of the vertices of the graph using monochromatic connected components gives rise to a vertex cover of H of the same size, which implies that \tau(H) \leq tc_r(G). Since tc_r(G) \leq (r - 1) \alpha(G) = (r - 1)\nu(H), we get \tau(H) \leq (r - 1) \nu(H), thus proving the equivalence of these conjectures.

This duality can be applied to other variants of Ryser as well. For example, in my work with Shagnik, Patrick and Tibor, we studied the min vertex cover size for the case when the hypergraph is t-intersecting, i.e., any two hyperedges share at least t vertices. This can be seen as determining min tc_r(K_n) with the added condition that in the edge colouring any two vertices are contained in at least t different monochromatic components. Another variant that we studied was that of k-wise t-intersecting hypergraphs. This can be seen as determining the tree cover number of the k-uniform complete hypergraphs, where every set of k vertices are contained in at least t different monochromatic components. In this formulation Király had already solved the problem for k \geq 3, t =1, which we were unaware of, and our work resolves the general case of arbitrary t \geq 1. This complete knowledge of the extremal function for k \geq 3, and very limited knowledge for k = 2 is quite interesting.

One particular advantage of the edge colouring formulation is that we have a richer structure there that can be exploited to study interesting variants that have no natural formulation in terms of hypergraphs. For this, and more, check out the paper of DeBiasio, Kamel, McCourt and Sheats. On the other hand, the hypergraph formulation somehow feels more natural, and in fact many proofs (for example the r=3 case of Ryser) are best stated in terms of hypergraphs. So in conclusion, it’s a good idea to be aware of both.


Heisenberg groups, irreducible cubics and minimal Ramsey

As I mentioned in a previous post, we recently improved the upper bound on a Ramsey parameter, in collaboration with John Bamberg and Thomas Lesgourgues. My favourite thing about this work is how it ends up using the properties of certain Heisenberg groups associated to generalized quadrangles, and cubic equations, to say something interesting about a problem in graph Ramsey theory. It makes me really happy to see such harmony in mathematics. In this post I will attempt to describe the main construction in our paper without getting into too many details.

As discussed earlier, we want to determine the smallest possible minimum degree of a graph G that has the property that every r-colouring of the edges of G contains a monochromatic clique of size k, and no proper subgraph of G has this property. This smallest degree is denoted by s_r(K_k), and the best known upper bounds on it in various ranges are described in the earlier post. All of these bounds use the equality between s_r(K_{k+1}) and the following packing parameter:

P_r(k) = the minimum n for which there exist K_{k+1}-free pairwise edge disjoint graphs G_1, \dots, G_r on a common vertex set V of size n such that for any r-colouring of V there exists a G_i that contains a K_k all of whose vertices are coloured in the i-th colour.

It is known that s_2(K_{k+1}) = P_2(k) = k^2, as proved by Burr, Erdős and Lovász in 1976, but for r > 2 our knowledge of this function is quite limited. (As an exercise, try to find such edge disjoint G_1 and G_2 on a common set of k^2 vertices.)

Since we are just looking at upper bounds for now, one can show P_r(k) \leq n by finding a packing of G_i‘s where each G_i is K_{k+1}-free but every collection of n/r vertices in G_i contains a copy of K_k. This works because in any r-colouring of V there must exist a colour i which occurs at least n/r times, and then in the graph G_i we have a K_k all of whose vertices are coloured in the i-th colour. This has been the main approach so far in showing upper bounds on s_r(K_{k+1}) for r \geq 3, k \geq 2, and we will be doing the same.

Before finding such a collection of edge disjoint G_i‘s, the first step is to see if we can find even a single graph G on n vertices which is K_{k+1}-free and has the property that any set of n/r vertices in G contains a K_k. For k = 2, we are looking for triangle-free graphs with independence number < n/r. This should remind you of the Ramsey number R(3, t). In fact, such a G on n vertices exists if and only if n < R(3, \lceil n/r \rceil). This connection can in fact be used to prove that s_r(K_3) = \Theta(r^2 \log r) (see Guo and Warnke) by finding not just one triangle-free graph but a packing of triangle-free graphs.

More generally the ideas from the determination of the so-called Erdős-Rogers function were used to prove that for any fixed k, and r large enough, P_r(k) \leq C (\ln r)^{8k^2} r^2 (see Fox, Grinshpun, Liebenau, Person and Szabó). This is pretty satisfactory as far as the realm of k fixed and r \rightarrow \infty is concerned, but in other ranges this is a terrible bound compared to the more general result of FGLPS that shows P_r(k) \leq 8k^6 r^3 for all k \geq 2, r \geq 3. Proofs of all these results rely on finite geometric ideas, combined with the probabilistic method, and that’s exactly the path we take as well.

Here is a general idea to construct these graphs. Consider a partial linear space (a.k.a. a linear hypergraph) that is “triangle-free” in the sense that no three lines pairwise meet each other in three distinct points. Then because of this triangle-freeness the only cliques (of size at least 3) in the graph defined on the point-set with two points adjacent if they are collinear, are those contained inside a line. Therefore, we can try to destroy all of the cliques of size \geq k + 1, while maintaining some structure so that any set of n/r points contains a K_k. This idea, along with a probabilistic argument, can be used to prove the following lemma (which is Lemma 3.1 in our paper) about packings of partial linear spaces:

Lemma 1. Say there exist triangle-free partial linear spaces (P, \mathcal{L}_1), \dots, (P, \mathcal{L}_r), each of order (s, t), such that (P, \cup_{i = 1}^r \mathcal{L}_i) is also a partial linear space. If s \geq 2rk \ln k and t \geq 2k(1 + \ln r), then P_r(k) \leq |P|.

The order (s, t) denotes that every line has s + 1 points on it and every point has t + 1 lines through it. The condition that (P, \cup_{i = 1}^r \mathcal{L}_i) is a partial linear space, is really important, and we overlooked it in our previous version of the paper which led to an error. I would like to thank Simona Boyadzhiyska for pointing this out to me.

The best sources of such triangle-free partial linear spaces, and in fact the only sources I know of, are generalized quadrangles (GQ). One can take any “regular” subhypergraph of a GQ to get these spaces. The main difficulty though is to find a packing of such spaces, as required in Lemma 1. In Fox et al. they managed to find such a packing with s = t = q - 1, for an arbitrary prime power q, using what is known as the T_2(O) generalized quadrangle (even though they don’t call it that in the paper). Since |P| = q^3 in their packing, we get the upper bound of P_r(k) \leq 8k^6 r^3 by picking a prime q such that rk^2 < q \leq 2rk^2. We have managed to find such a packing with s = q^2 - 1 and t = q - 1, for an arbitrary odd prime power q, that gives us the following improvement:

Theorem 2. P_r(k) \leq C k^5 r^{5/2}.

The construction relies on a group theoretic model of generalized quadrangles, introduced by Bill Kantor in 1980. Consider the Heisenberg group E defined on the set \mathbb{F}_{q^2} \times \mathbb{F}_{q^2} \times \mathbb{F}_{q} by the following group operation: (x, z, y) \circ (x', z', y') = (x + x', z + z', y + y' + z^qx' + zx'^q). Note that (a, b) \mapsto a^q b + ab^q is a bilinear map on \mathbb{F}_{q^2}, identified with the vector space \mathbb{F}_q^2, taking values in \mathbb{F}_q (which is why the third coordinate stays in \mathbb{F}_q).

Our points will be the elements of E, and thus we have a point set of size q^5. It is known that E contains subgroups A_i, for i \in \mathbb{F}_q, such that (1) |A_i| = q^2; (2) A_i \cap A_j = \{(0, 0, 0)\} for any i \neq j; and (3) for any three distinct i, j, k we have (A_i A_j) \cap A_k = \{(0, 0, 0)\}. These subgroups are A_i = \{(a, ai, a^{q+1}i) : a \in \mathbb{F}_{q^2}\}. The partial linear space obtained by taking the set of all cosets of the A_i's as lines is of order (q^2 - 1, q - 1), and the third condition on these subgroups ensures that it is triangle-free. You should try to prove the triangle-freeness. For the finite geometers reading this post, this partial linear space is a subgeometry of the Hermitian generalized quadrangle H(3, q^2).
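Here is a small computational sanity check of properties (1)–(3) for q = 3, with \mathbb{F}_9 built as \mathbb{F}_3[x]/(x^2 - x - 1). Everything below is my own reconstruction of the definitions above, written only to verify the three properties in one small case.

from itertools import product

q = 3
F9 = list(product(range(q), repeat=2))        # (a0, a1) stands for a0 + a1*x, with x^2 = x + 1
add = lambda a, b: ((a[0] + b[0]) % q, (a[1] + b[1]) % q)

def mul(a, b):
    return ((a[0]*b[0] + a[1]*b[1]) % q,
            (a[0]*b[1] + a[1]*b[0] + a[1]*b[1]) % q)

def power(a, m):
    r = (1, 0)
    for _ in range(m):
        r = mul(r, a)
    return r

frob = lambda a: power(a, q)                  # a -> a^q
emb = lambda c: (c % q, 0)                    # F_q sitting inside F_{q^2}

def op(g, h):                                 # the group operation of E
    (x, z, y), (x2, z2, y2) = g, h
    extra = add(mul(frob(z), x2), mul(z, frob(x2)))   # z^q x' + z x'^q, which lies in F_q
    return (add(x, x2), add(z, z2), (y + y2 + extra[0]) % q)

def A(i):                                     # A_i = {(a, a*i, a^{q+1} * i) : a in F_{q^2}}
    return {(a, mul(a, emb(i)), mul(power(a, q + 1), emb(i))[0]) for a in F9}

e = ((0, 0), (0, 0), 0)
print(all(len(A(i)) == q*q for i in range(q)))                                   # property (1)
print(all(A(i) & A(j) == {e} for i in range(q) for j in range(q) if i != j))     # property (2)
print({op(g, h) for g in A(0) for h in A(1)} & A(2) == {e})                      # property (3) for (i, j, k) = (0, 1, 2)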

We obtain a packing of partial linear spaces isomorphic to this one, on the same point set E, by taking a group of automorphisms of E which then acts naturally on the set \{A_i : i \in \mathbb{F}_q\}. Each such automorphism \tau defines a new set of subgroups A^\tau_i, i \in \mathbb{F}_q, which have the same three properties. What we want from our automorphisms though, because of the condition in Lemma 1 regarding (P, \cup \mathcal{L}_i) being a partial linear space, is that every subgroup in \{A_i : i \in \mathbb{F}_q\} must intersect every subgroup in \{A_i^\tau : i \in \mathbb{F}_q\} trivially. This turns out to be quite tricky, and for this we had to be very careful in our choice of automorphisms. In particular, one can show that none of the inner automorphisms of E would work. In the end we managed to find q^2 outer automorphisms of E that give us the required packing of size q^2. Very interestingly, to prove that our packing is good, we had to rely on the following two results on irreducible cubics:

Proposition 1: (Kim-Kim-Yie 2009) There is a bijective correspondence between the set of irreducible polynomials of the form x^3 - cx^2 + c^q x - 1 in \mathbb{F}_{q^2}[x] and the set of irreducible polynomials of the form x^3 + ax^2 + bx + c in \mathbb{F}_q[x] where -b is a non-square in \mathbb{F}_q.

Proposition 2: For every finite field \mathbb{F}_q, and every \lambda \in \mathbb{F}_q, there exists an irreducible cubic x^3 + ax^2 + bx + c where b = \lambda.

This can be shown via the Hansen-Mullen Irreducibility Conjecture (which was proved by Wan for all but finitely many cases, and those small cases were handled by Ham and Mullen). See this recent paper of Tuxanidy and Wang, and the references therein, for more on this conjecture and its generalizations where we look for irreducible polynomials with more than one of its coefficients prescribed.


Generalized polygons in extremal combinatorics

Jacques Tits invented generalized polygons to give a geometrical interpretation of the exceptional groups of Lie type. The prototype of these incidence geometries already appears in his 1956 paper, while they are axiomatically defined in his influential 1959 paper on triality. The finite versions of these objects have played an important role in the study of finite simple groups and in the general theory of permutation groups. In the 70’s and 80’s, generalized polygons also started being studied under the theory of distance-regular graphs (and more broadly association schemes). The theory of these geometries has been developed for its own sake as well (see this and this), and there are several long-standing open problems about them that have attracted the attention of many mathematicians. In this post I want to discuss the deep connection that generalized polygons have had with extremal combinatorics, specifically with Ramsey theory and Turán type theorems.

Finite projective planes, which are in fact a special case of generalized polygons known as generalized triangles, have always been deeply connected to extremal combinatorics. They give us the optimal examples for several extremal problems on both graphs and hypergraphs. For example, they were behind the construction of Esther Klein, mentioned in the 1938 paper of Paul Erdős, which in fact determines the maximum number of edges in a C_4-free graph on n vertices asymptotically (it’s fun to look at the original number theoretic problem which motivated this construction and also the fact that there is no mention of the term ‘projective planes’ even though the concept is there). They are also present in the early constructions for some special cases of the classical Zarankiewicz problem. These early results form the foundation of extremal graph theory.

Finite generalized quadrangles (GQ) have played a similar role, but I have noticed that many times they don’t get mentioned explicitly by the extremal combinatorics community. There appears to be a gap in knowledge, which is probably because of the slightly more involved constructions of classical GQ’s. The so-called Benson graph of girth 8, which gives us the asymptotic result for the Turán number of C_6, is simply the incidence graph of a GQ. While Benson doesn’t say that explicitly in his 1966 paper, it’d be very surprising if he actually didn’t know it, as his PhD thesis was called Generalized Quadrangles and (B,N) Pairs. Similarly, the so-called Wenger graph from 1991 is just an induced subgraph of the incidence graph of a well known GQ, even though there is no mention of generalized quadrangles in Wenger’s paper. See my earlier blog post on the Wenger graph for more details on this. Once you see that, it should be clear how Kopparty’s graph is related to the same GQ. Thus, the best known construction of optimal pseudorandom triangle-free graphs can be obtained via generalized quadrangles! This was also shown by Conlon, who started with the collinearity graph of a generalized quadrangle and gave a random construction from it that resulted in almost optimal triangle-free pseudorandom graphs. The basic idea behind these two constructions is in fact the same, just that Kopparty uses an algebraic method to achieve the end goal whereas Conlon uses a probabilistic one (which is why he ends up with an almost optimal graph).

These optimal triangle-free pseudorandom graphs give us the best known constructive lower bounds on the Ramsey number R(3, t). As these can be obtained via generalized quadrangles, we see an application of GQ’s in Ramsey theory. Interestingly, GQ’s can also be used to give us the best known constructive lower bounds on the Ramsey number R(4, t), as shown by Kostochka, Pudlák and Rödl. They have even been used in some hypergraph Ramsey problems to obtain tight lower bounds that beat the probabilistic ones (see Kostochka-Mubayi-Verstraëte). There are some other problems in Ramsey theory where GQ’s have shown up as well, see for example this and this.

In my recent work with John Bamberg and Thomas Lesgourgues on a minimal Ramsey problem we have also used generalized quadrangles. We noticed that a particular group theoretic model of GQ’s, due to Kantor, was perfectly suitable for this problem. I am certain that this idea will be useful in other problems in extremal combinatorics as well. I also believe that new interesting applications of generalized hexagons and octagons, as in the recent work of Mubayi and Verstraete (also described here), are just waiting to be found. One of the necessary steps for this is to learn more about these beautiful mathematical objects, and for that I would like to end with a few references.


Minimal Ramsey problems

Thanks to Anita Liebenau, I have recently been introduced to some very interesting questions in Ramsey theory and I have been working on them for the past few months in collaboration with various people. In my recent joint work with John Bamberg and Thomas Lesgourgues (Anita’s PhD student), that has just appeared on the arXiv, we have  proved an interesting new result in one of these problems. Before I discuss that, let’s look at some background in this post.

The well known fact that in any party of six people either at least three of them are mutual strangers or at least three of them are mutual acquaintances (see this or this), which serves as a nice introduction to Ramsey theory, can be written in graph theoretical terms as follows: any two-colouring of the edges of the complete graph K_6 contains a monochromatic K_3. Let’s write this statement as K_6 \longrightarrow (K_3)_2, which we read as “K_6 is 2-Ramsey for K_3”.

We can generalise this notion to any two finite graphs G and H; let’s write G \longrightarrow (H)_r if any r-colouring of the edges of G contains a monochromatic copy of H, and hence G \longarrownot\longrightarrow (H)_r is a shorthand for the statement that there exists an r-colouring of the edges without a monochromatic H. Observe that if G \longrightarrow (H)_r then for any graph G' containing G as a subgraph we have G' \longrightarrow (H)_r, or equivalently, if G \longarrownot\longrightarrow (H)_r then for any subgraph G' of G we have G' \longarrownot\longrightarrow (H)_r. For example, here is a red/blue colouring of K_6 minus an edge that has no monochromatic triangles, thus showing that K_6 - e \longarrownot\longrightarrow (K_3)_2.

This example in fact shows that the graph K_6 is minimal with respect to being 2-Ramsey for K_3, since K_6 is 2-Ramsey for K_3 but no proper subgraph of K_6 has this property.
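Both of these claims are small enough to verify by brute force; the snippet below does exactly that (it checks all 2-colourings, so it confirms the two statements without displaying the particular colouring from the picture above).

from itertools import combinations, product

def has_mono_triangle(edges, colouring):
    col = dict(zip(edges, colouring))
    return any(col[(a, b)] == col[(a, c)] == col[(b, c)]
               for a, b, c in combinations(range(6), 3)
               if (a, b) in col and (a, c) in col and (b, c) in col)

K6 = list(combinations(range(6), 2))
K6_minus_e = K6[1:]                                   # drop the edge (0, 1)

# K_6 is 2-Ramsey for K_3: every 2-colouring of its edges has a monochromatic triangle
print(all(has_mono_triangle(K6, c) for c in product((0, 1), repeat=len(K6))))
# K_6 minus an edge is not: some 2-colouring of its edges has no monochromatic triangle
print(any(not has_mono_triangle(K6_minus_e, c)
          for c in product((0, 1), repeat=len(K6_minus_e))))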

In general, we say that a graph G is r-Ramsey minimal for another graph H, if G \longrightarrow (H)_r and for any proper subgraph G' of G we have G' \longarrownot\longrightarrow (H)_r. Let’s denote the set of all graphs G that are r-Ramsey minimal for a graph H by \mathcal{M}_r(H). Most of the fundamental problems in graph Ramsey theory can be seen as trying to understand this set for various graphs H. The classical result of Ramsey can be rephrased as:

Theorem 1. (Ramsey, 1930) The set \mathcal{M}_r(H) is non-empty for any finite graph H.

The well studied r-colour Ramsey number of a graph H, denoted by R_r(H), is simply the smallest number of vertices among all graphs in \mathcal{M}_r(H). Determining, or even estimating, the number R_2(K_k), which is denoted by R(k), has been one of the central longstanding open problems in combinatorics. The work done towards resolving this has led to significant developments in combinatorics. It was shown by Erdős and Szekeres in 1935 that R(k) \leq 2^{2k}, and in 1947 Erdős proved the lower bound of R(k) \geq 2^{k/2} using a probabilistic argument that led to the creation of what we now call the probabilistic method in combinatorics. But, despite decades of research we still haven’t been able to improve the base of the exponents in these upper and lower bounds! For the most recent development on the upper bounds see this paper by Ashwin Sah.

Other parameters, like the number of edges, chromatic number and maximum degree of graphs in \mathcal{M}_r(H), have also been studied extensively. The question that I have been looking into is determining the smallest minimum degree among all graphs in \mathcal{M}_r(H), which is denoted by s_r(H). In particular, the new result concerns s_r(k) = s_r(K_k).

For any graph H, the graph K_n with n = R_r(H) is r-Ramsey for H, but it might not be minimal. Still, we can look at a subgraph of K_n which is minimal to conclude that s_r(H) \leq \delta(K_n) = n - 1 = R_r(H) - 1. We can see that this gives us the bound s_2(k) \leq 2^{2k} - 1, using the upper bound on R(k) I mentioned above. In fact, we can’t hope to get any subexponential upper bound on s_2(k) using this easy argument as we also saw that R(k) \geq 2^{k/2}. Quite surprisingly, we do know s_2(k) exactly (which is something that rarely happens in Ramsey theory), and it is far far away from an exponential function!

In 1976, Burr, Erdős and Lovász proved that s_2(k) = (k - 1)^2. Thus, they showed that there exists a graph in \mathcal{M}_2(K_k) with minimum degree equal to (k - 1)^2, and there can’t be any graph in this set with a smaller minimum degree. Determining s_r(k) for r > 2 is still widely open. It’s an instructive exercise to try to come up with a graph in \mathcal{M}_2(K_3) with minimum degree 4. The proof of the upper bound by Burr, Erdős and Lovász uses the existence of certain (very large) graphs, known as signal senders, that are (i) not 2-Ramsey for K_k and (ii) such that in any colouring of these graphs without a monochromatic K_k, two specific edges of the graph receive a specific colour pattern. These signal senders are then used to construct 2-Ramsey minimal graphs with the required minimum degree. The same idea has been very fruitful in determining s_r(H) for other graphs H, and some related problems (see this, this, this and this).

Signal senders are also an important ingredient in the work of Fox, Grinshpun, Liebenau, Person and Szabó, where they reduce the problem of determining s_r(k) to another extremal problem on graph packings. This reduction helped them determine s_r(k) up to polylogarithmic factors, assuming k \geq 3 is a constant and r \rightarrow \infty. The constants C_k in this bound were terribly large, and in particular not polynomial in k. They therefore also proved the general upper bound of s_r(k) \leq 8 (k - 1)^6 r^3. Pushing their idea further, and using a nice finite geometric and group theoretic ingredient that I haven’t seen used anywhere in extremal combinatorics so far, we have managed to improve this general upper bound.

Theorem 2. (Bamberg, Bishnoi, Lesgourgues) There exists an absolute constant C such that for all r \geq 2 and k \geq 3, we have s_r(k) \leq C (k - 1)^5 r^{5/2}.

I will discuss the details of this result in the next post, and explain the finite geometric aspect of it which I find super interesting. I am really hopeful that the ideas involved will be useful in obtaining new results in other Ramsey problems.

Extending Ryser’s conjecture

In an earlier post, I talked about Ryser’s conjecture on r-partite r-uniform hypergraphs, which has stayed open for all r \geq 4 despite a considerable effort by several mathematicians over a period of 50 years. A bit more effort has been spent on the special case of intersecting hypergraphs, that is, hypergraphs in which every two edges share at least one common vertex. Here the wishful thinking is that this will shed some light on the original conjecture (which one might expect, but there are some surprises), or mathematicians are just following Pólya’s principle: “If you can’t solve a problem, then there is an easier problem you can solve: find it.” Sadly, we can’t even solve this easier problem; the conjecture is open for intersecting hypergraphs for all r \geq 6. Therefore, it is quite natural to look for simpler problems by adding more restrictions. One of the restrictions is to consider strictly 1-intersecting, a.k.a. linear, hypergraphs, that is, hypergraphs in which every two distinct edges share exactly one common vertex. This seems quite strong, but even in this situation we only know the conjecture to be true for r \leq 9.

Here are two alternate conditions one could impose that are stronger than the hypergraph being intersecting:

  1. (k-wise intersection) any collection of k edges contains a common vertex.
  2. (t-intersection) any two distinct edges have at least t vertices in common.

Note that in (1.) we can have repetitions in the collection of edges, and thus in particular k-wise intersection implies k'-wise intersection for any k' < k.

In this post I want to share some good news regarding these two new settings. In a joint work with my former colleagues, Shagnik Das, Patrick Morris and Tibor Szabó we have completely solved (1.) for all k \geq 3. Interestingly, we used some new results that we proved for (2.) in doing that. To describe our results, let us first define the following function which is central to Ryser’s conjecture:

\mathrm{Ryser}(r, t) = \max \{\tau(H) : H \text{ is a } t\text{-intersecting } r\text{-uniform } r\text{-partite hypergraph}\},

where \tau(H) is the vertex cover number of H, that is, the smallest number of vertices that cover all edges.
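To make these definitions concrete, here is a minimal Python sketch (the toy hypergraph and the function names are mine, purely for illustration) that checks the t-intersecting condition and computes the cover number \tau by brute force.

    from itertools import combinations

    def is_t_intersecting(edges, t):
        # every two distinct edges share at least t vertices
        return all(len(set(e) & set(f)) >= t for e, f in combinations(edges, 2))

    def cover_number(edges, vertices):
        # tau(H): the smallest number of vertices meeting every edge (brute force)
        for size in range(len(vertices) + 1):
            for cand in combinations(vertices, size):
                if all(set(e) & set(cand) for e in edges):
                    return size

    # a toy intersecting 3-partite 3-uniform hypergraph with parts {a1,a2}, {b1,b2}, {c1,c2}
    edges = [("a1", "b1", "c1"), ("a1", "b2", "c2"), ("a2", "b1", "c2")]
    vertices = sorted({v for e in edges for v in e})
    print(is_t_intersecting(edges, 1))    # True: any two edges meet
    print(cover_number(edges, vertices))  # 2, consistent with the conjectured bound tau <= r - 1 = 2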

Ryser’s conjecture (for intersecting hypergraphs) states that \mathrm{Ryser}(r, 1) \leq r - 1 (note that every edge is a vertex cover, and hence \mathrm{Ryser}(r,1) \leq r is trivially true). If true, this bound would be sharp for infinitely many values of r.

The problem of determining \mathrm{Ryser}(r, t) was introduced a few years back independently by Bustamante and Stein, and by Király and Tóthmérész, who made an analogous conjecture for this function, namely that \mathrm{Ryser}(r, t) \leq r - t. Ryser’s conjecture (for intersecting hypergraphs) is the case t = 1, and in fact it implies the whole conjecture, that is, \mathrm{Ryser}(r,1) \leq r - 1 ~ \forall r \implies \mathrm{Ryser}(r,t) \leq r - t ~\forall r > t \geq 1. Therefore, one might hope to make progress on this new conjecture for larger values of t, even though t = 1 seems out of reach at the moment. This is exactly what happened: Bustamante and Stein proved the conjecture for t \geq (r - 2)/2, while Király and Tóthmérész proved it for t \geq (r + 1)/4. Moreover, Bustamante and Stein observed that their conjecture is not sharp, as \mathrm{Ryser}(5, 2) = 2 instead of being equal to 5 - 2 = 3. This was the starting point of our work. In fact, I have a picture from the exact moment when we started working on showing that the conjecture is not sharp for other values as well:

[Photo: Tibor and Patrick thinking hard about \mathrm{Ryser}(r,t)]

This is from the 2018 research group workshop organised by Tibor, where I had proposed this problem and the three of us worked on it for two days. We managed to prove that for large enough t, the correct value of \mathrm{Ryser}(r,t) is in fact roughly half of what was conjectured:

Theorem 1. \mathrm{Ryser}(r,t) = \lfloor (r - t)/2 \rfloor + 1, for all t + 1 \leq r \leq 3t - 1.

Once we came back to Berlin from our workshop, Shagnik joined our team:

[Photo: Shagnik during the workshop, helping us cook goulash, not realising that helping us with Ryser’s conjecture could also be fun!]

He proved that Theorem 1 in fact implies the following result on the function

\mathrm{Ryser}(r, t, k) = \max \{\tau(H) : H \text{ is a } k\text{-wise } t\text{-intersecting } r\text{-uniform } r\text{-partite hypergraph}\}.

Theorem 2. For any k \geq 3 and t \geq 1, \mathrm{Ryser}(r,t,k) = \lfloor (r - t)/k \rfloor + 1.

It is quite interesting that the sharp bound that we prove in Theorem 1 is precisely what was needed for Theorem 2.

In our paper we prove various other things, including the bound \mathrm{Ryser}(r,t) \leq r - t for a larger range of values of t, and the same bound for strictly t-intersecting hypergraphs for a much larger range of values of t. We also pose some nice open problems, and make the following bold conjecture (which only a proper subset of us believe):

Conjecture. \mathrm{Ryser}(r,t) = \lfloor (r - t)/2 \rfloor + 1 for all 2 \leq t \leq r - 1.

I can definitely say that despite our beliefs all of us would be happy to see a counterexample to this Conjecture.

We also introduce another variation of the problem where instead of a normal vertex cover, one looks at s-covers, that is, every edge must share at least s vertices with the covering set. Here we make some modest progress and hope that it will inspire some future work.


Bounds on Ramsey numbers from finite geometry

In an earlier post I talked about the work of Mubayi and Verstraete on determining the off-diagonal Ramsey numbers via certain optimal pseudorandom graphs, which are not yet known to exist except for the case of triangles. Beyond this conditional result, they also improved the lower bounds on certain cycle-complete Ramsey numbers using generalized polygons. In some cases they even beat the exponent provided by the general F-free process studied by Bohman and Keevash, which

“appears to be the first instance of a graph F containing cycles for which random graphs do not supply the right exponent for r(F, k).”

(I have changed the t from their paper to k for my own convenience). In this post I will talk about the basics of their construction, hoping that these ideas can be used in future to obtain similar results on Ramsey numbers.

One of the fundamental objects studied in finite geometry are partial linear spaces, a.k.a. linear hypergraphs. These are point-line geometries with the property that through any two points there is at most one line, or equivalently, hypergraphs in which every two hyperedges share at most one common vertex. A partial linear space has order (s, t) if every line contains exactly s + 1 points and through every point there are exactly t + 1 lines (in hypergraph terminology we have an (s + 1)-uniform (t + 1)-regular hypergraph). For every partial linear space we can define the collinearity graph on the points by making two points adjacent if they lie on a common line (in hypergraph terminology this is sometimes referred to as the skeleton of the hypergraph). Note that the lines of the partial linear space correspond to cliques of size s + 1 in the collinearity graph, and any two such cliques meet in at most one vertex. From now on denote the number of points in the partial linear space by n and the number of lines by m. In several classes of such spaces these numbers are functions of s and t. For example, in the case of a finite projective plane we have s = t and n = m = s^2 + s + 1. Note that we always have n(t + 1) = m(s + 1), by double counting incidences.
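As a quick sanity check on these parameters, here is a small Python sketch using the Fano plane, the projective plane of order 2 (so s = t = 2 and n = m = 7); the labelling of the seven lines below is one standard presentation, and any correct line set would do.

    from itertools import combinations

    # The Fano plane: projective plane of order 2, so order (s, t) = (2, 2).
    lines = [{0, 1, 2}, {0, 3, 4}, {0, 5, 6}, {1, 3, 5},
             {1, 4, 6}, {2, 3, 6}, {2, 4, 5}]
    points = set().union(*lines)

    # Partial linear space: any two lines meet in at most one point.
    assert all(len(l1 & l2) <= 1 for l1, l2 in combinations(lines, 2))

    # Order (s, t): every line has s + 1 points, every point is on t + 1 lines.
    s = len(lines[0]) - 1
    t = sum(1 for l in lines if 0 in l) - 1
    assert all(len(l) == s + 1 for l in lines)
    assert all(sum(1 for l in lines if p in l) == t + 1 for p in points)

    n, m = len(points), len(lines)
    print(n, m, n == s**2 + s + 1)       # 7 7 True
    print(n * (t + 1) == m * (s + 1))    # double count of incidences: True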

For a graph F, the Ramsey number r(F, k) can be defined as the largest number of vertices for which there exists a graph that has no subgraphs isomorphic to F, and has independence number < k (the existence of such a maximum is a special case of Ramsey’s theorem). We would like to know the growth of this function as k \rightarrow \infty for various kinds of fixed size graphs F. When F is the complete graph, then this is the classical off-diagonal Ramsey number that we saw in the earlier post.

The basic idea behind the construction that I want to discuss in this post is as follows:

start with a partial linear space for which all copies of F in the collinearity graph are contained inside the cliques corresponding to the lines and then destroy all of these copies of F while keeping the independence number under control.

Let’s start with the independence number. If I is a set of pairwise non-collinear points, then by an easy double count of pairs (p, \ell) where p is a point of I and \ell is a line through p, we get |I|(t + 1) \leq m. Therefore, the independence number is always at most m/(t + 1).

If the graph F contains odd cycles (which is the case we will focus on), then one easy way of getting rid of all the copies of F is to convert each clique of size s + 1, which is where all copies of F lie by our assumption, into a bipartite graph on the same set of vertices by destroying some edges. To be more precise, for each line \ell we take two disjoint subsets A_\ell, B_\ell of the points on \ell, with A_\ell \cup B_\ell = \ell and we define a graph on the set of all points where two points x and y are adjacent if (i) they lie in a common line \ell and (ii) x \in A_\ell, y \in B_\ell or x \in B_\ell, y \in A_\ell. This graph is well defined because we have a partial linear space, that is, no two points can lie in two different lines.
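As a small illustration of the mechanics (the Fano plane is used below only because it is tiny; its collinearity graph is complete, so it does not actually satisfy the assumption that all copies of F live inside line-cliques), here is a hedged Python sketch that performs this edge destruction, with the bipartitions chosen uniformly at random, anticipating the argument that follows. The function name bipartition_lines is mine.

    import random

    def bipartition_lines(lines, seed=0):
        # For each line choose a bipartition (A_l, B_l) uniformly at random and keep
        # only those edges of the collinearity graph that cross the bipartition.
        rng = random.Random(seed)
        edges = set()
        for line in lines:
            A = {p for p in line if rng.random() < 0.5}   # each point lands in A_l with probability 1/2
            B = set(line) - A
            edges.update((min(x, y), max(x, y)) for x in A for y in B)
        return edges

    # Example run with the Fano plane lines from the previous sketch.
    lines = [{0, 1, 2}, {0, 3, 4}, {0, 5, 6}, {1, 3, 5},
             {1, 4, 6}, {2, 3, 6}, {2, 4, 5}]
    G = bipartition_lines(lines)
    # Within each line the surviving edges form a complete bipartite graph, so no odd
    # cycle of the collinearity graph that lives inside a single line survives.
    print(sorted(G))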

While the above ensures that there are no copies of F in the new graph, destroying edges would typically increase the independence number. What we can show, though, is that there exists a choice of bipartition on each line for which the independence number does not increase by much. We will show this by an easy probabilistic argument.

Here is our probability space. For each line \ell, we choose a bipartition (A_\ell, B_\ell) of the set of s+1 points on \ell uniformly at random (from the set of all bipartitions). Then for any line \ell, the probability that a point x of \ell lies in A_\ell (or B_\ell) is equal to 1/2. We do this independently for each line.

Let I be a subset of points of size k, and for each line \ell let k_\ell be the number of points of I contained in \ell. The probability that I is an independent set in the graph G obtained from the collinearity graph by removing, for each line \ell, all edges within A_\ell and within B_\ell, is the probability that for every line \ell the set I \cap \ell is a subset of A_\ell or of B_\ell. These events are independent across lines, and for a line \ell with k_\ell \geq 1 the corresponding event has probability (1/2)^{k_\ell} + (1/2)^{k_\ell} = 2^{1 - k_\ell}, while for a line with k_\ell = 0 it has probability 1 \leq 2^{1 - k_\ell}. Therefore, the probability that I is an independent set is at most \prod_{\ell} 2^{1 - k_\ell} = 2^{m - \sum_\ell k_\ell} = 2^{m - (t + 1)k}, where we used \sum_\ell k_\ell = (t + 1)k. By the union bound, the probability that there exists an independent set of size k is at most \binom{n}{k} 2^{m - (t + 1)k}. Therefore, to show that there exists a choice of bipartitions for which the graph defined above has no independent set of size k, it suffices to show that this probability is < 1. By the inequality \binom{n}{k} \leq n^k = 2^{k \log n}, we see that it suffices to show

2^{k \log n + m - (t + 1)k} < 1.

This is true if and only if k > m/(t + 1 - \log n), assuming t + 1 > \log n. Thus for, say, k = \lceil m/(t - \log n) \rceil, we get R(F, k) \geq n, as there exists an F-free graph on n vertices with independence number < k.
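If you want to play with concrete parameters, here is a small hedged helper implementing exactly this bookkeeping (the function name and the sample numbers are mine and purely illustrative; logs are base 2, matching the step \binom{n}{k} \leq n^k = 2^{k \log n} above).

    from math import ceil, log2

    def independence_threshold(n, m, t):
        # Smallest k given by the (slightly weakened) sufficient condition
        # k >= m / (t - log2 n), which in particular gives k > m / (t + 1 - log2 n).
        assert t > log2(n), "the argument needs t to comfortably exceed log n"
        k = ceil(m / (t - log2(n)))
        # sanity check of the union bound, done with exponents to avoid huge numbers:
        assert k * log2(n) + m - (t + 1) * k < 0
        return k

    # purely illustrative numbers, not taken from any actual geometry:
    print(independence_threshold(n=10**6, m=10**5, t=100))   # 1249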

This shows that to get a good lower bound on R(F, k), what we want from our partial linear space of order (s, t) on n points and m lines is \log n << t and n >> m, beyond the condition we started with of all copies of F being contained in the (s + 1)-cliques.

If you are still with me, then here is a concrete example. We know that any triangle in the collinearity graph of a generalized quadrangle (or see this) must be contained in the clique corresponding to a line. We also know that a generalized quadrangle (GQ for short) of order (s, t) has n = (s + 1)(st + 1) points and m = (t + 1)(st + 1) lines. For n >> m we would thus want s >> t, and for t >> \log n we would want t to not be a constant. Considering this, the best choice available here is s = t^2 with t \rightarrow \infty, as we know that in any GQ one always has s \leq t^2. Such GQs are known to exist whenever t is a prime power (see Chapter 5 of my lecture notes). From them we get the lower bound R(3, k) \geq (1 + o(1))k^{5/3} (a back-of-the-envelope version of this calculation is given after the displayed bounds below), which is not that great considering we know R(3, k) = \Theta(k^2/\log k). But there is some good news, as Mubayi and Verstraete have shown. If we consider generalized hexagons of order (q^3, q), then we get

R(C_5, k) \geq (1 + o(1))k^{11/8},

and from generalized octagons of order (q^2, q) we get

R(C_7, k) \geq (1 + o(1)) k^{11/9}.

Both of these bounds are better than what one would get from purely probabilistic arguments.
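To see roughly where the exponent 5/3 comes from in the triangle case, here is a back-of-the-envelope calculation (only a heuristic of mine, suppressing constant factors and the o(1) terms). For a GQ of order (t^2, t) we have n = (t^2 + 1)(t^3 + 1) = \Theta(t^5) points and m = (t + 1)(t^3 + 1) = \Theta(t^4) lines, so indeed n >> m and t >> \log n as t \rightarrow \infty. The argument above then gives a triangle-free graph on these n points with independence number less than k = \lceil m/(t - \log n) \rceil = \Theta(t^3), and hence R(3, k) \geq n = \Theta(t^5) = \Theta(k^{5/3}). The same bookkeeping with the generalized hexagon and octagon parameters produces the exponents 11/8 and 11/9 quoted above.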

I hope that by using other finite geometries, or by improving the argument above, we would be able to improve bounds on other Ramsey numbers as well.


Kopparty’s graph

Alon’s construction of optimal pseudorandom graphs from 1994 is useful for obtaining several interesting combinatorial results in various areas of mathematics, some of which are highlighted in this survey of Noga from last year (also see my previous post). In this post I would like to discuss an alternate construction, due to Swastik Kopparty, that has the same properties and thus can be used instead to prove all of these interesting results. It seems like this construction is not so well known (I learned about it from a conversation with Noga), even though it is a bit simpler to describe and in some sense more general. The construction appears in Kopparty’s lecture notes on Cayley Graphs.

What we are looking for is an infinite sequence of triangle-free d-regular graphs on n vertices with d = \Theta(n^{2/3}) and second largest eigenvalue, in absolute value, equal to \Theta(n^{1/3}).

Let p \neq 3 be a prime (it will become clear later why we are excluding 3), and let \mathbb{F}_{q} be a finite field of characteristic p, that is q = p^h for some integer h. For a subset S of the vector space V = \mathbb{F}_q^3 we define the Cayley graph G = Cay(V, S) as the graph on V with u \sim v if u - v \in S. This graph is undirected if for every s \in S we have -s \in S and loopless if 0 \not \in S, both of which we will assume. Kopparty’s construction, much like Alon’s construction, is a Cayley graph of this form for some suitably chosen S. The property of S that ensures that G is triangle-free is that s_1 + s_2 + s_3 \neq 0 for any s_1, s_2, s_3 \in S (not necessarily distinct). This explains why we assume p \neq 3, as otherwise s + s + s = 0 for every s \in S. Consider the following set:

S = \{(xy, xy^2, xy^3) : x \in T, y \in \mathbb{F}_q^{\times}\}

where T is a subset of \mathbb{F}_q with the property that no three (again, not necessarily distinct) elements of T sum to 0, that is, the same property that we wanted from S. Any such T implies triangle-freeness of G: if x_1(y_1, y_1^2, y_1^3) + x_2(y_2, y_2^2, y_2^3) + x_3(y_3, y_3^2, y_3^3) = 0, then either y_1 = y_2 = y_3, which contradicts the property of T, or, after grouping terms with equal y_i, we get a non-trivial linear combination of distinct rows of a Vandermonde matrix equal to 0, which is also a contradiction (note that 0 \not \in T and y_i \neq 0 for all i, so the coefficients after grouping are not all zero).

The specific choice of T will soon become clear, but for now let us focus on the degree and eigenvalues of G. The degree is equal to |S| = |T|(q - 1) (the map (x, y) \mapsto (xy, xy^2, xy^3) is injective on T \times \mathbb{F}_q^{\times}), while the number of vertices is q^3 (thus we will later require |T| = \Theta(q)). The eigenvalues of Cayley graphs, at least in the case of finite abelian groups, are easy to describe. I recommend Chapter 5 of the lecture notes of Luca Trevisan for the basic theory. For a = (a_1, a_2, a_3) \in V, the eigenvalue \lambda_a corresponding to the eigenvector \chi_a : V \rightarrow \mathbb{C} defined by \chi_a(x_1, x_2, x_3) = \omega^{\mathrm{Tr}(a_1 x_1 + a_2 x_2 + a_3 x_3)} satisfies \lambda_a = \sum_{x \in S} \chi_a(x), where \mathrm{Tr} is the absolute trace function defined by \mathrm{Tr}(\alpha) = \alpha + \alpha^p + \cdots + \alpha^{p^{h - 1}}, and \omega = e^{2 \pi i/p} is a primitive p-th root of unity.
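To see this character-sum description in action, here is a hedged numpy sketch over \mathbb{F}_p^3 with h = 1, so that the trace is trivial; the connection set below is an arbitrary symmetric choice of mine, not Kopparty's S, and is there only to check the formula against a direct eigenvalue computation.

    import numpy as np
    from itertools import product

    p = 5                                    # a small prime; h = 1, so Tr is the identity
    V = list(product(range(p), repeat=3))    # the group F_p^3
    idx = {v: i for i, v in enumerate(V)}

    # an arbitrary symmetric connection set with 0 not in it (only for this check)
    S = {(1, 2, 3), (4, 3, 2), (1, 1, 0), (4, 4, 0)}
    assert (0, 0, 0) not in S
    assert all(tuple((-c) % p for c in s) in S for s in S)   # S = -S, so the graph is undirected

    # adjacency matrix of Cay(V, S): u ~ v iff u - v is in S
    A = np.zeros((len(V), len(V)))
    for u in V:
        for s in S:
            v = tuple((ui - si) % p for ui, si in zip(u, s))
            A[idx[u], idx[v]] = 1

    # eigenvalues via character sums: lambda_a = sum over x in S of omega^(a . x)
    omega = np.exp(2j * np.pi / p)
    char_eigs = sorted(np.real(sum(omega ** np.dot(a, x) for x in S)) for a in V)
    adj_eigs = sorted(np.linalg.eigvalsh(A))
    print(np.allclose(char_eigs, adj_eigs))   # True: the two spectra agree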

The eigenvalue corresponding to (0, 0, 0) is d = |S|, which is the largest eigenvalue. So let a = (a_1, a_2, a_3) \in V \setminus \{0\}. We have

\lambda_a =  \sum_{y \in \mathbb{F}_q^\times} \sum_{x \in T} \omega^{\mathrm{Tr}(x f(y))},

where f(y) = a_1 y + a_2y^2 + a_3 y^3. We now make the choice T = \{u \in \mathbb{F}_q : \mathrm{Tr}(u) \in \{-1, 1\}\}. Note that |T| = 2q/p (unless p = 2, in which case |T| = q/p), that T = -T, and that no three (not necessarily distinct) elements of T can sum to 0: the traces of any three elements of T add up to \pm 1 or \pm 3, which is non-zero in \mathbb{F}_p since p \neq 3. From the following lemma, whose proof is a fun exercise in algebra, it follows that several of the terms in the expression for \lambda_a are 0.

Lemma. Let z \in \mathbb{F}_q \setminus \mathbb{F}_p. Then for every k \in \mathbb{F}_p we have \sum_{x \in \mathrm{Tr}^{-1}(k)} \omega^{\mathrm{Tr}(x z)} = 0.
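If you would like a hint for that exercise, here is one way to see it (just a sketch, under the assumption h \geq 2, which is automatic since z \not\in \mathbb{F}_p forces q > p). The set \mathrm{Tr}^{-1}(k) is a coset x_0 + H of the subspace H = \ker(\mathrm{Tr}), which has p^{h-1} elements. The map x \mapsto \mathrm{Tr}(xz) is \mathbb{F}_p-linear and cannot vanish on all of H: the bilinear form (x, y) \mapsto \mathrm{Tr}(xy) is non-degenerate, so the orthogonal complement of the hyperplane H with respect to it is 1-dimensional, and it contains (hence equals) \mathbb{F}_p, while z \not\in \mathbb{F}_p. So x \mapsto \mathrm{Tr}(xz) maps H onto \mathbb{F}_p, taking every value exactly p^{h - 2} times, and therefore \sum_{x \in \mathrm{Tr}^{-1}(k)} \omega^{\mathrm{Tr}(xz)} = \omega^{\mathrm{Tr}(x_0 z)} \sum_{x \in H} \omega^{\mathrm{Tr}(xz)} = \omega^{\mathrm{Tr}(x_0 z)} \cdot p^{h-2} \sum_{j \in \mathbb{F}_p} \omega^j = 0.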

Thanks to this Lemma we can write

\lambda_a = \sum_{y \in f^{-1}(\mathbb{F}_p) \setminus \{0\}} \sum_{x \in T} \omega^{\mathrm{Tr}(xf(y))}.

For any fixed k \in \mathbb{F}_p, the number of y for which f(y) = a_1 y + a_2 y^2 + a_3 y^3 = k is at most 3, since f(y) - k is a non-zero polynomial of degree at most 3 (and at least 1) in one variable. Therefore, the number of summands is at most 3p|T|, which implies that |\lambda_a| \leq 3p|T|.

To summarise, we have a triangle-free graph on q^3 vertices which is d-regular with d = 2q(q - 1)/p (half of that when p = 2) and has all eigenvalues other than d at most 6q in absolute value. For fixed p and h \rightarrow \infty, we get the infinite family that we were looking for.
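Since the construction is completely explicit, it is easy to check small cases on a computer. Below is a hedged Python sketch (all of it is mine and purely a sanity check) for p = 2 and q = 16, i.e. h = 4, where the eigenvalue bound 6q = 96 is already smaller than the degree 120; it verifies triangle-freeness directly from the defining property of S and computes every \lambda_a via the character sums above (the brute force takes a few seconds).

    # p = 2, q = 16, with F_16 = F_2[X]/(X^4 + X + 1); field elements are 4-bit integers.
    P_MOD = 0b10011                     # X^4 + X + 1, one standard irreducible polynomial

    def gf_mul(a, b):
        # carry-less multiplication in F_16, reduced modulo X^4 + X + 1
        r = 0
        while b:
            if b & 1:
                r ^= a
            b >>= 1
            a <<= 1
            if a & 0b10000:
                a ^= P_MOD
        return r

    q, h = 16, 4
    TR = [0] * q                        # absolute trace F_16 -> F_2: u + u^2 + u^4 + u^8
    for u in range(q):
        t, v = 0, u
        for _ in range(h):
            t ^= v
            v = gf_mul(v, v)
        TR[u] = t

    T = [u for u in range(1, q) if TR[u] == 1]          # Tr(u) in {-1, 1} = {1}, since p = 2
    S = {(gf_mul(x, y), gf_mul(x, gf_mul(y, y)), gf_mul(x, gf_mul(y, gf_mul(y, y))))
         for x in T for y in range(1, q)}               # the triples (xy, xy^2, xy^3)
    d = len(S)                                          # degree |T|(q - 1) = 8 * 15 = 120

    # Triangle-freeness: no s1 + s2 + s3 = 0 with s1, s2, s3 in S (addition in F_16 is XOR).
    assert all(tuple(c1 ^ c2 for c1, c2 in zip(s1, s2)) not in S for s1 in S for s2 in S)

    # Eigenvalues via the character sums above; for p = 2 the root of unity is just -1.
    def eigenvalue(a1, a2, a3):
        return sum((-1) ** (TR[gf_mul(a1, x1)] ^ TR[gf_mul(a2, x2)] ^ TR[gf_mul(a3, x3)])
                   for (x1, x2, x3) in S)

    second = max(abs(eigenvalue(a1, a2, a3))
                 for a1 in range(q) for a2 in range(q) for a3 in range(q)
                 if (a1, a2, a3) != (0, 0, 0))
    print(q ** 3, d, second, second <= 6 * q)           # n, degree, max |lambda_a|, check against 6q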

Remark 1: Other choices of T will work as well, for example T = \{u \in \mathbb{F}_q : p/3 < \mathrm{Tr}(u) < 2p/3\}.

Remark 2: This can be generalised to an odd-cycle free construction, just like Alon’s graph can be generalised. Any generalisation to even-cycle free constructions, or to K_s-free, with s \geq 4, would be very interesting!
