Extending Ryser’s conjecture

In an earlier post, I talked about Ryser’s conjecture on r-partite r-uniform hypergraphs, that has stayed open for all r \geq 4 despite a considerable effort by several mathematicians over a period of 50 years. A bit more effort has been spent on the special case of intersecting hypergraphs, that is, hypergraphs in which every two edges share at least one common vertex. Here the wishful thinking is that this will shed some light on the original conjecture (which one might expect, but there are some surprises), or mathematicians are just following Pólya’s principle: “If you can’t solve a problem, then there is an easier problem you can solve: find it.”. Sadly, we can’t even solve this easier problem; the conjecture is open for intersecting hypergraphs for all r \geq 6. Therefore, it is quite natural to look for simpler problems by adding more restrictions. One of the restrictions is to consider strictly 1-intersecting, a.k.a., linear hypergraphs, that is, every two distinct edges share exactly one common vertex. This seems quite strong, but even in this situation we only know the conjecture to be true for r \leq 9.

Here are two alternate conditions one could impose that are stronger than the hypergraph being intersecting:

  1. (k-wise intersection) any collection of k edges contains a common vertex.
  2. (t-intersection) any two distinct edges have at least t vertices in common.

Note that in (1.) we can have repetitions in the collection of edges, and thus in particular k-wise intersection implies k'-wise intersection for any k' < k.

In this post I want to share some good news regarding these two new settings. In a joint work with my former colleagues, Shagnik Das, Patrick Morris and Tibor Szabó we have completely solved (1.) for all k \geq 3. Interestingly, we used some new results that we proved for (2.) in doing that. To describe our results, let us first define the following function which is central to Ryser’s conjecture:

\mathrm{Ryser}(r, t) = \max \{\tau(H) : H \text{ is a } t\text{-intersecting } r\text{-uniform } r\text{-partite hypergraph}\},

where \tau(H) is the vertex cover number of H, that is, the smallest number of vertices that cover all edges.

Ryser’s conjecture (for intersecting hypergraphs) states that \mathrm{Ryser}(r, 1) \leq r - 1 (note that every edges is a vertex cover, and hence \mathrm{Ryser}(r,1) \leq r is trivially true). If true, this bound will be sharp for infinitely many values of r.

The problem of determining \mathrm{Ryser}(r, t) was introduced a few years back independently by BustamanteStein, and KirályTóthmérész, who made an analogous conjecture for this function that says \mathrm{Ryser}(r, t) \leq r - t. While Ryser’s conjecture is a special case of this one, it is in fact implied by Ryser’s conjecture, that is, \mathrm{Ryser}(r,1) \leq r - 1 ~ \forall r \implies \mathrm{Ryser}(r,t) \leq r - t ~\forall r > t \geq 1. Therefore, one might hope to make progress over this new conjecture for larger values of t, even though t = 1 seems out of reach at the moment. This is what was done, as Bustamante and Stein proved the conjecture for t \geq (r - 2)/2 whereas Király and Tóthmérész proved it for t \geq (r + 1)/4. Moreover, Bustamante and Stein observed that their conjecture is not sharp as \mathrm{Ryser}(5, 2) = 2 instead of being equal to 5 - 2 = 3. This, was the starting point of our work. In fact, I have picture from the exact moment where we started working on proving that the conjecture might not be sharp for other values:


Tibor and Patrick thinking hard about \mathrm{Ryser}(r,t)

This is from the 2018 research group workshop organised by Tibor where I had proposed this problem and the three of us worked on it for two days. We managed to prove that for large enough t, the correct value of \mathrm{Ryser}(r,t) is in fact half of what was conjectured:

Theorem 1. \mathrm{Ryser}(r,t) = \lfloor (r - t)/2 \rfloor + 1, for all t + 1 \leq r \leq 3t - 1.

Once we came back to Berlin from our workshop, Shagnik joined our team:


This is him during the workshop when he was helping us cook Goulash, not realising that helping us on Ryser’s conjecture could also be fun!

He proved that Theorem 1 in fact implies the following result on the function

\mathrm{Ryser}(r, t, k) = \max \{\tau(H) : H \text{ is a } k\text{-wise } t\text{-intersecting } r\text{-uniform } r\text{-partite hypergraph}\}.

Theorem 2. For any k \geq 3 and t \geq 1, \mathrm{Ryser}(r,t,k) = \lfloor (r - t)/k \rfloor + 1.

It is quite interesting that the sharp bound that we prove in Theorem 1 is precisely what was needed for Theorem 2.

In our paper, we prove various other things, which includes showing the bound \mathrm{Ryser}(r,t) \leq r - t for a larger range of values of t, and the same bound for strictly t-intersecting hypergraphs for a much larger range of values of t. We also pose some nice open problems, and make the following bold conjecture (that only a proper subset of us believe):

Conjecture. \mathrm{Ryser}(r,t) = \lfloor (r - t)/2 \rfloor + 1 for all 2 \leq t \leq r - 1.

I can definitely say that despite our beliefs all of us would be happy to see a counterexample to this Conjecture.

We also introduce another variation of the problem where instead of a normal vertex cover, one looks at s-covers, that is, every edge must share at least s vertices with the covering set. Here we make some modest progress and hope that it will inspire some future work.

Posted in Combinatorics, Extremal Combinatorics | Tagged , , , , , , , | Leave a comment

Bounds on Ramsey numbers from finite geometry

In an earlier post I talked about the work of Mubayi and Verstraete on determining the off-diagonal Ramsey numbers via certain optimal pseudorandom graphs, which are not yet known to exist except for the case of triangles. Beyond this conditional result, they also improved the lower bounds on certain cycle-complete Ramsey numbers using generalized polygons. In some cases they even beat the exponent provided by the general F-free process studied by Bohman and Keevash, which

“appears to be the first instance of a graph F containing cycles for which random graphs do not supply the right exponent for r(F, k).”

(I have changed the t from their paper to k for my own convenience). In this post I will talk about the basics of their construction, hoping that these ideas can be used in future to obtain similar results on Ramsey numbers.

One of the fundamental objects that are studied in finite geometry are partial linear spaces, a.k.a., linear hypergraphs. These are point-line geometries with the property that through any two points there is at most one line, or equivalently, these are hypergraphs in which every two hyperedges share at most one common vertex. A partial linear space has order (s, t) if ever line contains exactly s + 1 points and through every point there are exactly t + 1 lines (in hypergraph terminology we have an (s + 1)-uniform (t + 1)-regular hypergraph). For every partial linear space we can define the collinearity graph on the points by making to points adjacent if they lie on a common line (in hypergraph terminology this is sometimes referred to as the skeleton of the hypergraph). Note that the lines of the partial linear space correspond to cliques of size s + 1 in the collinearity graph, and any two such cliques meet in at most a vertex. From now on denote the number of points in the partial linear space by n and the number of lines by m. In several classes of such spaces these numbers are a function of s, t. For example, in the case of a finite projective plane we have s = t and n = m = (s - 1)^2 + (s - 1) + 1. Note that we always have n(t + 1) = m(s + 1).

For a graph F, the Ramsey number r(F, k) can be defined as the largest number of vertices for which there exists a graph that has no subgraphs isomorphic to F, and has independence number < k (the existence of such a maximum is a special case of Ramsey’s theorem). We would like to know the growth of this function as k \rightarrow \infty for various kinds of fixed size graphs F. When F is the complete graph, then this is the classical off-diagonal Ramsey number that we saw in the earlier post.

The basic idea behind the construction that I want to discuss in this post is as follows:

start with a partial linear space for which all copies of F in the collinearity graph are contained inside the cliques corresponding to the lines and then destroy all of these copies of F while keeping the independence number under control.

Let’s start with the independence number. If I is a set of pairwise non-collinear points, then by an easy double count of pairs (p, \ell) where p is a point of I and \ell is a line through p, we get |I|(t + 1) \leq m. Therefore, the independence number is always at most m/(t + 1).

If the graph F contains odd cycles (which is the case we will focus on), then one easy way of getting rid of all the copies of F is to convert each clique of size s + 1, which is where all copies of F lie by our assumption, into a bipartite graph on the same set of vertices by destroying some edges. To be more precise, for each line \ell we take two disjoint subsets A_\ell, B_\ell of the points on \ell, with A_\ell \cup B_\ell = \ell and we define a graph on the set of all points where two points x and y are adjacent if (i) they lie in a common line \ell and (ii) x \in A_\ell, y \in B_\ell or x \in B_\ell, y \in A_\ell. This graph is well defined because we have a partial linear space, that is, no two points can lie in two different lines.

While what we do above ensures that there are no copies of F in our collinearity graph, destroying edges would typically increase the independence number of the graph. What we can show though is that there does exist a choice of bipartition on each line so that the independence number cannot increase by much. We will show this by an easy probabilistic argument.

Here is our probability space. For each line \ell, we choose a bipartition (A_\ell, B_\ell) of the set of s+1 points on \ell uniformly at random (from the set of all bipartitions). Then for any line \ell, the probability that a point x of \ell lies in A_\ell (or B_\ell) is equal to 1/2. We do this independently for each line.

Let I be a subset of points, of size k, and for each line \ell let k_\ell be the number of points of I contained in \ell. The probability that I is an independent set in the graph G obtained by removing all edges within A_\ell and B_\ell in the collinearity graph, for each line \ell, is equal to the probability that for every line \ell the set I \cap \ell is either a subset of A_\ell or B_\ell. Since all these are independent events, with an event corresponding to \ell having probability (1/2)^{k_\ell} + (1/2)^{k_\ell} = 2^{1 - k_\ell}, the probability that I is an independent set is equal to \prod_{\ell} 2^{1 - k_\ell} = 2^{m - \sum k_\ell} = 2^{m - (t + 1)k}. The probability that there exists an independent set of size k is then at most \binom{n}{k} 2^{m - (t + 1)k}, by the union bound. Therefore, to show the complementary event that there exists a bipartition of each line for which the graph defined above has no independent sets of size k, it suffices to show that this probability is < 1. By the inequality \binom{n}{k} \leq n^k = 2^{k \log n}, we see that it suffices to show

2^{k \log n + m - (t + 1)k} < 1.

This is true if and only if k > m/(t + 1 - \log n). Thus, for, say k = \lceil m/(t - \log n) \rceil, we get R(F, k) \geq n as there exists an F-free graph on n vertices with independence number < k.

This shows that to get a good lower bound on R(F, k), what we want from our partial linear space of order (s, t) on n points and m lines is \log n << t and n >> m, beyond the condition we started with of all copies of F being contained in the (s + 1)-cliques.

If you are still with me right now, then here is a concrete example. We know that any triangle in the collinearity graph of a generalized quadrangle (or see this) must be contained in the clique corresponding to a line. We also know that a generalized quadrangle (GQ for short) of order (s, t) has n = (s + 1)(st + 1) points and m = (t + 1)(st + 1) lines. For n >> m, we would thus want s >> t and for t >> \log n we would want t to not be a constant. Considering this, the best choice that we have with us here is s = t^2, and t \rightarrow \infty, as we know that in any GQ one always has s \leq t^2. Such GQ’s are known to exist whenever t is a prime power (see Chapter 5 of my lecture notes). From them we get the lower bound R(3, k) \geq (1 + o(1))k^{4/3}, which is not that great considering we know R(3, k) = \Theta(k^2/\log k). But there is some good news, as Mubayi and Verstraete have shown. If we consider generalized hexagons of order (q^3, q), then we get

R(C_5, k) \geq (1 + o(1))k^{11/8},

and from generalized octagons of order (q^2, q) we get

R(C_7, k) \geq (1 + o(1)) k^{11/9}.

Both of these bounds are better than what one would get from purely probabilistic arguments.

I hope that by using other finite geometries, or by improving the argument above, we would be able to improve bounds on other Ramsey numbers as well.

Posted in Combinatorics, Extremal Combinatorics, Finite Geometry, Incidence Geometry, Ramsey Theory | Tagged , , , , , , , , | Leave a comment

Kopparty’s graph

Alon’s construction of optimal pseudorandom graphs from 1994 is useful for obtaining several interesting combinatorial results in various areas of mathematics, some of which are highlighted in this survey of Noga from last year (also see my previous post). In this post I would like to discuss an alternate construction, due to Swastik Kopparty, that has the same properties and thus can be used instead to prove all of these interesting results. It seems like this construction is not so well known (I learned about it from a conversation with Noga), even though it is a bit simpler to describe and in some sense more general. The construction appears in Kopparty’s lecture notes on Cayley Graphs.

What we are looking for is an infinite sequence of triangle-free d-regular graphs on n vertices with d = \Theta(n^{2/3}) and the second largest eigenvalue equal to \Theta(n^{1/3}).

Let p \neq 3 be a prime (it will become clear later why we are excluding 3), and let \mathbb{F}_{q} be a finite field of characteristic p, that is q = p^h for some integer h. For a subset S of the vector space V = \mathbb{F}_q^3 we define the Cayley graph G = Cay(V, S) as the graph on V with u \sim v if u - v \in S. This graph is undirected if for every s \in S we have -s \in S and loopless if 0 \not \in S, both of which we will assume. Kopparty’s construction, much like Alon’s construction, is a Cayley graph of this form for some suitably chosen S. The property of S that ensures that G is triangle-free is that s_1 + s_2 + s_3 \neq 0 for any s_1, s_2, s_3 \in S (not necessarily distinct). This explains why we assume p \neq 3, as otherwise s + s + s = 0 for every s \in S. Consider the following set:

S = \{(xy, xy^2, xy^3) : x \in T, y \in \mathbb{F}_q^{\times}\}

where T is a subset of \mathbb{F}_q with the property that no three elements of T sum to 0, that is, the same property that we wanted from S. Any such T implies triangle-freeness of G: if x_1(y_1, y_1^2, y_1^3) + x_2(y_2, y_2^2, y_2^3) + x_3(y_3, y_3^2, y_3^3) = 0, then either y_1 = y_2 = y_3 and we have a contradiction because of the property of T or we get a non-trivial linear combination of the rows of a Vandermonde matrix equal to 0, which is also a contradiction (note that 0 \not \in T and y_i \neq 0 for all i).

The specific choice of T will soon become clear, but for now let us focus on the degree and eigenvalues of G. The degree is equal to |S| = |T|(q - 1), while the number of vertices is q^3 (thus we would later require |T| = \Theta(q)). The eigenvalues of Cayley graphs, at least for the case of finite abelian groups, are easy to describe. I recommend chapter 5 of the lecture notes of Luca Trevisan for the basic theory. For a = (a_1, a_2, a_3) \in V, the eigenvalue \lambda_a corresponding to the eigenvector \chi_a : V \rightarrow \mathbb{C} defined by \chi_a(x_1, x_2, x_3) = \omega^{\mathrm{Tr}(a_1 x_1 + a_2 x_2 + a_3 x_3)} satisfies \lambda = \sum_{x \in S} \chi_a(x), where \mathrm{Tr} is the absolute trace function defined by \mathrm{Tr}(\alpha) = \alpha + \alpha^p + \cdots + \alpha^{p^{h - 1}}, and \omega = e^{2 \pi i/p} is a primitive p-th root of unity.

The eigenvalue corresponding to (0, 0, 0) is d = |S|, which is the largest eigenvalue. So let a = (a_1, a_2, a_3) \in V \setminus \{0\}. We have

\lambda_a =  \sum_{y \in \mathbb{F}_q^\times} \sum_{x \in T} \omega^{\mathrm{Tr}(x f(y))},

where f(y) = a_1 y + a_2y^2 + a_3 y^3. We now make the choice of T = \{u \in \mathbb{F}_q : \mathrm{Tr}(u) \in \{-1, 1\}\}. Note that |T| = 2q/p, unless p = 2 when $|T| = q/p$, T = -T and no three elements of T can sum to 0, as p \neq 3. From the following lemma, whose proof is a fun exercise in algebra, it follows that several of the terms in the expression for \lambda_a are 0.

Lemma. Let z \in \mathbb{F}_q \setminus \mathbb{F}_p. Then for every k \in \mathbb{F}_p we have \sum_{x \in \mathrm{Tr}^{-1}(k)} \omega^{\mathrm{Tr}(x z)} = 0.

Thanks to this Lemma we can write

\lambda_a = \sum_{y \in f^{-1}(\mathbb{F}_p) \setminus \{0\}} \sum_{x \in T} \omega^{\mathrm{Tr}(xf(y))}.

For any fixed k \in \mathbb{F}_p, the number of y for which f(y) = a_1 y + a_2 y^2 + a_3 y^3 = k, is at most 3 since it is an equation of degree at most 3 (and at least $1$) in one variable. Therefore, the number of summands is at most 3p|T|, which implies that |\lambda_a| \leq 3p|T|.

To summarise, we have a triangle-free graph on q^3 vertices which is d-regular with d = 2q(q - 1)/p and has all eigenvalues other than d at most 6 q in absolute value. For fixed p and h \rightarrow \infty, we get the infinite family that we were looking for.

Remark 1: Other choices of T will work as well, for example T = \{u \in \mathbb{F}_q : p/3 < \mathrm{Tr}(u) < 2p/3\}.

Remark 2: This can be generalised to an odd-cycle free construction, just like Alon’s graph can be generalised. Any generalisation to even-cycle free constructions, or to K_s-free, with s \geq 4, would be very interesting!

Posted in Combinatorics, Extremal Combinatorics, Spectral Graph Theory | Tagged , , , , , | 3 Comments

Ramsey numbers from pseudorandom graphs

One of the foundational results in modern combinatorics is Ramsey’s theorem which states that for every positive integers s, t there exists a constant n_0 such that for all n \geq n_0, every 2-coloring of edges of the complete graph K_n has a monochromatic copy of either K_s or K_t. The smallest n_0 for which the statement holds true is then known as the Ramsey number R(s, t); that is, for all n \geq R(s, t) the statement above is true and there exists a 2-coloring of the edges of the complete graph on R(s, t) - 1 vertices that avoids monochromatic K_s and K_t. We know the exact values of R(s, t) only for some small values of s and t. A famous example is R(3,3) = 6, while despite decades of research we only known that 43 \leq R(5,5) \leq 48.

Determining the Ramsey numbers for larger values has been widely open since 1940’s. It has become one of the central problems of mathematics to determine the asymptotics of R(t, t) with t \rightarrow \infty and R(s, t) with s fixed and t \rightarrow \infty. An easy upper bound follows from a nice inductive prove due to Erdős and Szekeres (1947), and it shows that R(s, t) \leq \binom{s + t - 2}{s - 1}. In particular, this implies R(t,t) = O(4^t/\sqrt{t}) and R(s, t) = O(t^{s - 1}) (with the constant depending on s). The best upper bounds that we know are not that far away from these. David Conlon proved in 2016 that there exists a constant c for which

R(t, t) \leq \frac{4^t}{t^{-c \log t/ \log \log t}} .

Ajtai, Komlós and Szemerédi proved in 1980 that for every fixed s there exists a constant c_s such that

R(s, t) \leq c_s \frac{t^{s - 1}}{(\log t)^{s - 2}}

For lower bounds, Erdős proved using a probabilistic argument that

R(t, t) \geq (1 - o(1)) \frac{t}{\sqrt{2} e} \sqrt{2}^t

and the only improvement (since 1947) in that lower bound has been by a factor of 2.

In case of R(s, t), with s fixed, we know the following lower bound by Bohman and Keevash from 2010 that is again proved probabilistically by carefully analysing a natural random K_s-free process:

R(s, t) \geq c'_s \frac{t^{(s + 1)/2}}{(\log t)^{(s + 1)/2 - 1/(s - 2)}}.

Ignoring the log factors in the bottom, we have

c'_s t^{(s + 1)/2 - o(1)} \leq R(s, t) \leq c_s t^{s - 1 - o(1)} .

Any improvement in the powers of t here would be ground-breaking. What I want to talk about in this post is a recent work of Dhruv Mubayi and Jacques Verstraete that offers us a new way of possibly improving the lower bound. Briefly, they show that if the main conjecture on pseudorandom graphs that I mentioned in an earlier post is true, then R(s, t) \geq c''_s t^{s - 1 - o(1)} for some constant c''_s. Therefore, the existence of the optimal K_s-free pseudorandom graphs would give us the correct polynomial in R(s, t).

Here is how their proof goes. Firstly, observe that by taking one of the colour classes to define the edges of a graph, we need to construct an n vertex graph G that has no K_s as subgraphs and has independence number < t to show R(s, t) \geq n + 1. The main result that they prove is as follows:

Theorem 1. Let If there exists a K_s-free (n, d, \lambda)-graph, and n is large enough, then for t = \lceil 2n \log^2 n/d \rceil, we have

R(s, t) \geq c_s \frac{n}{\lambda} \log^2 n .

for some constant c_s.

The proof of Theorem 1 is not too hard if you are comfortable with the probabilistic method and you know that you should use Theorem 2.1 from Alon-Rödl. It can be summarised as: “pick each vertex uniformly at random with probability \log^2 n/(2e^2 \lambda) and in the induced subgraph remove one vertex from each independent set of size t”.

If the conjecture regarding K_s-free pseudorandom graphs is true, then for every integer s \geq 3 there exists an infinite sequence of (n, d, \lambda)-graphs with d = \Omega(n^{1-1/(2s - 3)}) and \lambda = O(\sqrt{d}). Consider such a graph G for n large enough. From t = \lceil 2n \log^2 n/d \rceil and n = O(d^{(2s - 3)/(2s - 4)}) we get d = \Omega(t^{2s - 4}/(\log t)^{2(2s - 4)}). Since \lambda = O(\sqrt{d}), from Theorem 1 we have R(s, t) \geq c_s \frac{n}{\lambda} \log^2n = c_s t \frac{d}{2 \lambda} = \Omega(t \sqrt{d}), and from the previous expression for d in terms of t we get

R(s, t) = \Omega \left(\frac{t^{s - 1}}{\log^{2s - 4} t} \right).

So, all we need to do is construct these K_s-free pseudorandom graphs. The best construction that we have so far (discussed here) has edge density \Theta(n^{-1/(s - 1))}. Sadly, the lower bound that this construction gives us on R(s, t) is c_s t^{s/2}/(\log t)^{s - 2}, which is worse than the probabilistic bound of Bohman and Keevash. In general, if we can construction K_s-free optimally pseudorandom graphs with density n^{-1/f(s)}, we will get the bound

R(s, t) \geq c_s \frac{t^{(f(s) + 1)/2}}{\log^{f(s) - 1} t}

Therefore, even a construction with density n^{-1/(s + 1)} (or in fact n^{-1/(s+\epsilon)}) would already break the Bohman-Keevash lower bound! That should be doable, right? I leave it as an open problem to the readers.

UPDATE 15th Oct 2019: Xiaoyu He has shown that the existence of these optimal K_s-free graphs for all s would also determine the multicolour Ramsey numbers R(s_1, \dots, s_k, t) with s_i fixed for all i and t \rightarrow \infty, up-to a poly-logarithmic factor.

Posted in Combinatorics, Extremal Combinatorics | Tagged , , , , , , | 2 Comments

Spectral proofs of theorems on the boolean hypercube

In a recent breakthrough Hao Huang proved the sensitivity conjecture, that had remained open for past 30 years despite some serious effort from various computer scientists and mathematicians. The proof can be described in a single tweet (if you have the basic background), and it has appeared on the ELI5 page of reddit. The main ideas behind the proof are from spectral graph theory, and one of the starting points (see for example this comment) is the following bound on the independence number of a graph due to Cvetković:

Lemma 1 (Inertia Bound). Let G be a graph on N vertices and let \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_N be its eigenvalues. Let N^+ be the number of non-negative eigenvalues of G, and N^- the number of non-positive eigenvalues (counted with multiplicity). Then \alpha(G) \leq \min \{N^+, N^-\}.

The proof of this Lemma is a nice application of a classical result in mathematics known as Cauchy’s interlacing theorem (I have talked about eigenvalue interlacing earlier in this blog), and I leave it to the reader to fill in the details*. This spectral upper bound on the independence number of a graph is relatively less known than Hoffman/Delsarte bound which says \alpha(G) \leq (-\lambda_N/(\lambda_1 - \lambda_N)) N. Since many combinatorial problems can be rephrased as finding a nice upper bound on the independence number of some graph, such spectral results can be quite useful, when one has enough structure to say something about the eigenvalues. Until recently, I didn’t see any applications of the inertia bound in extremal combinatorics, whereas Hoffman’s bound is widely used; for example, in proving Erdős-Ko-Rado theorems. Interestingly, the classical Erdős-Ko-Rado theorem can also be proved using the Lemma 1 (as I learned from Hao Huang’s talk at our combinatorics seminar).

One interesting observation made by Huang regarding Lemma 1 (which was already known) is that instead of using the eigenvalues of the adjacency matrix of the graph, we could use any pseudo-adjacency matrix and still arrive at the same conclusion. A pseudo-adjacency matrix of a graph G on N vertices is any N \times N real symmetrix matrix A with the property that A_{ij} = 0 whenever the i-th vertex is non-adjacent to the j-th vertex (in an adjacency matrix we also require A_{ij} = 1 for every adjacent pair). If N^+, N^- are the number non-negative/positive eigenvalues of A, then from the proof of Lemma 1 using interlacing we again get \alpha(G) \leq \min \{N^+, N^-\}.

The flexibility of using the pseudo-adjacency matrix is crucial to Huang’s breakthrough result. In fact, before the proof of the sensitivity conjecture, Huang (in collaboration with Oleksiy Klurman and Cosmin Pohoata) used the same method to give a spectral proof of another theorem on the hypercube, known as Kleitman’s theorem, which is what I want to discuss in this blog post. In particular, I want to give a slightly simplified proof due to Aditya Potukuchi and Amey Bhangale, that my friend Aditya shared with me a few days after Huang’s sensitivity paper appeared on arXiv as an application of his methods, without realising that a spectral proof of this theorem had already appeared in an earlier paper of Huang. Here is the theorem that we will prove:

Theorem 2 (Kleitman). Let S be a set of vectors in \{0, 1\}^n such that the hamming distance between any two vectors in S is at most d. Then

\displaystyle |S| \leq \begin{cases} \sum_{i = 0}^t \binom{n}{i}, \text{ if } d = 2t \\ 2(\sum_{i = 0}^t \binom{n - 1}{i}), \text{ if } d = 2t + 1\end{cases}.

(This can be seen as the discrete analog of the isodiametric theorems in real geometry where one wants to find the largest volume of a set of points with a given diameter.)

To use Lemma 1 above, we will naturally consider the graph G on Q_n = \{0, 1\}^n where two vectors are adjacent if the hamming distance between them is at least d + 1, because then any independent set in this graph corresponds to such a set S in the theorem. What are the eigenvalues of the graph G?

The boolean hypercube, Q_n, and the graphs defined on it using hamming distance are well studied objects. One way of computing the eigenvalues of G is to look at it as the Cayley graph Cay(\mathbb{Z}_2^n, \mathbb{Z}_2^n \setminus B_d^n), where B_d^n is the Hamming ball of radius d, that is, all vectors whose Hamming weight is at most d. From the basic theory of Cayley graphs, it then follows that G has n + 1 distinct eigenvalues \lambda_0, \lambda_1, \dots, \lambda_n with

\displaystyle \lambda_i = \sum_{k = d + 1}^n \mathcal{K}_k^n(i)

appearing with multiplicity \binom{n}{i}, where \mathcal{K}_k^n(i) is the Kravchuk polynomial evaluated at i, defined as

\displaystyle \mathcal{K}_k^n(x) = \sum_{j = 0}^k (-1)^j \binom{x}{j} \binom{n - x}{k - j}.

Note that \mathcal{K}_k^n(x) is a polynomial of degree k. For ease of notation, we will skip the superscript from now on and just write \mathcal{K}_k instead. These polynomials have several nice properties, and we will use the following symmetry (or should it be called anti-symmetry?):

\displaystyle \mathcal{K}_k(i) = (-1)^i \mathcal{K}_{n - k}(i)

which lets us see the evaluation of a degree k polynomial at i as the evaluation of a degree n - k polynomial at i, up-to a factor of (-1)^i. This fact will come in handy later.

Now if you apply Lemma 1 to G using these eigenvalues, you will realise (after some calculation) that it does not give the required bound (what bounds do you get?). That’s where pseudo-adjacency matrices come into picture. We will define a weight function w:[n] \rightarrow \mathbb{R}, such that every vector with i non-zero values is given the weight w(i), and w(i) = 0 for all i \leq d. From this weight function we will get the pseudo-adjacency matrix A of G where A_{ij} = w(k) if the Hamming distance between the i-th and the j-th vectors is equal to k. Again, from the basic theory of Cayley graphs (in particular, calculating Fourier coefficients), or more directly (Lemma 2.3 here), we get that A has n + 1 distinct eigenvalues,

\displaystyle \lambda_i = \sum_{k = d + 1}^n w(k) \mathcal{K}_k(i)

with multiplicity \binom{n}{i}, with i = 0, 1, \dots, n. Now, we just have to choose the function w cleverly so that either N^+, or N^-, is at most what we need it to be. Here is when Huang, Klurman and Pohoata make the clever choice of w(k) = \binom{\lfloor (k - 1)/2 \rfloor}{t}, for the case of d = 2t. For the odd case, d = 2t + 1, there is a different choice, but from now on we will only stick to the even case of Theorem 2. From this choice, they get the following three properties,

  • (-1)^i \lambda_i > 0, for i = 0, \dots, t
  • \lambda_{n - i} = \lambda_{i + 1}, for i = 0, \dots, t - 1
  • \lambda_{t + 1} = \lambda_{t + 2} = \dots = \lambda_{n - t} = (-1)^{t + 1},

from which the result then follows by an easy analysis. The way they prove these properties is using generating functionology, and it is not immediately clear how the choice of w(k) came about. What Aditya and Amey do instead, is start with the properties that one wants from the eigenvalues, and then show the existence of the function w, without ever computing it explicitly!

Here is how this is done. Observe that using the symmetric property of Kravchuk polynomials, we can write

\displaystyle \lambda_i = \sum_{k = 2t + 1}^n w(k) \mathcal{K}_k(i) = (-1)^i \sum_{k = 2t + 1}^n w(k) \mathcal{K}_{n - k}(i)

In the second formulation, we can see \lambda_i as (-1)^i multiplied with a linear combination of polynomials of degrees n - 2t - 1, n - 2t - 2, \dots, 1, 0, evaluated at i. These polynomials are in fact linearly independent, and hence any polynomial f(x) of degree at most n - 2t - 1 can be written as a linear combination of them. Therefore, we can choose a nice f(x), for which the eigenvalues will be \lambda_i = (-1)^i f(i), and then get the weights w(k)‘s automatically.

Pick f(x) as the unique monic polynomial whose roots are t + i + 0.5, for i = 1 to n - 2t - 1. WLOG, assume that f(0) > 0 (if not, then we can take -f). Since f is a polynomial, the sign of its values oscillates between the roots. In particular, this shows that \lambda_i = (-1)^i f(i) has the same sign for i = t + 1, \dots, n - t, that is, \lambda_i > 0 if t is odd (since f(0) and f(t+1) have the same sign), and \lambda_i < 0 if t is even, for all of these i‘s. In the former case, we can carefully count the number of negative eigenvalues with multiplicities, and in the latter case we can count the number of positive eigenvalues with multiplicities, giving us Theorem 2.

Note that as far as the signs of \lambda_i‘s are concerned, this trick of picking the polynomial f explicitly, and then getting the weights, gives us the same three properties as we had earlier where Huang, Klurman and Pohoata made the explicit choice of the weight function. For the case of Theorem 2 when d = 2t + 1, we have to do something different, and I leave it to the reader to figure that case out.

For some open questions that can potentially be solved using these methods, check out Section 5 of the Huang-Klurman-Pohoata paper.

(*) Fedor Petrov has given a “low-level” proof that avoids using the interlacing theorem (congrats to Fedya for his new blog!).

For discussions on Huang’s proof of the sensitivity conjecture, see the following blog posts: Terry Tao, Gil Kalai, Scott Aaronson, Ken Reagan.

Posted in Combinatorics, Extremal Combinatorics, Spectral Graph Theory | Tagged , , , , , , , | 5 Comments

Pseudorandom clique-free graphs

Pseudorandom graphs are graphs that in some way behaves like a random graph with the same edge density. One way in which this happens is as follows. In the random graph G(n, p), with p = p(n) \leq 0.99, a direct application of Chernoff bound implies that the probability of the following event approaches 1 as n approaches infinity:

|e(S, T) - p|S||T|| = O(\sqrt{pn |S||T|})

where S,T are arbitrary subsets of vertices and e(S,T) denotes the number of edges with one end vertex in S and the other one in T.  Note that p|S||T| is the expected number of edges between S and T in this model, and \sqrt{p(1 - p)|S||T|} is the standard deviation. Now let G be a d-regular graph on n-vertices and let \lambda be the second largest eigenvalue of G in absolute value (these are referred to as (n, d, \lambda)-graphs). Then the following is true for any two subsets S, T of vertices:

|e(S,T) - (d/n) |S||T|| \leq \lambda \sqrt{|S||T|}.

where \lambda is the second largest eigenvalue of the adjacency matrix of the graph, in absolute value. Therefore, if \lambda is small, and in particular close to being O(\sqrt{d}), then the graph mimics the behaviour of G(n, d/n). In fact, for any (n, d, \lambda)-graph, with d < n/2, one can show by looking at the square of the adjacency matrix that \lambda = \Omega(\sqrt{d}). The graphs where \lambda = \Theta(\sqrt{d}) are known as optimally pseudorandom.

Pseudorandom graphs have found several applications over the last few decades, and there are many interesting questions about them (see the survey of Krivelevich and Sudakov). In a 1994 paper, Noga Alon constructed a family of optimally pseudorandom triangle free graphs on n-vertices, with d/n = \Omega(n^{-1/3}), that he then used to give explicit bounds on some Ramsey numbers, and to show that the maximum possible Euclidean norm of n unit vectors in \mathbb{R}^n with the property that among any three of them two are orthogonal, is equal to \Theta(n^{2/3}). More applications of this construction can be found in the recent survey paper of Noga based on a talk he gave at the conference celebrating the 70th birthday of László Lovász (which is where I learned about these graphs, and the main topic of this post).

Alon’s construction is in fact optimal in the sense that there existence a constant C > 0, such that any optimally pseudorandom graph on n vertices with d/n > Cn^{-1/3} must contain a triangle. This can be generalised to cliques of size k, where we have a constant C > 0 such that any optimally pseudorandom (n, d, \lambda)-graph with d/n > C n^{-1/(2k - 3)} must contain K_k. The proof of this follows from greedily picking common neighbours, using the following lemma (that can be proved using the edge distribution proposition above):

If S is a set of vertices with |S| \geq 2n\lambda/d, then there are at least d|S|^2/4n edges with both end points in S.

The natural question now is to give matching constructions for this bound for all k > 3, or improve the bound. This question has been asked by several people, as it arises naturally in many situations, but it has been open for every k > 3 since the past 20 years or so. David Conlon has called it “one of the outstanding open problems about pseudorandom graph” (also check out this video).

The best known construction so far for larger values of k was the construction of Alon and Krivelevich that gives us optimally pseudorandom graphs with edge density d/n = \Theta(n^{-1/(k - 2)}. Note that this starts giving a better construction than Alon’s triangle free graphs at k = 6. In my recent work with Ferdinand Ihringer and Valentina Pepe, we have been able to provide a better construction, that gives a family of graphs with edge density d/n = \Theta(n^{-1/(k - 1)}). The construction is fairly easy to describe, and for a proof you can refer to the paper:

Let Q(x_1, \dots, x_k) = x_1^2 + \xi x_2^2 + \sum_{i = 3}^k x_k^2 be a quadratic form over \mathbb{F}_q where \xi is a non-square in \mathbb{F}_q (we assume q to be odd). Define a graph with vertices as the 1-dimensional subspaces x of \mathbb{F}_q^k, for which Q(x) is a square, and making two vertices x and y adjacent if they are orthogonal to each other, that is,
\frac{1}{2}(Q(x + y) - Q(x) - Q(y)) = x_1 y_1 + \xi x_2y_2 + \sum_{i =3}^n x_i y_i = 0.

Then this graph is a K_k-free (n, d, \lambda)-graph with n = (1 + o(1))q^{k-1}/2, d = (1 + o(1))q^{k - 2}/2 and \lambda = \Theta(q^{(k - 2)/2}).


While we now have slightly better constructions, we are still far away from the conjectured bound. I am hopeful though that finite geometry, and especially the geometries associated with quadratic forms (known as polar spaces), can play an important role in obtaining even better constructions. In fact, there is a construction due to Kopparty  for triangle-free graphs matching the parameters of Alon’s construction that essentially comes from a generalized quadrangle (which is a polar space), if you look at it carefully (see my earlier post for hints on that). Moreover, Conlon was able to give an (almost) optimal probabilistic construction which also uses generalized quadrangles. This a strong indication that polar spaces can be useful in general. As a first step, we should perhaps try to obtain optimally pseudorandom K_4-free graphs that have higher edge density than n^{-1/3}.

Edit 04/09/2019: An exciting new result by Mubyai and Verstraete shows that if the main conjecture of this post is true, that is, there exists K_k-free pseudorandom graphs of density n^{-1/(2k - 3)}, then this would asymptotically determine off-diagonal Ramsey numbers R(k, t), with k fixed and t \rightarrow \infty. In fact, even an improvement of the denominator from k - 1 (which is in our construction) to k + 1 would be a big development as it’ll improve the current lower bounds on Ramsey numbers. See the first half of this talk of Mubayi.

Posted in Combinatorics, Extremal Combinatorics, Spectral Graph Theory | Tagged , , | 3 Comments

Ryser’s conjecture

I am on a research visit in Rome, working with Valentina Pepe, and our joint paper on Ryser’s conjecture is on arXiv now. So this seems like the right time to talk about the conjecture and the problems related to it that I have been obsessing over for past few months.

In graph theory, the classical result of Kőnig says that in any (finite) bipartite graph the minimum number of vertices that cover all edges (the so-called vertex cover number) is equal to the maximum number of pairwise disjoint edges in the graph (matching number). This result is clearly not-true for non-bipartite graphs (think of some obvious counter examples), where the best one can do is have the vertex cover number equal to twice the matching number since the vertices contained in any maximum matching must cover all the edges. Ryser’s conjecture is a proposed generalisation of this result to r-partite hypergraphs, that is, for hypergraphs whose vertex set can be partitioned into r parts (let’s call them sides from now on), such that each edge of the hypergraph contains a unique vertex from each of the sides. In particular, every r-partite hypergraph is r-uniform, that is, each edge has exactly r vertices in it. If we have any r-uniform hypergraph \mathcal{H} (not necessarily r-partite), then the following follows as before, \tau(\mathcal{H}) \leq r \nu(\mathcal{H}), where \tau denotes the vertex cover number and \nu the matching number. Ryser’s conjectured that if \mathcal{H} is r-partite, then we must have \tau(\mathcal{H}) \leq (r - 1) \nu(\mathcal{H}). This conjecture first appeared in the Ph.D. thesis of Ryser’s student, J.R. Henderson, and it has often been misattributed to a 1967 paper of Ryser (see this).

Despite the time span of about 50 years, our current knowledge about this conjecture is abysmal. The only other case of this conjecture which is known to be true in general is r = 3, which was proved by Aharoni using topological methods! If we impose some further restrictions on the hypergraph, then we know a bit more, but not much. The conjecture is true for intersecting hypergraphs (matching number equal to 1) if r \leq 5, as proved by Tuza, and for r \leq 9 if the hypergraph is also linear, as proved in the recent paper by Francetić, Herke, McKay and Wanless.

In view of our inability to prove this conjecture, some natural questions to ask are, (1) is it even true?, and (2) why is it so hard to prove?. While it’s quite possible that the conjecture is false, let’s focus on (2) for now. Sometimes what makes an extremal problem in combinatorics hard to prove is that there are many different kinds of extremal examples, and a “combinatorial” proof must somehow consider all of these examples (I should thank Tibor Szabó for this intuition). So let’s see what we know about hypergraphs meeting the bound in Ryser’s conjecture.

Until recently, the only known family of r-partite hypergraphs with vertex cover number equal to r - 1 times the matching number, called r-Ryser hypergraphs, came from finite projective planes of order r - 1 (which is what sparked my interest in this problem), and hence they were known to exist whenever r - 1 is equal to a prime power (2, 3, 4, 5, 6, 8, 9, 10, \dots). Once you know the definition of projective planes, the construction is easy: remove a point and all lines through it. These hypergraphs are hence known as truncated projective planes. Note that this gives us intersecting r-Ryser hypergraphs, and to get such hypergraphs with matching number \nu one can simply take \nu disjoint copies of the intersecting hypergraphs (more on this later!). Inspired by the lack of examples, several people gave constructions for small values of r where projective planes did not exist (or were not known to exist), and then Abu-Khazneh, Barát, Pokrovskiy and Szabó, came up with a clever construction which gave a new infinite family of intersecting r-Ryser hypergraphs whenever r - 2 is a prime power. In fact, they were able to construct many non-isomorphic examples of such hypergraphs! Note that while it is known that there are plenty of non-isomorphic projective planes of a given order, it is not clear what the rate of growth of this function is, and that’s a fascinating problem on its own. Another interesting family of r-Ryser hypergraphs was obtained by Haxell and Scott, whenever both (r - 1)/2 and (r + 1)/2 are prime powers (whether this gives infinitely many new values of r for which we have a Ryser hypergraph or not is in fact related to a nice open problem in number theory, which a careful reader should be able to deduce :)). Both of these constructions rely on finite projective or affine planes.

Valentina and I have constructed new infinite families of non-intersecting r-Ryser hypergraphs, whenever r - 1 is a prime power bigger than 3, which looks fundamentally different from just taking disjoint copies of intersecting Ryser hypergraphs. The condition on r being at least 4 cannot be relaxed since a result of Haxell, Narins and Szabo, says that every 3-Ryser hypergraph is essentially obtained by taking disjoint copies of intersecting 3-Ryser hypergraphs. For r = 4 it was already known that such a characterisation cannot be true, because of a computer-generated example of Abu-Khazneh. Our constructions show that for any r \geq 4 and \nu > 1, such that r - 1 is a prime power, there exists an r-Ryser hypergraph with matching number equal to \nu which does not contain two vertex disjoint copies of intersecting r-Ryser hypergraphs.

The constructions are again finite geometric, and we were quite happy about the fact that many non-trivial results on blocking sites of finite projective planes came into play when proving that these hypergraphs are Ryser extremal. Here is a description of the first family, with \nu = 2, along with a picture:


Let \pi_1 and \pi_2 be two copies of classical projective planes of order q with a common point P, which will be truncated at the the points Q_1 and Q_2. Let \mathcal{C} be a conic in the second plane passing through q, and v an extra vertex. The edges of the hypergraph are (1) all lines in the first plane not through Q_1 or P, and a line \ell through P which does not contain Q_1; (2) all the lines of the second projective plane not through Q_2 that contain a point of the conic \mathcal{C}; (3) two new (weird) edges e_1 and e_2 with e_1 = \ell \setminus \{P\} \cup \{R\} and e_2 = C \setminus \{Q_2\} \cup \{v\}.

The fact that this is a (q + 1)-Ryser hypergraph of matching number 2 with the required properties, whenever q is an odd prime, is proved in the paper. For other values of q, there is a second, more involved, construction (which also comes with a picture!).

Our new constructions show that the non-intersecting Ryser hypergraphs can have a richer structure, and perhaps it’ll be useful to construct more such hypergraphs for either disproving the conjecture or understanding the extremal cases better. Some other interesting questions, that have also been mentioned before by others, are as follows:

Open problem 1: Find the minimum number of edges an r-Ryser hypergraph can have. It is not known, but conjectured, that linearly many edges should suffice.

Open problem 2: What is the largest vertex cover number of an r-partite intersecting hypergraph, if the “trivial” covers containing a side or an edge are not allowed?

This seems to be related to the problem of finding the smallest non-trivial blocking set in finite projective planes.

Open problem 3: Prove, or disprove!, Ryser’s conjecture.

Posted in Combinatorics, Extremal Combinatorics, Finite Geometry, Incidence Geometry | Tagged , , , , , , , , | 2 Comments