## Bilinear forms and diagonal Ramsey numbers

The recent breakthrough of Conlon and Ferber has shown us that algebraic methods can be combined with probabilistic methods to improve the bounds on multicolour diagonal Ramsey numbers. This was already shown for off-diagonal Ramsey numbers by Mubayi and Verstraete last year, as discussed here. Sadly, for two colours we still don't have any improvement over the classic probabilistic bound proved by Erdős in 1947 (except for some constant factors in front of the exponential function). More concretely, we haven't been able to prove $R(t, t) > c^t$ for any $c > \sqrt{2}$ since 1947, despite considerable effort by various mathematicians. It might therefore sound reasonable to believe that $R(t, t)$ is equal to $2^{t/2 + o(t)}$, but the exponential improvement over the (purely) probabilistic lower bounds on multicolour Ramsey numbers should cast some doubt on that.

In an end remark Conlon and Ferber mention that the lower bound of $2^{t/2 + o(t)}$ can be proved in a new way, using the arguments of their paper, where we have a mix of algebra and probability. This sounds like a very promising approach for further improvements as well. In this post I want to explore that, and attempt to explain why we will need some really new ideas to improve the lower bound. In the process, we will see some basic theory of bilinear forms, which is crucial to the arguments. For a more in depth exploration of the topic, see these notes of Keith Conrad and Chapter 3 of the book by Simeon Ball.

Given a vector space $V$ over a field $F$, a bilinear form is a function $b: V \times V \rightarrow F$ such that for any fixed $u, v \in V$, both $y \mapsto b(u, y)$ and $x \mapsto b(x, v)$ are linear functions from $V$ to $F$. This is an abstraction of the dot product, which is so crucial in understanding the geometry of the real space $\mathbb{R}^n$. Inspired by that, we define a vector $u$ to be orthogonal to a vector $v$ if $b(u, v) = 0$. There is a certain lack of symmetry in this definition: $b(u, v) = 0$ doesn't necessarily imply $b(v, u) = 0$, or in other words, we could have $u$ orthogonal to $v$ but $v$ non-orthogonal to $u$. To fix that we call a bilinear form reflexive if $b(u, v) = 0$ implies $b(v, u) = 0$ for all $u, v \in V$. Once we have a reflexive bilinear form, we can talk about orthogonality of two vectors. From now on, $b$ will denote a reflexive bilinear form.

While this orthogonality defined in terms of reflexive bilinear forms might seem like a very general notion, it ends up being restricted to only two cases. A fundamental result, attributed to John von Neumann, states that for any vector space $V$, a reflexive bilinear form $b$ on $V$ is of one of the following two types:

1. Symmetric: $b(u, v) = b(v, u)$ for all $u, v \in V$.
2. Alternating: $b(u, u) = 0$ for all $u$.

For example, the dot product on $\mathbb{R}^n$ defined by $(x_1, \dots, x_n) \cdot (y_1, \dots, y_n) = x_1y_1 + x_2y_2 + \cdots + x_ny_n$ is a symmetric bilinear form, whereas the function $b(x, y) = x_1y_2 - x_2y_1 + x_3y_4 - x_4y_3 + \cdots + x_{n-1} y_n - x_n y_{n - 1}$ (assuming $n$ to be even) is alternating. Both are in fact examples of what we call non-degenerate bilinear forms, i.e., forms with respect to which no non-zero vector is orthogonal to every vector of the vector space.
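Since these examples are so concrete, we can sanity-check them with a few lines of Python (the helper names `dot` and `alt` are mine, and checking on a finite grid of vectors is a spot check rather than a proof):

```python
from itertools import product

n = 4  # a small even dimension, so the alternating example is defined

def dot(x, y):
    # the symmetric example: x_1 y_1 + ... + x_n y_n
    return sum(a * b for a, b in zip(x, y))

def alt(x, y):
    # the alternating example: x_1 y_2 - x_2 y_1 + x_3 y_4 - x_4 y_3 + ...
    return sum(x[i] * y[i + 1] - x[i + 1] * y[i] for i in range(0, n, 2))

vecs = list(product((0, 1, 2), repeat=n))  # a grid of test vectors
assert all(dot(u, v) == dot(v, u) for u in vecs for v in vecs)   # symmetric
assert all(alt(u, u) == 0 for u in vecs)                         # alternating
assert all(alt(u, v) == -alt(v, u) for u in vecs for v in vecs)  # skew-symmetric
```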

One interesting property of alternating forms is that they are skew-symmetric. Indeed, expanding $b(u + v, u + v) = 0$ gives $b(u, v) + b(v, u) = 0$, and thus $b(u, v) = -b(v, u)$ for all $u, v$. If the characteristic of the underlying field $F$ is $2$, this shows that every alternating form is in fact symmetric. The following lemma gives us a relation in the other direction as well.

Lemma 1: If $b$ is a symmetric bilinear form on a vector space $V$ over a field $F$ of characteristic $2$, then $V=S\oplus W$ where $\dim S \leq 1$ and the restriction of $b$ on $W$ is an alternating form.

For example, if we take $b(x, y) = x_1y_1 + \cdots + x_n y_n$ over $\mathbb{F}_2^n$, then we can take $W = \{u \in \mathbb{F}_2^n : b(u,u) = 0\}$, which is equal to the hyperplane $\{u \in \mathbb{F}_2^n : \sum u_i = 0\}$, since $a^2 = a$ for any $a \in \mathbb{F}_2$. The form restricted to $W$ is alternating by the definition of $W$.
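This example is small enough to verify exhaustively; here is a sketch in Python (the names are mine, and I take $S = \langle e_1 \rangle$ as the one-dimensional complement):

```python
from itertools import product

n = 6  # any small n will do

def b(x, y):
    # the dot product over F_2
    return sum(a * c for a, c in zip(x, y)) % 2

V = list(product((0, 1), repeat=n))
W = [u for u in V if b(u, u) == 0]

# W is exactly the even-weight hyperplane, of dimension n - 1
assert W == [u for u in V if sum(u) % 2 == 0]
assert len(W) == 2 ** (n - 1)

# the restriction of b to W is alternating, and e_1 spans a complement S
e1 = (1,) + (0,) * (n - 1)
assert all(b(u, u) == 0 for u in W)
assert e1 not in W
```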

One of the basic notions associated to a bilinear form is that of an isotropic subspace, which shows up in the construction of Conlon and Ferber as well, as discussed in the previous post. An isotropic subspace is a subspace $S$ such that $b(u, v) = 0$ for any two, not necessarily distinct, $u, v \in S$ (such a subspace is usually called totally isotropic, but we'll ignore that distinction here). For an arbitrary subset $A$ of $V$, if we denote the set $\{u \in V : b(u, v) = 0 \text{ for all } v \in A\}$ by $A^\perp$, then $S$ being isotropic is the same as saying $S \subseteq S^\perp$. Note that $A^\perp$ is always a vector subspace. The following is another fundamental result, whose corollary is what we used in the previous post.

Lemma 2: Let $b$ be a non-degenerate reflexive bilinear form over a finite dimensional vector space $V$. Then for any subspace $S$, we have $\dim S + \dim S^\perp = \dim V$.

Corollary 3: An isotropic subspace with respect to a non-degenerate (reflexive) bilinear form on an $n$-dimensional vector space has dimension at most $n/2$.

The maximum dimension of an isotropic subspace is known as the Witt index of the bilinear form. It can be shown that every non-degenerate alternating form on an $n$-dimensional vector space has Witt index exactly $n/2$ (see Theorem 3.8 in Simeon Ball's book). In particular, there are no non-degenerate alternating forms on odd-dimensional vector spaces.
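For the standard non-degenerate alternating form on $\mathbb{F}_2^4$, the Witt index $n/2 = 2$ can be confirmed by brute force (the helpers below are mine; linear independence over $\mathbb{F}_2$ is checked via subset sums):

```python
from itertools import product, combinations

def b(x, y):
    # standard non-degenerate alternating form on F_2^4 (over F_2, minus is plus)
    return (x[0] * y[1] + x[1] * y[0] + x[2] * y[3] + x[3] * y[2]) % 2

vecs = [v for v in product((0, 1), repeat=4) if any(v)]  # nonzero vectors

def independent(vs):
    # over F_2, a set is linearly independent iff no nonempty subset sums to 0
    return not any(
        all(sum(col) % 2 == 0 for col in zip(*sub))
        for r in range(1, len(vs) + 1)
        for sub in combinations(vs, r)
    )

def pairwise_orthogonal(vs):
    return all(b(u, v) == 0 for u, v in combinations(vs, 2))

# some pair spans a 2-dimensional totally isotropic subspace ...
assert any(independent(p) and pairwise_orthogonal(p) for p in combinations(vecs, 2))
# ... but no independent orthogonal triple exists, so the Witt index is exactly 2
assert not any(independent(p) and pairwise_orthogonal(p) for p in combinations(vecs, 3))
```

(Since the form is alternating, every vector is self-orthogonal, so an independent pairwise-orthogonal set spans a totally isotropic subspace.)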

Let us now look at the graph that plays the main role in the construction of Conlon and Ferber. Let $G$ be the orthogonality graph on the set of isotropic vectors in $\mathbb{F}_2^t$ with respect to the bilinear form $b(x, y) = x_1y_1 + \cdots + x_t y_t$, with $t$ even. As we saw earlier, the set $S$ of isotropic vectors forms a $(t-1)$-dimensional vector subspace on which $b$ is an alternating form. Moreover, this alternating form is degenerate, because $(1, \dots, 1)$ is orthogonal to every other vector in $S$ since $t$ is even. As there is no other isotropic vector which is orthogonal to every isotropic vector, the vector space $S$ can be decomposed as $S = \langle (1, \dots, 1) \rangle \oplus \mathbb{F}_2^{t-2}$ such that the form induced on $\mathbb{F}_2^{t-2}$ is a non-degenerate alternating form. Therefore, we should be able to understand this graph starting from the non-degenerate alternating form. Let's do that and start over.
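Before starting over, here is a quick computational check of these claims for a small even $t$ (the code and names are mine):

```python
from itertools import product

t = 6  # even

def b(x, y):
    # the dot product x_1 y_1 + ... + x_t y_t over F_2
    return sum(a * c for a, c in zip(x, y)) % 2

V = list(product((0, 1), repeat=t))
S = [v for v in V if b(v, v) == 0]   # the isotropic vectors
assert len(S) == 2 ** (t - 1)        # a (t-1)-dimensional subspace

ones = (1,) * t
assert ones in S                          # isotropic, since t is even
assert all(b(ones, v) == 0 for v in S)    # orthogonal to all of S: degenerate
# and no other nonzero isotropic vector is orthogonal to all of S
assert all(any(b(u, v) == 1 for v in S)
           for u in S if any(u) and u != ones)
```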

Let $t$ be an even integer, and $b$ a non-degenerate alternating form on $\mathbb{F}_2^{t-2}$. Let $G$ be the orthogonality graph on these vectors, that is, $u \sim v$ if $b(u, v) = 0$. We can notice the following property.

Lemma 4: There are no independent sets of size $t$ in $G$.
Proof. Suppose for contradiction that $x_1, \dots, x_t$ is an independent set, and say $\sum a_i x_i = 0$ is a linear dependence. Applying $b(\cdot, x_j)$ to this sum for each $j$, and using the fact that $b(x_i, x_j) = 1$ for $i \neq j$ (non-adjacency) while $b(x_j, x_j) = 0$ (the form is alternating), we get that $(a_1, \dots, a_t)$ is in the null space of $J - I$ over $\mathbb{F}_2$. But the eigenvalues of $J - I$ are $t - 1$ and $-1$, which are both non-zero modulo $2$ (since $t$ is even), and thus $J - I$ is non-singular. Therefore, $a_i = 0$ for all $i$, and the vectors $x_1, \dots, x_t$ are linearly independent. This is a contradiction, since the dimension of the vector space is strictly less than $t$.
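For small $t$ the lemma can also be verified by brute force; here is a sketch for $t = 6$ (the helper names are mine, and over $\mathbb{F}_2$ non-adjacency means $b(u, v) = 1$):

```python
from itertools import product, combinations

t = 6  # even; the graph lives on the 2^(t-2) vectors of F_2^(t-2)

def b(x, y):
    # a non-degenerate alternating form on F_2^4 (over F_2, minus is plus)
    return (x[0] * y[1] + x[1] * y[0] + x[2] * y[3] + x[3] * y[2]) % 2

vertices = list(product((0, 1), repeat=t - 2))

def independent_in_G(vs):
    # an independent set of the graph: every pair is non-orthogonal
    return all(b(u, v) == 1 for u, v in combinations(vs, 2))

# no independent set of size t, as Lemma 4 predicts ...
assert not any(independent_in_G(vs) for vs in combinations(vertices, t))
# ... although independent sets of size t - 1 do exist in this small case
assert any(independent_in_G(vs) for vs in combinations(vertices, t - 1))
```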

While there can’t be any large independent sets, $G$ certainly has really large cliques. Any (totally) isotropic subspace of dimension $(t - 2)/2$ gives rise to a clique of size $2^{t/2 - 1}$, and in fact these are the largest possible cliques. So, trivially $G$ does not give any good lower bound on the Ramsey number $R(t, t)$. Still, we can apply the basic idea of Conlon and Ferber (which was also the main idea of Mubayi and Verstraete) and take a random induced subgraph of $G$, hoping that it can destroy all large cliques while giving us a graph of reasonable size. (Note that any induced subgraph will also be free of any independent sets of size $t$.) For that, let’s first count the total number of cliques of size $k$ in $G$.

Every clique in $G$ spans an isotropic subspace of $\mathbb{F}_2^{t-2}$, which has a certain rank $r$. The number $N_{k,r}$ of cliques of size $k$ and rank $r$ can be bounded from above by $2^{t-2} \cdot 2^{t - 3} \cdots 2^{t - r - 1} \cdot 2^{(k - r)r}$, where the first $r$ factors correspond to choosing $r$ linearly independent members of the clique (each new member must be orthogonal to the ones chosen so far, which halves the count each time), and the last factor to picking the remaining $k - r$ elements from this $r$-dimensional subspace. By simplifying, we have

$N_{k, r} \leq 2^{(t+ k - 3(r+1)/2)r}$.

This number is increasing in $r$ from $r = 1$ until around $r = (t + k)/3 - 1$, and then decreases till $r = 2(t + k)/3 - 1$, where it finally becomes $1$ (unless $2(t + k)/3 - 1 > t - 1$, which is a trivial case that we'll ignore). But we know that $r \leq t/2 - 1$ from Corollary 3, and hence $\max_{r} N_{k, r} = N_{k, t/2 - 1}$ for $(t + k)/3 - 1 \geq t/2 - 1$, i.e., for $k \geq t/2$, and $\max_{r} N_{k, r} = N_{k, (t + k)/3 - 1}$ for $k \leq t/2$. While we are interested in $N_{t} = \sum_{r = 1}^{t/2 - 1} N_{t,r}$, we will soon need the estimates for smaller cliques as well.
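The algebra above is easy to get wrong, so here is a small integer-arithmetic check (I work with twice the exponent to stay in integers; the function names are mine):

```python
def two_exp(t, k, r):
    # twice the exponent in N_{k,r} <= 2^((t + k - 3(r + 1)/2) r)
    return (2 * (t + k) - 3 * (r + 1)) * r

def two_exp_from_counting(t, k, r):
    # twice [r(t - 2) - r(r - 1)/2 + (k - r) r], straight from the choice count
    return 2 * r * (t - 2) - r * (r - 1) + 2 * (k - r) * r

t, k = 40, 41  # chosen so that 2(t + k)/3 - 1 is an integer
for r in range(1, t // 2):
    assert two_exp(t, k, r) == two_exp_from_counting(t, k, r)

# the exponent vanishes at r = 2(t + k)/3 - 1, where N_{k,r} drops to 1 ...
assert two_exp(t, k, 2 * (t + k) // 3 - 1) == 0
# ... and over the integers it peaks around r = (t + k)/3 - 1
peak = max(range(1, t), key=lambda r: two_exp(t, k, r))
assert abs(peak - ((t + k) // 3 - 1)) <= 1
```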

Repeating the probabilistic argument of Conlon and Ferber, we see that if we sample each vertex independently with probability $p = n2^{-t + O(1)}$, then there exists a subset of vertices of size $n$ inducing no clique of size $t$, as long as we can ensure that $n^{t} 2^{-t^2 + o(t^2)} N_t < 1/2$. Since $N_t \leq \frac{t}{2} \max_{r} N_{t,r} = \frac{t}{2} N_{t, t/2 - 1} \leq 2^{5t^2/8 + o(t^2)}$, we can take $n = 2^{3t/8 + o(t)}$, which gives us the bound $R(t, t) > 2^{3t/8 + o(t)}$. This is worse than the purely probabilistic bound of $2^{t/2}$. Can we do anything better?

Here's an idea extracted from the paper of Conlon and Ferber. We can look at cliques, and independent sets, of size smaller than $t$ in this graph. Perhaps for some $0 < a < 1$, we can get a good lower bound on $R(at, at)$ by taking a random induced subgraph of $G$ that does not contain any cliques or independent sets of size $at$.

For that, let's count the number $I_k$ of independent sets of size $k$ in $G$. For $k$ even, the proof of Lemma 4 tells us that any such set is also a linearly independent set of vectors, and each new vector must be non-orthogonal to all the previously chosen ones, which halves the count each time. Therefore, $I_k$ is at most $\prod_{i = 0}^{k-1} 2^{t - i - 2}$ for $k$ even, and at most $2^{t-2} \prod_{i = 0}^{k-2} 2^{t - i - 2}$ for $k$ odd (apply the even case to the first $k - 1$ vectors and choose the last one freely). In any case, for $k = at$, we have $I_{at} \leq 2^{(2a - a^2)t^2/2 + o(t^2)}$.

Recall that we had $N_{at} \leq 2^{(1+a)^2 t^2/6 + o(t^2)}$ for $a \leq 1/2$ and $N_{at} \leq 2^{(4a+1)t^2/8 + o(t^2)}$ for $a \geq 1/2$. Comparing exponents, the upper bound on $N_{at}$ always dominates the one on $I_{at}$. Thus, the probabilistic argument from before gives us a subset of $n$ vertices on which the induced graph has no cliques of size $at$ and no independent sets of size $at$, as long as $n^{at} 2^{-at^2 + o(t^2)} (N_{at} + I_{at}) < 1/2$, and by the comparison above it suffices to pick an $n$ for which $n^{at} 2^{-at^2 + o(t^2)} N_{at} < 1/2$. We can achieve this by taking $n = 2^{cat + o(t)}$, where $c = (4a - 1)/(8a^2)$ for $a > 1/2$ and $c = (4a - a^2 - 1)/(6a^2)$ for $a \leq 1/2$.
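Before interpreting these constants, we can evaluate them numerically (the helper `c` is my name for the exponent constant):

```python
def c(a):
    # the exponent constant: n = 2^(c a t + o(t)) vertices survive the sampling
    if a >= 0.5:
        return (4 * a - 1) / (8 * a * a)
    return (4 * a - a * a - 1) / (6 * a * a)

grid = [i / 1000 for i in range(200, 1001)]  # a in [0.2, 1]
# both branches stay at or below the probabilistic constant 1/2 ...
assert all(c(a) <= 0.5 + 1e-12 for a in grid)
# ... with equality at a = 1/2, while c(1) recovers the earlier 3/8
assert abs(c(0.5) - 0.5) < 1e-12
assert abs(c(1.0) - 0.375) < 1e-12
```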

Sadly, both of these values of $c$ are at most $1/2$, with equality only at $a = 1/2$. Therefore, we always do worse than the basic probabilistic bound, except in the case $a = 1/2$, where we can show $R(t/2, t/2) > 2^{t/4 + o(t)}$. This asymptotically matches the probabilistic bound, but we arrive at it via a different (and much more complicated) construction. For improvements using this easy method, it looks like we'll have to consider some other graphs. This new approach definitely gives me hope that one can find graphs with better control on the total number of cliques and independent sets, which can then be used to give an exponential improvement in the lower bound on two-colour diagonal Ramsey numbers.