Spectral proofs of theorems on the boolean hypercube

In a recent breakthrough Hao Huang proved the sensitivity conjecture, that had remained open for past 30 years despite some serious effort from various computer scientists and mathematicians. The proof can be described in a single tweet (if you have the basic background), and it has appeared on the ELI5 page of reddit. The main ideas behind the proof are from spectral graph theory, and one of the starting points (see for example this comment) is the following bound on the independence number of a graph due to Cvetković:

Lemma 1 (Inertia Bound). Let G be a graph on N vertices and let \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_N be its eigenvalues. Let N^+ be the number of non-negative eigenvalues of G, and N^- the number of non-positive eigenvalues (counted with multiplicity). Then \alpha(G) \leq \min \{N^+, N^-\}.

The proof of this Lemma is a nice application of a classical result in mathematics known as Cauchy’s interlacing theorem (I have talked about eigenvalue interlacing earlier in this blog), and I leave it to the reader to fill in the details*. This spectral upper bound on the independence number of a graph is relatively less known than Hoffman/Delsarte bound which says \alpha(G) \leq (-\lambda_N/(\lambda_1 - \lambda_N)) N. Since many combinatorial problems can be rephrased as finding a nice upper bound on the independence number of some graph, such spectral results can be quite useful, when one has enough structure to say something about the eigenvalues. Until recently, I didn’t see any applications of the inertia bound in extremal combinatorics, whereas Hoffman’s bound is widely used; for example, in proving Erdős-Ko-Rado theorems. Interestingly, the classical Erdős-Ko-Rado theorem can also be proved using the Lemma 1 (as I learned from Hao Huang’s talk at our combinatorics seminar).

One interesting observation made by Huang regarding Lemma 1 (which was already known) is that instead of using the eigenvalues of the adjacency matrix of the graph, we could use any pseudo-adjacency matrix and still arrive at the same conclusion. A pseudo-adjacency matrix of a graph G on N vertices is any N \times N real symmetrix matrix A with the property that A_{ij} = 0 whenever the i-th vertex is non-adjacent to the j-th vertex (in an adjacency matrix we also require A_{ij} = 1 for every adjacent pair). If N^+, N^- are the number non-negative/positive eigenvalues of A, then from the proof of Lemma 1 using interlacing we again get \alpha(G) \leq \min \{N^+, N^-\}.

The flexibility of using the pseudo-adjacency matrix is crucial to Huang’s breakthrough result. In fact, before the proof of the sensitivity conjecture, Huang (in collaboration with Oleksiy Klurman and Cosmin Pohoata) used the same method to give a spectral proof of another theorem on the hypercube, known as Kleitman’s theorem, which is what I want to discuss in this blog post. In particular, I want to give a slightly simplified proof due to Aditya Potukuchi and Amey Bhangale, that my friend Aditya shared with me a few days after Huang’s sensitivity paper appeared on arXiv as an application of his methods, without realising that a spectral proof of this theorem had already appeared in an earlier paper of Huang. Here is the theorem that we will prove:

Theorem 2 (Kleitman). Let S be a set of vectors in \{0, 1\}^n such that the hamming distance between any two vectors in S is at most d. Then

\displaystyle |S| \leq \begin{cases} \sum_{i = 0}^t \binom{n}{i}, \text{ if } d = 2t \\ 2(\sum_{i = 0}^t \binom{n - 1}{i}), \text{ if } d = 2t + 1\end{cases}.

(This can be seen as the discrete analog of the isodiametric theorems in real geometry where one wants to find the largest volume of a set of points with a given diameter.)

To use Lemma 1 above, we will naturally consider the graph G on Q_n = \{0, 1\}^n where two vectors are adjacent if the hamming distance between them is at least d + 1, because then any independent set in this graph corresponds to such a set S in the theorem. What are the eigenvalues of the graph G?

The boolean hypercube, Q_n, and the graphs defined on it using hamming distance are well studied objects. One way of computing the eigenvalues of G is to look at it as the Cayley graph Cay(\mathbb{Z}_2^n, \mathbb{Z}_2^n \setminus B_d^n), where B_d^n is the Hamming ball of radius d, that is, all vectors whose Hamming weight is at most d. From the basic theory of Cayley graphs, it then follows that G has n + 1 distinct eigenvalues \lambda_0, \lambda_1, \dots, \lambda_n with

\displaystyle \lambda_i = \sum_{k = d + 1}^n \mathcal{K}_k^n(i)

appearing with multiplicity \binom{n}{i}, where \mathcal{K}_k^n(i) is the Kravchuk polynomial evaluated at i, defined as

\displaystyle \mathcal{K}_k^n(x) = \sum_{j = 0}^k (-1)^j \binom{x}{j} \binom{n - x}{k - j}.

Note that \mathcal{K}_k^n(x) is a polynomial of degree k. For ease of notation, we will skip the superscript from now on and just write \mathcal{K}_k instead. These polynomials have several nice properties, and we will use the following symmetry (or should it be called anti-symmetry?):

\displaystyle \mathcal{K}_k(i) = (-1)^i \mathcal{K}_{n - k}(i)

which lets us see the evaluation of a degree k polynomial at i as the evaluation of a degree n - k polynomial at i, up-to a factor of (-1)^i. This fact will come in handy later.

Now if you apply Lemma 1 to G using these eigenvalues, you will realise (after some calculation) that it does not give the required bound (what bounds do you get?). That’s where pseudo-adjacency matrices come into picture. We will define a weight function w:[n] \rightarrow \mathbb{R}, such that every vector with i non-zero values is given the weight w(i), and w(i) = 0 for all i \leq d. From this weight function we will get the pseudo-adjacency matrix A of G where A_{ij} = w(k) if the Hamming distance between the i-th and the j-th vectors is equal to k. Again, from the basic theory of Cayley graphs (in particular, calculating Fourier coefficients), or more directly (Lemma 2.3 here), we get that A has n + 1 distinct eigenvalues,

\displaystyle \lambda_i = \sum_{k = d + 1}^n w(k) \mathcal{K}_k(i)

with multiplicity \binom{n}{i}, with i = 0, 1, \dots, n. Now, we just have to choose the function w cleverly so that either N^+, or N^-, is at most what we need it to be. Here is when Huang, Klurman and Pohoata make the clever choice of w(k) = \binom{\lfloor (k - 1)/2 \rfloor}{t}, for the case of d = 2t. For the odd case, d = 2t + 1, there is a different choice, but from now on we will only stick to the even case of Theorem 2. From this choice, they get the following three properties,

  • (-1)^i \lambda_i > 0, for i = 0, \dots, t
  • \lambda_{n - i} = \lambda_{i + 1}, for i = 0, \dots, t - 1
  • \lambda_{t + 1} = \lambda_{t + 2} = \dots = \lambda_{n - t} = (-1)^{t + 1},

from which the result then follows by an easy analysis. The way they prove these properties is using generating functionology, and it is not immediately clear how the choice of w(k) came about. What Aditya and Amey do instead, is start with the properties that one wants from the eigenvalues, and then show the existence of the function w, without ever computing it explicitly!

Here is how this is done. Observe that using the symmetric property of Kravchuk polynomials, we can write

\displaystyle \lambda_i = \sum_{k = 2t + 1}^n w(k) \mathcal{K}_k(i) = (-1)^i \sum_{k = 2t + 1}^n w(k) \mathcal{K}_{n - k}(i)

In the second formulation, we can see \lambda_i as (-1)^i multiplied with a linear combination of polynomials of degrees n - 2t - 1, n - 2t - 2, \dots, 1, 0, evaluated at i. These polynomials are in fact linearly independent, and hence any polynomial f(x) of degree at most n - 2t - 1 can be written as a linear combination of them. Therefore, we can choose a nice f(x), for which the eigenvalues will be \lambda_i = (-1)^i f(i), and then get the weights w(k)‘s automatically.

Pick f(x) as the unique monic polynomial whose roots are t + i + 0.5, for i = 1 to n - 2t - 1. WLOG, assume that f(0) > 0 (if not, then we can take -f). Since f is a polynomial, the sign of its values oscillates between the roots. In particular, this shows that \lambda_i = (-1)^i f(i) has the same sign for i = t + 1, \dots, n - t, that is, \lambda_i > 0 if t is odd (since f(0) and f(t+1) have the same sign), and \lambda_i < 0 if t is even, for all of these i‘s. In the former case, we can carefully count the number of negative eigenvalues with multiplicities, and in the latter case we can count the number of positive eigenvalues with multiplicities, giving us Theorem 2.

Note that as far as the signs of \lambda_i‘s are concerned, this trick of picking the polynomial f explicitly, and then getting the weights, gives us the same three properties as we had earlier where Huang, Klurman and Pohoata made the explicit choice of the weight function. For the case of Theorem 2 when d = 2t + 1, we have to do something different, and I leave it to the reader to figure that case out.

For some open questions that can potentially be solved using these methods, check out Section 5 of the Huang-Klurman-Pohoata paper.

(*) Fedor Petrov has given a “low-level” proof that avoids using the interlacing theorem (congrats to Fedya for his new blog!).

For discussions on Huang’s proof of the sensitivity conjecture, see the following blog posts: Terry Tao, Gil Kalai, Scott Aaronson, Ken Reagan.

Advertisement

About Anurag Bishnoi

A mathematician working at TU Delft. I am broadly interested in combinatorics and finite geometry.
This entry was posted in Combinatorics, Extremal Combinatorics, Spectral Graph Theory and tagged , , , , , , , . Bookmark the permalink.

5 Responses to Spectral proofs of theorems on the boolean hypercube

  1. Pingback: Huangs’ Breakthrough, Cvetković’s Bound, Godsil’s Question, and Sinkovic’s Answer | Ratio Bound – A Combinatorics Blog

  2. Pingback: Amazing: Hao Huang Proved the Sensitivity Conjecture! | Combinatorics and more

  3. “For the case of Theorem 2 when d = 2t + 1, we have to do something different, and I leave it to the reader to figure that case out.”

    I am currently at a conference with Anurag and Aditya. When Anurag wrote the above. I thought that Anurag meant the following easy argument. As they were not aware of it, here it is:

    Let S be the independent set as above. Let A = \{ x: 1x \in S, x \text{ has even weight.} \}, B = \{ x: 0x \in S, x \text{ has even weight.} \}, C = \{ x: 1x \in S, x \text{ has odd weight.} \}, D = \{ x: 0x \in S, x \text{ has odd weight.} \}.

    My claim is that A \cup D is a subset of Q_{n-1} such that all elements have pairwise distance at most 2t and that the same holds for B \cup C.

    Elements of A have even weight, so they have even pairwise distance and their minimum distance is at most 2t. Similarly, elements of D have pairwise distance at most 2t to each other. Elements in A have even weight and vertices in D have odd weight, so a vertex x in A has an odd distance to a vertex y in D. This distance cannot be 2t+1 as then x and y have distance 2t+2 in S. This shows the claim.

    The same argument works for B \cup D. Hence, we can just apply the bound for d=2 and are done.

    • Or maybe shorter:

      Let A' be the set with only even weight vertices. Let A be { x: 1x \in A' \text{ or } 0x \in A' }. Then the pairwise distances in A are all \leq 2t. Hence, one can apply the bound in Q_{n-1} for even d to |A'|.

      Same works for the subset of all even weight vertices. That gives the bound.

      • Really nice proof! One thing that needs to be verified as well is that A \cup D is equal to (or just at least) the size of the set of vectors in S that they correspond to, that is, you do not have an x \in A \cup D such that both 1x and 0x are elements of S (which does not happen because x can’t have both even and odd weight).

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s