Alon-Furedi, Schwartz-Zippel, DeMillo-Lipton and their common generalization

In the post Balls in Bins I wrote about a combinatorial function \mathfrak{m}(a_1, \dots, a_n; k) which denotes the minimum value of the product y_1 \cdot y_2 \cdots y_n among all distributions (y_1, y_2, \dots, y_n) of k balls (so y_1 + y_2 + \dots + y_n = k) in n bins with the constraints 1 \leq y_i \leq a_i. It turns out that this combinatorial function is linked to zeros of a multivariable polynomial in a “finite grid”. Here, by a finite grid I mean a subset A \subseteq \mathbb{F}^n of the form A = A_1 \times \cdots \times A_n where \mathbb{F} is a field and A_i‘s are finite subsets of \mathbb{F}. The Alon-Furedi theorem, which provides this link, is as follows:

Theorem 1 (Alon-Furedi). Let A = A_1 \times \cdots \times A_n be a finite grid in \mathbb{F}^n where |A_i| = a_i. Let f \in \mathbb{F}[t_1, \dots, t_n] be an n-variable polynomial of degree d such that the set \mathcal{U}_A(f) = \{x \in A : f(x) \neq 0\} is nonempty (in other words, f does not vanish on the entire grid). Then |\mathcal{U}_A(f)| \geq \mathfrak{m}(a_1, \dots, a_n; \sum_{i = 1}^n a_i - d). Moreover, the bound is sharp.

For the sharpness of this bound let (y_1, \dots, y_n) be any feasible distribution of \sum_{i = 1}^n a_i - d balls in bins. Pick subsets B_i \subseteq A_i with |B_i| = y_i and define f = \prod_{i = 1}^n \prod_{\lambda \in A_i \setminus B_i} (t_i - \lambda) which has degree \sum_{i = 1}^n (a_i - y_i) = d and doesn’t vanish on y_1 \cdot y_2 \cdots y_n points of the grid A.

This theorem appears in the last pages of [1] as Theorem 5, and until the appearance of [4] where Clark, Forrow and Schmitt used it to obtain a generalization of Warning’s second theorem and several interesting results in combinatorics and additive number theory, it seems like this result was largely ignored. In [3] Pete further generalized the results of [4] and gave some nice applications to graph theory, additive group theory and polynomial interpolation. In fact, he also generalized the Alon-Füredi theorem by replacing the field \mathbb{F} by a commutative ring R and imposing some extra conditions on the grid which are vacuously true for the field case. While discussing some results of [4] with Pete and John I realised that we can do much more with this Alon-Furedi theorem, and this resulted in my joint paper with Pete, Aditya and John [2], which I have also discussed here. A particular contribution of [2] is the following generalization of Alon-Furedi (Theorem 3 below) for which we first need some motivation.

The famous Schwartz-Zippel lemma says that a polynomial in \mathbb{F}_q[t_1, \dots, t_n] of degree d \leq q has at most dq^{n-1} zeros in \mathbb{F}_q^n. In fact it works over more general grids as well where it says that if A_1, A_2, \dots, A_n are finite subsets of \mathbb{F} with |A_1| \geq |A_2| \geq \cdots \geq |A_n| and f a polynomial of degree d \leq |A_n|, then f has at most d|A_1||A_2|\cdots|A_{n-1}| zeros in A_1 \times \dots \times A_n. Since Theorem 1 gives a lower bound on the number of non-zeros of a polynomial in a finite grid, it automatically gives an upper bound on the numbers of zeros of that polynomial. This observation leads to a natural question.

When the degree of the polynomial is at most |A_n|, does Theorem 1 imply the Schwartz-Zippel lemma? 

If you do the math, then you would see that indeed, it does. So, the Alon-Füredi theorem is in fact a generalization of the so-called Schwartz-Zippel lemma (see the blog post by RJ Lipton and some comments on it for why I have used “so-called” in this sentence). But, there are other results in the spirit of Schwartz-Zippel lemma, which do not follow from Theorem 1! The following result is due to DeMillo-Lipton and Zippel:

Theorem 2. Let f \in \mathbb{F}[t_1, \dots, t_n] be a non-zero polynomial such that for all i, we have \deg_{t_i} f \leq d for some constant d. Then for a subset S \subseteq \mathbb{F} of cardinality at least d + 1, the number of zeros of f in S^n is at most |S|^n - (|S| - d)^n

To see that this result doesn’t follow from Theorem 1, take S = \{0, 1, 2\} \subseteq \mathbb{Q} and f = t_1 t_1 \in \mathbb{Q}[t_1, t_2]. Then Theorem 1 tells us that f has at most 6 zeros in S^2, while Theorem 2 tells us that f has at most 5 zeros (which is stronger and the sharp bound in this case). This doesn’t mean that DeMillo-Lipton/Zippel’s result is always better. In fact, if you take f = t_1 + t_2, then Theorem 1 gives the better bound. To “rectify” this situation was one of our main motivation to give the following generalization of Theorem 1, which appears as Theorem 1.2 in [2].

Theorem 3 (Generalized Alon-Furedi). Let A = A_1 \times \cdots \times A_n be a finite grid in \mathbb{F}^n where |A_i| = a_i.  For i \in \{1, 2, \dots, n\} let b_i be an integer such that 1 \leq b_i \leq a_i. Let f \in \mathbb{F}[t_1, \dots, t_n] be an n-variable polynomial of degree d such that the set \mathcal{U}_A(f) = \{x \in A : f(x) \neq 0\} is nonempty and \deg_{t_i} f \leq a_i - b_i for all i. Then we have |\mathcal{U}_A(f)| \geq \mathfrak{m}(a_1, \dots, a_n; b_1, \dots, b_n; \sum_{i = 1}^n a_i - d). Moreover, the bound is sharp.

Here \mathfrak{m}(a_1, \ldots, a_n; b_1, \ldots, b_n; k) is defined to be the minimum value of the product y_1 \cdot y_2 \cdots y_n among all distributions (y_1, y_2, \dots, y_n) of k = y_1 + y_2 + \dots + y_n balls in n bins with the constraints b_i \leq y_i \leq a_i. If you “do the math” (see Section 4 of [2]), then Theorem 3 is a common generalization of Theorem 1, Schwartz-Zippel lemma and Theorem 2.

[1] N. Alon and Z. Füredi, Covering the cube by affine hyperplanes. Eur. J. Comb. 14 (1993), 79–83.
[2] A. Bishnoi, P. L. Clark, A. Potukuchi, J. R. Schmitt, On Zeros of a Polynomial in a Finite Grid (2015). arXiv preprint arXiv:1508.06020.
[3] P. L. Clark, Fattening up Warning’s Second Theorem (2015). arXiv preprint arXiv:1506.06743. 
[4] P. L. Clark, A. Forrow and J. R. Schmitt, Warning’s Second Theorem With Restricted Variables. To appear in Combinatorica, (2016).

Posted in Combinatorics, Polynomial Method | Tagged , , , | 2 Comments

A timeline of the polynomial method up-to combinatorial nullstellensatz

Over the past 30-40 years, the so-called polynomial method has developed into a powerful tool in combinatorics and (additive) number theory. There has been a lot of recent interest in it after Dvir’s  paper on the Kakeya conjecture, where he solved a major open problem by a short and easy polynomial argument. After that, many prominent mathematicians from different areas of mathematics started looking into these techniques, and in some cases, ended up resolving some famous open problems: On the Erdős distinct distances problem in the plane.

In this post I will give a timeline of some important papers (in my opinion) related to the polynomial method, starting from Tom H. Koornwinder’s paper on equiangular lines till the appearance of Combinatorial Nullstellensatz by Noga Alon (1999) (so, this timeline will cover only the part A of polynomial method as mentioned here). But first, here are some quotes that describe the general philosophy behind the polynomial method:

Given a set of points (or vectors, or sets) that satisfy some property, we want to say something about the size or the structure of this set. The approach is then to associate to this set a polynomial, or a collection of polynomials, and use properties of polynomials to obtain information on the size or structure of the set.

– Aart Blokhuis, 1993

Broadly speaking, the strategy is to capture (or at least partition) the arbitrary sets of objects (viewed as points in some configuration space) in the zero set of a polynomial whose degree (or other measure of complexity) is under control; for instance, the degree may be bounded by some function of the number of objects. One then uses tools from algebraic geometry to understand the structure of this zero set, and thence to control the original sets of object.

– Terence Tao, 2013

A note on the absolute bound for systems of lines, by Koornwinder

On Two-Distance Sets in Euclidean Space, by Larmen, Rogers and Seidel
Covering finite fields with cosets of subspaces, by Jamison

The blocking number of an affine space, by Brouwer and Schrijver

A New Upper Bound for The Cardinality of 2-Distance Sets in Euclidean Space, by Blokhuis
Intersection theorems  with geometric consequences, by Frankl and Wilson
Remarks on a theorem of Rédei, by Lovasz and Schrijver

Regular subgraphs of almost regular graphs, by Alon, Friedland and Kalai
Every 4-regular graph plus an edge contains a 3-regular subgraph, by Alon, Friedland and Kalai

– A characterization of exterior lines of certain sets of points in PG (2, q), by Blokhuis and Wilbrink

A nowhere zero point in linear mappings, by Alon and Tarsi
A short proof of the nonuniform Ray-Chaudhuri-Wilson inequality, by Babai
Balancing set of vectors, by Alon, Bergmann, Coppersmith and Odlyzko

Sum zero (mod n), size n subsets of integers, by Bailey and Richter

Multilinear polynomials and Frankl-Ray-Chaudhuri-Wilson type intersection theorems, by Alon, Babai and Suzuki

Polynomial multiplicities over finite fields and intersection sets, by Bruen
Colorings and orientations of graphs, by Alon and Tarsi

The Dinitz problem solved for rectangles, by Janssen
Covering the Cube by Affine Hyperplanes, by Alon and Furedi
Polynomials in finite geometries and combinatorics, by Blokhuis
Zero-sum sets of prescribed size, by Alon and Dubiner

On the size of a blocking set in PG(2,p), by Blokhuis
On nuclei and affine blocking sets, by Blokhuis
– On multiple nuclei and a conjecture of Lunelli and Sce, by Blokhuis

Tools from higher algebra, by Alon
Adding Distinct Congruence Classes Modulo a Prime, by Alon, Nathanson and Ruzsa
A new proof of several inequalities on codes and sets, by Babai, Snevily and Wilson
The number of directions determined by a function f on a finite field, by Blokhuis, Brouwer and Szonyi

The Polynomial Method and Restricted Sums of Congruence Classes, by Alon, Nathanson and Ruzsa
Multiple blocking sets and arcs in finite planes, by Ball

Blocking sets in Desarguesian affine and projective planes, by Szonyi
– Some Applications of Algebraic Curves in Finite Geometry and Combinatorics, by Szonyi

An easier proof of the maximal arcs conjecture, by Ball and Blokhuis
Sumsets in Vector Spaces over Finite Fields, by Eliahou and Kervaire

– Multilinear Polynomials and a Conjecture of Frankl and Füredi, by Sankar and Vishwanathan
Combinatorial Nullstellensatz, by Alon

Beyond this, many excellent surveys on the polynomial method are available in which one can find further developments. For example,

Posted in Combinatorics, Finite Geometry, Polynomial Method | Tagged , , , | 1 Comment

On Zeros of a Polynomial in a Finite Grid: the Alon-Furedi bound

My joint paper with Aditya Potukuchi, Pete L. Clark and John R. Schmitt is now up on arXiv: arXiv:1508.06020.

This work started a few months back when I emailed Pete and John, pointing out an easy generalization of Chevalley-Warning theorem using something known as the punctured combinatorial nullstellensatz, after reading their paper on Warning’s second theorem: arXiv:1404.7793. They got pretty excited about it and we started discussing some related things. Finally, it’s not the generalization of Chevalley-Warning that this paper is about but the theorem of Alon-Füredi itself, which is the main tool they used in their paper to generalize Warning’s second theorem. In our discussions we found several unexplored connections between this elementary result on polynomials and results from different areas of maths. My friend Aditya joined us in between with his amazingly simple proof of Alon-Füredi which, along with the annoying realization that a result of DeMillo-Lipton-Zippel doesn’t follow from Alon-Füredi, led us to generalize Alon-Füredi.

These results when combined with the recent work by Pete, Aden Forrow and John (and the follow up by Pete: arXiv:1506.06743) shows how powerful this Alon-Füredi bound is. Among many other things, it implies the following results:

  1. Chevalley-Warning theorem, Warning’s second theorem and their generalizations.
  2. Olson’s theorem on the Davenport constant of p-groups.
  3. Schwartz-Zippel lemma (or rather DeMillo-Lipton-Schwartz-Zippel lemmas).
  4. Minimum Hamming distance of Reed-Muller type affine variety codes.
  5. Jamison/Brouwer-Schrijver bound on affine blocking sets.

I am hopeful that there are many other connections which will be discovered. Much like Alon’s Combinatorial Nullstellensatz, this is a pretty basic result on polynomials with several interesting applications. In some subsequent posts, I will try to discuss this in much more detail.

Posted in Combinatorics, Finite Geometry, Polynomial Method | Tagged , , , | 3 Comments

Discrete version of the Intermediate Value Theorem

While working on a combinatorial problem with Potu today I came up with an easy theorem that can be called a discrete version of the Intermediate Value Theorem. It can be stated as follows.

For integers a < b, let f be a function from the integers in [a, b] to \mathbb{Z} that satisfies the property, |f(i+1) - f(i)| \leq 1 for all i. If f(a) < 0 < f(b), then there exists an integer c \in (a, b) such that f(c) = 0.

Note the similarity with the continuous version of this theorem:

For real numbers a < b, let f be a continuous function from [a, b] to \mathbb{R}. If f(a) < 0 < f(b), then there exists a real number c \in (a, b) such that f(c) = 0.

The discrete analog of continuity is the property that going from integer i to i + 1, we make a jump of at most 1. It turns out that the similarity of these two theorems extends to their proofs as well. Here’s a proof of the discrete version:

Proof. Let S = \{x \in \mathbb{Z} \cap [a, b] : f(x) < 0\} and let c = \max S + 1. We claim that f(c) = 0.
Say f(c) < 0. Then c \in S. This contradicts the fact that c-1 is an upper bound on S. Say f(c) > 0. This implies that f(c-1) \geq 0 (by “continuity”), which contradicts the fact that c - 1 \in S. \blacksquare

And here’s the standard proof for the continuous version.

Proof.  Let S = \{x \in [a, b] : f(x) < 0\} and let c = \sup S. We claim that f(c) = 0.
Say f(c) < 0. Pick a number c' > c such that f(c') < 0 (we can do so because of continuity of f). Then c' \in S. This contradicts the fact that c is an upper bound. Say, f(c) > 0. Then there exists a \delta > 0 such that f(x) > 0 for all x \in (c - \delta, c + \delta) (again by continuity). But, by the definition of supremum there exists an element c'' \in S satisfying c - \delta < c'' < c. Then f(c'') > 0, which contradicts the fact that c'' \in S. \blacksquare

Here’s the combinatorial theorem where the discrete version showed up. Let S_1, \dots, S_{2n} be subsets of \{1, \dots, 4n\} defined by S_i := \{i, i + 1, \dots, i + 2n - 1\}. Then for every subset T \subset \{1, \dots, 4n\} of cardinality 2n there exists a k such that |S_k \cap T| = n.

Proof. Define f(i) = |T \cap S_i| - n for i \in \{1, \dots, 2n\}. Then it’s easy to check that |f(i+1) - f(i)| \leq 1 for all i. In fact, we can define the function on i = 2n + 1 as well by taking S_{2n + 1} = \{2n + 1, 2n + 2, \dots, 4n\}. Since T has cardinality 2n we have f(1) < 0 implies f(2n+1) > 0, and vice versa. Say f(1) \neq 0. Then without loss of generality we may assume that f(1) < 0. Then f(2n + 1) > 0. Therefore, there exists a k \in \{2, \dots, 2n\} such that f(k) = 0. \blacksquare

After I wrote this argument in my notes, I googled “discrete intermediate value theorem” and found out that someone has already written about this theorem. See this article by Richard Johnsonbaugh. He gives another cute application of this theorem. It’ll be fun to find more applications of this easy theorem.

Edit: As pointed out by Peter, a similar problem is discussed here:

Posted in Real Analysis | Tagged | 3 Comments

Chevalley-Warning Theorem and Blocking Sets

The classical Chevalley-Warning theorem gives us a sufficient condition for a system of polynomial equations over a finite field to have common solutions. Affine blocking sets are sets of points in an affine geometry (aka affine space) that intersect every hyperplane. What’s the connection between them? It appears to be the following curious property of polynomials that vanish on all points of \mathbb{F}_q^n except one:

Lemma Let {P \in \mathbb{F}_q[x_1, \ldots, x_n]} such that {P(a) \neq 0} for some {a = (a_1, \ldots, a_n) \in \mathbb{F}_q^n} but  {P(x) = 0} for all {x \in \mathbb{F}_q^n \setminus \{a\}}. Then {\deg P \geq n(q-1)}.

In this post, I will show how this property implies both the Chevalley-Warning (CW) theorem, and the Jamison/Brouwer-Schrijver bound on the size of affine blocking sets. The Lemma itself can be proved in many different ways. In my mathoverflow answer I have mentioned six different proofs (how “different” two proofs are can vary). But there are even more ways to prove this lemma, some of which I will discuss in later blog posts.

CW theorem states that given k polynomials P_1, \dots, P_k in \mathbb{F}_q[x_1, \dots, x_n] with \sum \deg P_i < n, if they have a common zero, then they have another common zero in \mathbb{F}_q^n

Proof. Define P = (1 - P_1^{q-1})(1 - P_2^{q-1}) \cdots (1 - P_k^{q-1}). The polynomial P has the property that it is equal to 1 at all the common zeroes and 0 otherwise. For sake of contradiction assume that P takes the value 1 at exactly one point, i.e., there is a unique common zero. Then, from the lemma above we see that \deg P = \sum (q-1) \deg P_i \geq n(q-1), which contradicts that \sum \deg P_i < n. Hence, there must be another common zero. (Warning’s second theorem states that there are at least q^{n-\deg P} common zeroes).    \blacksquare

This proof is essentially due to Chevalley. For a historical discussion on this theorem and its proofs see this. For a full survey, wait till Pete Clark finishes his book size notes titled “Around the Chevalley-Warning Theorem”.

Now we turn to affine blocking sets. As discussed here blocking sets appeared in 1955 as a game theoretic concept. Later on some mathematicians picked up the concept and started studying it for its own sake. The kind of blocking sets we’ll be concerned with are called affine blocking sets. These are sets of points in \mathbb{F}_q^n that intersect every hyperplane. A simple example of such a set is given by taking the n(q-1) + 1 points on the axes,  i.e., the points \lambda e_i for \lambda \in \mathbb{F}_q, where e_i is the vector with 1 at i-th position and zero elsewhere. It was conjectured by J. Doyen in 1976 at an Oberwolfach lecture that this is the least possible size of an affine blocking set. An year later two independent proofs of this appeared, one by R. E. Jamison, and another by Andries E. Brouwer and Alexander Schrijver. The latter was a short argument which can be seen as a direct application of the lemma above (which they proved!). Let’s look at that argument in a slightly different way than how it appears in their paper.

Every hyperplane corresponds to a linear equation of the form a_1x_1 + \dots + a_nx_n = \lambda, i.e., every hyperplane is the zero set of a degree 1 polynomial.  And hence union of d hyperplanes correspond to the zero set of a degree d polynomial which is the product of all the corresponding degree 1 polynomials. The lemma above shows that if there are d hyperplanes that cover all points of \mathbb{F}_q^n except one, then d \geq n(q-1). To use this in our problem, we first embed the points of blocking set in a projective space by adding a hyperplane at infinity, and then look at the dual problem. An affine blocking set has the property that it intersects every hyperplane except the hyperplane at infinity. The number of hyperplanes in \mathbb{P}^n(\mathbb{F}_q) that cover all points except one is one more than such hyperplanes in the corresponding affine space because of an extra hyperplane at infinity. Thus, by projective duality we have at least n(q-1) + 1 points in the blocking set.

The curious property of polynomials mentioned in the main Lemma of this post has been generalised to some other cases and it has been used to solve some interesting problems, which I might discuss in future posts. For now, see this interesting paper by Blokhuis, Brouwer and Szőnyi: Covering all points except one.

Edit (23 Jan 2016): This paper by Noga Alon gives a more direct connection between the Jamison theorem and the Chevalley-Warning theorem for the prime case: Tools from Higher Algebra. See page Theorem 6.5 on page 23.

Posted in Polynomial Method | Tagged , , | Leave a comment

Balls in Bins

When I was learning combinatorics for the first time (I was probably 16) there was this problem about distributing m balls in n bins (or urns) that I came across. We had to count the total number of  such distributions, given certain constraints on the maximum capacities of bins. For example, assuming that each bin must have at least one ball and that there is no upper limit on their capacity, we showed that there are {m - 1 \choose n - 1} ways of distributing m identical balls in n distinct bins. Clearly, this is equal to the number of solutions of x_1 + \cdots + x_n = m, where each x_i must be a positive integer. Not so clearly, this is also equal to the coefficient of t^{m-n} in (1-t)^{-n}.

A few days back, this problem appeared again. This time, in the context of some research problems that I am working on. I will explain what the context is in some later posts. For now, let’s stick to elementary combinatorics.

We are given n bins A_1, \ldots, A_n, where for every i, the bin A_i has maximum capacity a_i. We are going to distribute m balls in these n bins such that each bin gets at least one ball. But instead of counting the total number of ways of doing this, we are interested in finding distributions that minimize the product x_1 \cdots x_n, where x_i is the number of balls in the bin A_i. Since the product doesn’t depend on the order, we will assume that a_1 \geq a_2 \cdots \geq a_n. We will denote the minimum value of the product by \mathfrak{m}(a_1, \ldots, a_n; m).

Let us start with a small example. Say we have 5 balls that we want to distribute in two bins A and B of capacities 4 and 3, respectively. Then, these are the all possible distributions: (4, 1), (3, 2), (2, 3). The minimum product is attained at  (4, 1), and hence \mathfrak{m}(4, 3; 5) = 4.  Before reading further, you might want to play around with some more examples to see if a pattern emerges …

So, here’s how a solution goes [1]. Define a greedy distribution to be the one where you first place a single ball in each of the n bins, and then the remaining m-n balls are placed from left to right, moving from the bin A_i to the bin A_{i+1} only after you have already placed a_i (the maximum capacity of A_i) balls in A_i. We will now prove that this greedy distribution minimises the product by showing that from any given distribution x = (x_1, \ldots, x_n), we can reach the greedy distribution by performing certain moves, such that after each such move the product remains the same or reduces.  The moves that we’ll perform are of the following two types:

  1. Bin Swap: If there are bins A_i and A_j, with i < j, such that x_i < x_j, then swap the balls in these two bins.
  2. Unbalancing Move: If there are bins A_i and A_j, with i < j, such that a_i > x_i \geq x_j > 1, then move a ball from the bin A_j to the A_i.

In the first move, the product doesn’t change. While in the second move, the product reduces, as the ratio of the new product to the old product is equal to \frac{(x_i + 1)(x_j - 1)}{x_ix_j} = \frac{x_ix_j + x_j - x_i - 1}{x_ix_j}, which is less than 1 because x_j \leq x_i.

After some thought, it should be clear that by first performing Bin Swaps, and then doing the Unbalancing Moves on any starting distribution x, we can reach the greedy distribution.This shows that the greedy distribution must minimize the product. Of course, it might be that it is not the unique distribution that minimises the product.

We can now deduce the following facts about the function \mathfrak{m}(a_1, \ldots, a_n; m) that will be useful later.

  1. \mathfrak{m}(2, \ldots, 2; 2n - k) = 2^{n-k}.
  2. \mathfrak{m}(a, \ldots, a; m) = (r+1)a^k, where r is the remainder when you divide m-n by a - 1, and k is the quotient.


Posted in Combinatorics, Polynomial Method | Tagged | 2 Comments

The Kakeya problem

The original Kakeya needle problem  is to  find the least amount of area required to continuously rotate a unit line segment in the (Euclidean) plane by a full rotation. Of course in a circle of diameter one we can continuously rotate a unit line segment by 360 degrees, but there are figures with smaller area than that (for example an equilateral triangle of height 1). In fact, perhaps surprisingly, it was shown by Besicovitch in 1928 that for any \epsilon > 0 there exists a planar set of area \epsilon which solves the Kakeya needle problem. Any set which contains a unit segment in each direction is now called a Besicovitch set or a Kakeya set. Now the main open problem in real spaces is,

Do Kakeya sets in \mathbb{R}^n have Hausdorff dimension equal to n?.

Since I don’t know enough analysis or topology to elaborate too much on that, I would stick to providing a link to some expositions on this problem and its connection to various areas of mathematics, see this and this. What I do know is combinatorics and linear algebra. And luckily there is an interesting variant of this problem over  finite fields.

In 1999 Wolff suggested a variation of this problem which asks for the smallest subset of \mathbb F_q^n that contains a line in each direction, where \mathbb F_q is the finite field of order q (see this).  He conjectured that the size of the smallest set is bounded below by c_nq^n where c_n is constant that only depends on n.

This conjecture was finally settled by Zeev Dvir in 2008 (see [2]) as he showed that any Kakeya set in \mathbb F_q^n has size at least {q + n - 1 \choose n} (which is greater than or equal to q^n/n! by a simple reasoning). What’s really interesting is that this problem remained unsolved even after a lot of effort by several prominent mathematicians, and ultimately the proof of Dvir turned out to be surprisingly elementary. Quoting Terence Tao [3]: “Results from basic algebraic geometry had been brought to bear on the finite field Kakeya conjecture in [74], but with only partial success. It was thus a great surprise when the conjecture was fully resolved by Dvir”

I will discuss this beautiful proof here. First, we would need this result on polynomial interpolation.

Lemma 1 Let F be a field, let n \geq 1 be an integer, and d \geq 0. If E \subseteq F^n has cardinality less than \binom{d + n}{n} then there exists a non-zero polynomial P \in F[x_1, \ldots, x_n] of degree at most d such that E \subseteq Z(P)[F], where Z(P)[F] is the set of zeroes of P in F^n.

Before proving this, let’s look at some applications which are well known results in geometry (take n = 2):

  1. Every pair of points in F^2 is contained in a line.

  2. Every set of five points in F^2 is contained in a (possibly degenerate) conic section.

Note that a line in F^2 is determined by a polynomial of degree 1 in F[x_1, \ldots, x_2], while a conic section is determined by a polynomial of degree 2.

Proof . The polynomials with degree at most d form a \binom{d+n}{n} dimensional subspace F_d[x_1, \ldots, x_n] of the infinite dimensional vector space F[x_1, \ldots, x_n] (see this for a complete reasoning). Now look at the evaluation map from F_d[x_1, \ldots, x_n] to F^E for any given set E \subseteq F^n, which maps a polynomial P to the tuple (P(x))_{x \in E}. Since the vector space F^E has dimension |E| , if |E| < {d + n \choose n} = dim(F_d[x_1, \ldots, x_n]) then the evaluation map has a non-trivial kernel, which means that there is a non-zero polynomial P \in F_d[x_1, \ldots, x_n] that vanishes on all points of E.

Now the main result.

Theorem Let F be a finite field of order q, let n \geq 1 be an integer, and let E \subseteq F^n be a Kakeya set. Then |E| \geq {q + n - 1 \choose n}. In particular, we have |E| \geq q^n/n!.

Proof. Say |E| < {q + n - 1 \choose n}. By Lemma 1 there exists a non-zero polynomial P \in F[x_1, \ldots, x_n] of degree d at most q-1 which vanishes on E. For every direction v \in F^n \setminus \{0\} there exists a line L_v = \{x + tv : t \in F\} which is completely contained in E, i.e. the univariate polynomial P(x + tv) vanishes for all t. Since degree of P(x + tv) is less than q but it has q zeroes, it must be the zero polynomial.

If P_{=d} is the homogeneous part of degree d \leq q-1 in P then the argument above shows that P_{=d}(v), which is the coefficient of t^d in P(x + tv), is zero for all v \in F^n \setminus \{0\}. This contradicts Lemma 1 in this post (or the Schwartz-Zippel lemma).

Note the similarity between this and the first proof of Schwartz-Zippel lemma discussed here. In fact Moshkovitz was inspired by Dvir’s proof to come up with that proof.

It should also be noted that this bound is not always sharp. For example, Blokhuis and Mazzocca showed in [1] that the exact bound for n = 2 and q odd is q(q+1)/2 + (q-1)/2. Interestingly they also showed a weaker bound of q^{n-1}/(n-1)! for general n. Later on Dvir, Kopparty, Saraf and Sudan improved the bound in general case to q^n/2^n using polynomial multiplicities, something I will discuss in another blog post.

The proof of finite field Kakeya conjecture (now a theorem) can also be seen projectively by homogenization of the polynomial. But,

there is no known proof of this which doesn’t use the polynomial method.

I would end this with an interesting remark by Terence Tao on the proof by Dvir (see footnote at page 8 in [3]):

To illustrate the radical change in perspective that the polynomial method brought to this subject, it had previously been observed in that a Kakeya set could not be contained in the zero set of a low degree polynomial, by essentially the same argument as the one given above. However, this fact was deemed “far from a proof that the Kakeya conjecture is true”, due to ignorance of the polynomial method.

[1] Aart Blokhuis and Francesco Mazzocca. The finite field kakeya problem. In Building Bridges, pages 205–218, 2008. Available at
[2] Zeev Dvir. On the size of kakeya sets in finite fields. J. Amer. Math. Soc., 22:1093–1097, 2009. Available at
[3] Terence Tao. Algebraic combinatorial geometry: the polynomial method in arithmetic combinatorics, incidence combinatorics, and number theory. EMS Surv. Math. Sci., pages 1–46, 2014. Available at

Further Reading
Algebraic methods in discrete analogs of the Kakeya problem by Guth and Katz.
– Extensions to the method of multiplicities, with applications to Kakeya sets and mergers, by Dvir, Kopparty, Saraf and Sudan.
– The Kakeya set and maximal conjectures for algebraic varieties over finite fields by Jordan Ellenberg, Richard Oberlin and Terence Tao.
– Kakeya sets over non-archimedian lcoal rings
– Variants of the Kakeya problem over an algebraically closed field by Kaloyan Slavov.
A finite version of the Kakeya problem by Simeon Ball, Aart Blokhuis and Diego Domenzain.

* this post is a part of notes on the polynomial method that I am (was?) working on with Abhishek Khetan and Aditya Potukuchi

Posted in Finite Geometry, Polynomial Method | Tagged , , | 1 Comment