Two proofs of the Schwartz-Zippel lemma

The fact that a univariate polynomial P(x) over a field F of degree d has at most d zeroes is well known. It follows from the so called Factor theorem, P(a) = 0 if and only if P(x) = (x - a)Q(x). But what about a polynomial in n-variables, P \in F[x_1, \ldots, x_n], where n > 1?

When F is an infinite field, say \mathbb R or \mathbb C, then the number of zeroes would almost always be infinite. For example, x^2 + y^2 - 1 has infinitely many zeroes in \mathbb{R}^2. But things are a bit more interesting when F is a finite field. We have the following important result bounding the number of zeroes of a polynomial:

(Schwartz-Zippel lemma.) Let F be a finite field of size q, let n \geq 1, and let P \in F[x_1, \ldots, x_n] be a non-zero polynomial of degree at most d. Let us denote the set of zeroes of P in F^n by Z. Then |Z| \leq dq^{n-1}

For the first proof, which is chronologically the latest proof, we need an important lemma,

Lemma 1. If P \in F[x_1, \ldots, x_n] is a non-zero polynomial deg_{x_i}(P) < q for all i then there exists an a \in F^n such that P(a) \neq 0.

This can be easily proved by induction on n. Now, since the degree of P is at most d, the degree of each variable is also at most d. We can safely assume that d < q since otherwise the bound dq^{n-1} is trivial. So, let  a in F^n be such that P(a) \neq 0.

A line L in F^n is a collection of points of the form p + tv where p and v \neq 0 are fixed points of F^n and t runs over F (think of v as the direction vector of the line). From this it should be clear that a line L has precisely q points on it and that L intersects the set Z either in all points (when P(p + tv) is zero) or at most d points. Now consider the set of lines through a, \mathcal{L}_a = \{\{a + tb : t \in F\} : b \in F^n\}. Every point of F^n belongs to some line of \mathcal{L}_a, and in fact \mathcal{L}_a partitions the set F^n \setminus \{a\}. Since each line has q points on it we have |\mathcal{L}_a| = (q^n - 1)/(q-1)

Because each line L in \mathcal{L}_a has a point on it that doesn’t lie in Z, the point a, it intersects Z in at most d points. Therefore we have |Z| \leq d(q^n-1)/(q-1). This is asymptotically close to the bound given in Schwartz-Zippel lemma but not the bound we need. A similar idea, which is in fact more similar than it looks right away, leads to the correct bound (cf. [1]).

The points of F^n can be partitioned by the set of parallel lines \mathcal{L}_v in the direction of a given  vector v \in F^n \setminus \{0\}. Since each line has q points on it, we have |\mathcal{L}_v| = q^{n-1}. Now, if we can choose a direction v so that no line of \mathcal{L}_v is completely contained in Z then we are essentially done as each of these q^{n-1} lines would intersect Z in at most d points. Let P be of degree d and write P as P = P_{=d} + P_{<d} where P_{=d} is the degree d part of P. The coefficient of t^d in the univariate polynomial P(a + tb) is P_{=d}(b).

Therefore, if we pick v to be a vector for which P_{=d}(v) is not equal to zero then we can ensure that none of  the lines in \mathcal{L}_v are contained in Z. But why can we choose such a (non-zero) v?

This again follows from the important lemma we discussed before. Since P_{=d} is a homogenous non-zero polynomial of degree d < q, by Lemma 1 there must be a non-zero v with P_{=d}(v) \neq 0 (non-zero because P_{=d}(0) = 0).

Remark: In Moshkovitz’s proof we are also fixing a point that doesn’t lie in Z, a point x at infinity. But the crucial difference is that instead of looking at all lines through x we are only looking at those that do not lie in the hyperplane at infinity, i.e., the set of parallel lines that determine this point x. The accepted answer here  mistakenly assumes that the hyperplane at infinity has size q^{n-1} and hence doesn’t really give an alternate proof.

Now, we discuss the original proof which follows by induction on n. This has an additional advantage that it doesn’t require Lemma 1.

For n = 1 the lemma reduces to the factor theorem, so let n > 1. For induction to work, we would like to pick a value for variable x_n to get a polynomial of lower degree, but the only problem is that this polynomial might become 0. So, let E be the (possibly empty) set of elements t in F for which x_n - t divides P, which are precisely the values of t for which P_t = P(x_1, \ldots, x_{n-1}, t) becomes zero.

Then we can write P = Q\prod_{t \in E} (x_n - t). Since \deg(P) \leq d we have |E| \leq d. Let t' belong to F \setminus E, so that x_n - t' does not divide P. Then the polynomial Q_{t'} := Q(x_1, \ldots, x_{n-1}, t') must be non-zero. Since Q_{t'} is a non-zero polynomial of degree at most d - |E| in n-1 variables, we can apply the induction hypothesis to conclude that the number of zeroes of Q_{t'} is at most (d-|E|)q^{n-2}.

Now we can partition the set Z (zeroes of P) in two parts: those zeroes for which the value of x_n belongs to E and those for which it doesn’t. We get |E|q^{n-1} zeroes in the first case because for every t \in E we can assign x_n to be t and every other x_i to be any element of F. In the second case we have \sum_{t' \in F \setminus E} |Z(Q_{t'})| zeroes.

Therefore, |Z| = |E|q^{n-1} + \sum_{t' \in F \setminus E} (d-|E|)q^{n-2} \leq |E|q^{n-1} + q(d-|E|)q^{n-2} = dq^{n-1}.

Are there any other proofs? 

See this for a nice discussion on Moshkovit’s proof and this for an historical account of the Schwartz-Zippel lemma by Robert Lipton.

[1] Dana Moshkovitz. An alternative proof of the schwartz-zippel lemma. Electronic Colloquium on Computational Complexity, 2010. Available at http://eccc.hpi-web.de/report/2010/096/.

* this is a part of notes on the polynomial method that I am working on with Abhishek Khetan and Aditya Potukuchi

Edit (19 April 2015):  This Lemma dates back to 1922, originally proved by Ore. See the chapter notes at pg 320, on Theorem 6.13,  in “Finite Fields” by Lidl and Niederreiter. It also appears as a special case of a more general result in the PhD thesis of Daniel Edwin Erickson from 1974, “Counting Zeros of Polynomials over Finite Fields” :http://thesis.library.caltech.edu/4061/1/Erickson_de_1974.pdf. Put s = 0 in Theorem 5.1.

Edit (20 September 2015): In our new paper we have shown connections between a result of Alon-Furedi and Schwartz-Zippel lemma(s), which in particular, gives us several new proofs of this result: arXiv:1508.06020. So yes, there are many other proofs of the Schwartz-Zippel lemma.

Advertisements

About Anurag Bishnoi

Currently a maths PhD student at Ghent University working in the Incidence Geometry research group. I am broadly interested in combinatorics, finite geometry and group theory.
This entry was posted in Finite Geometry, Polynomial Method and tagged , , . Bookmark the permalink.

8 Responses to Two proofs of the Schwartz-Zippel lemma

  1. Pingback: The Kakeya problem | Anurag's Math Blog

  2. Fedor Petrov says:

    Can we prove it as follows or do I miss something or it is the same as some of above arguments?

    Induction in $n$ and $dd$ in any field.

    • There seems to be a typo, or perhaps I don’t understand you properly. Could you say something more about the induction you have in mind? How different would it be from the inductive proof above?

  3. Fedor Petrov says:

    Sorry, it probably contained some typo which deleted a post almost completely. I mean the following. If density of zeros exceeds $d/q$, it exceeds $d/q$ on some section $x_n=a$. It follows (induction in $n$) that $f$ restricted to this section is identical zero, i.e. $f=(x_n-a)g$. Next, density of zeros of $g$ does not exceed $(d-1)/q$. It looks similar to above, but looks shorter and needs only one multiple $x_n-a$.

    • So, you are doing induction on the degree d. Nice! I don’t think it’s any shorter, but it certainly looks like a better proof. Perhaps something like this can also be done by taking any other set of parallel hyperplanes that cover the whole space.

      • Fedor Petrov says:

        Yes, I induct both on $n$ and on $d$ for fixed $n$. It works for the products $\prod_{i=1}^n A_i$ with the same sizes $|A_i|=q$ in any field, (I guess that other proofs work too), but probably for some more general situations aswell. For example, the same argument allows to prove that polynomial $f(x,y,z)\in K[x,y,z]$ of degree $n$, which vanishes on all points $(x_i,y_j,z_k)$, $0\leq i,j,k; i+j+k \leq n$, vanishes on $K^3$ (here $\{x_0,\dots,x_n\}$, $\{y_0,y_1,\dots,y_n\}, \{z_0,\dots,z_n\}$ are sets of cardinality $n+1$ in a field $K$). The same for higher number of variables, 3 is chosen for simplicity of notations.

  4. Fedor Petrov says:

    To be rigorous: “vanishes on $K^3$” must be replaced to “vanishes identically” (as polynomial).
    By the way, how do you type latex code?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s