Skip to main content

Switching to Utterances Comments

I've ditched Disqus as the comment system on this blog. I am now using Utterances. Utterances is an open source comment system that is backed by GitHub issues. Basically, every post has a corresponding issue opened on GitHub, and the comments on the post are the comments on the issue. Utterances automatically places the comments at the bottom of the post. For example, here is the issue corresponding to this post.

I didn't like Disqus mostly because it serves ads and tracking. Even though I had opted out from as much of it as I could in the Disqus settings, it still loads tracking scripts on every page. I run uBlock Origin always, and it's a bit hypocritical if my own side has things that are blocked by it. In some cases I can't avoid it (as far as I know), like when I embed a YouTube video, but it definitely shouldn't be on every post.

Utterances is a very nice alternative. I has lots of advantages:

  • Comments are not self-hosted. GitHub hosts them. Since you need a GitHub account to comment, this should make comment spam a non-issue.
  • Comments support full Markdown.
  • Users can edit their comments.
  • I can edit and fully moderate all comments.
  • Users log in with a federated system that proves their identity.
  • Email subscription to posts.
  • No ads or tracking.
  • It's completely open source and free to use.

If you use Nikola like I do, it natively supports Utterances (a feature which I added). Otherwise, go to the Utterances and paste the script tag generated at the bottom into your blog template. Then install the Utterances app in your repo, and you are done.

Exporting Disqus Comments

Some of my old posts had Disqus comments, which I wanted to preserve somehow. Here is guide on how I did that, since it wasn't as straightforward as I would have hoped.

The first step is to export your Disqus comments. It's very difficult to actually find the place in the Disqus site where you do this, but I finally found the URL. The export takes some time to complete (for me it took about half an hour). When it finished, Disqus will email you an XML file with all your comments. Note that the file contains all comments for all sites you have ever set up with Disqus. For me, it also included all the comments on my old Wordpress blog, as well as posts for draft blog posts that I never ended up publishing. It also contained all comments that were marked as spam, so you will need to remember to filter those.

I decided that since I only have a handful of posts with Disqus comments, I would just write a script to process them all and manually print them out, which I will then manually enter in to the Utterances comment system for those posts.

I wrote a script to process the comments, which you can find here. Disqus does provides an XML schema for the XML. I used a library called xsData, which lets you take an XML scheme and generate Python dataclasses corresponding to it, which make manipulating the parsed XML much easier than the standard library xml library. The script outputs text like

========== Comments from ==========

These are the original comments on this post that were made when this blog used the [Disqus blog system](

>**Comment from bjd2385 on 2016-08-28 12:33:12+00:00:**

><p>Very interesting post. I was just looking into hash functions (I've never formally learned what the topic entails), and since I'm most familiar with Python this post explained quite a bit, especially your early mathematical points.</p>

>**Comment from Mark Lawrence on 2016-10-03 20:26:54+00:00:**

><p>At what point does Python 3 force the override of __hash__ if you've defined __eq__?  E.g when would your</p><p>AlwaysEqual class fail?</p>


>>**Comment from asmeurer on 2016-10-03 20:38:13+00:00:**

>><p>That's a good catch. I originally wrote this post in Python 2. The example does indeed fail in Python 3. More specifically, if you override __eq__, Python 3 automatically sets __hash__ to None. I'll update the post to make this more clear.</p>

>**Comment from Erick Mendonça on 2017-07-30 03:23:55+00:00:**

><p>Great article! We must really pay attention to these details when implementing custom hashes.</p>

>**Comment from Ignacio on 2017-10-07 22:31:56+00:00:**

><p>Thanks a lot for this post! Clarified a lot of concepts.</p>

which I then manually copied to each post's Utterances page on GitHub.

Feel free to adapt my script if you find yourself in a similar situation.

Utterances Comments

Feel free to use the comments on this page to play around with the commenting system.

Note that to comment, there are two options. You can log in on this page, which will let you type your comment in the box below. This requires giving the Utterances bot access to your GitHub account. Alternately, if you don't want to give a bot access, you can just go directly to the GitHub issue page and comment there. I am currently in the process of figuring out how to add some boilerplate to each page that makes this clear (see this Utterances issue). If anyone has any suggestions on how to do this, let me know. For now, I am just going to manually add a statement about this as the first comment on each post.

Verifying the Riemann Hypothesis with SymPy and mpmath

Like most people, I've had a lot of free time recently, and I've spent some of it watching various YouTube videos about the Riemann Hypothesis. I've collected the videos I've watched into YouTube playlist. The playlist is sorted with the most mathematically approachable videos first, so even if you haven't studied complex analysis before, you can watch the first few. If you have studied complex analysis, all the videos will be within your reach (none of them are highly technical with proofs). Each video contains parts that aren't in any of the other videos, so you will get something out of watching each of them.

One of the videos near the end of the playlist is a lecture by Keith Conrad. In it, he mentioned a method by which one could go about verifying the Riemann Hypothesis with a computer. I wanted to see if I could do this with SymPy and mpmath. It turns out you can.

Background Mathematics

Euler's Product Formula

Before we get to the computations, let's go over some mathematical background. As you may know, the Riemann Hypothesis is one of the 7 Millennium Prize Problems outlined by the Clay Mathematics Institute in 2000. The problems have gained some fame because each problem comes with a $1,000,000 prize if solved. One problem, the Poincaré conjecture, has already been solved (Grigori Perelman who solved it turned down the 1 million dollar prize). The remainder remain unsolved.

The Riemann Hypothesis is one of the most famous of these problems. The reason for this is that the problem is central many open questions in number theory. There are hundreds of theorems which are only known to be true contingent on the Riemann Hypothesis, meaning that if the Riemann Hypothesis were proven, immediately hundreds of theorems would be proven as well. Also, unlike some other Millennium Prize problems, like P=NP, the Riemann Hypothesis is almost universally believed to be true by mathematicians. So it's not a question of whether or not it is true, just one of how to actually prove it. The problem has been open for over 160 years, and while many advances have been made, no one has yet come up with a proof of it (crackpot proofs aside).

To understand the statement of the hypothesis, we must first define the zeta function. Let

$$\zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s}$$

(that squiggle $\zeta$ is the lowercase Greek letter zeta). This expression makes sense if $s$ is an integer greater than or equal to 2, $s=2, 3, 4, \ldots$, since we know from simple arguments from calculus that the summation converges in those cases (it isn't important for us what those values are, only that the summation converges). The story begins with Euler, who in 1740 considered the following infinite product:

$$\prod_{\text{$p$ prime}}\frac{1}{1 - \frac{1}{p^s}}.$$

The product ranges over all prime numbers, i.e., it is $$\left(\frac{1}{1 - \frac{1}{2^s}}\right)\cdot\left(\frac{1}{1 - \frac{1}{3^s}}\right)\cdot\left(\frac{1}{1 - \frac{1}{5^s}}\right)\cdots.$$ The fraction $\frac{1}{1 - \frac{1}{p}}$ may seem odd at first, but consider the famous geometric series formula, $$\sum_{k=0}^\infty r^k = \frac{1}{1 - r},$$ which is true for $|r| < 1$. Our fraction is exactly of this form, with $r = \frac{1}{p^s}$. So substituting, we have

$$\prod_{\text{$p$ prime}}\frac{1}{1 - \frac{1}{p^s}} = \prod_{\text{$p$ prime}}\sum_{k=0}^\infty \left(\frac{1}{p^s}\right)^k = \prod_{\text{$p$ prime}}\sum_{k=0}^\infty \left(\frac{1}{p^k}\right)^s.$$

Let's take a closer look at what this is. It is

$$\left(1 + \frac{1}{p_1^s} + \frac{1}{p_1^{2s}} + \frac{1}{p_1^{3s}} + \cdots\right)\cdot\left(1 + \frac{1}{p_2^s} + \frac{1}{p_2^{2s}} + \frac{1}{p_2^{3s}} + \cdots\right)\cdot\left(1 + \frac{1}{p_3^s} + \frac{1}{p_3^{2s}} + \frac{1}{p_3^{3s}} + \cdots\right)\cdots,$$

where $p_1$ is the first prime, $p_2$ is the second prime, and so on. Now think about how to expand finite products of finite sums, for instance, $$(x_1 + x_2 + x_3)(y_1 + y_2 + y_3)(z_1 + z_2 + z_3).$$ To expand the above, you would take a sum of every combination where you pick one $x$ term, one $y$ term, and one $z$ term, giving

$$x_1y_1z_1 + x_1y_1z_2 + \cdots + x_2y_1z_3 + \cdots + x_3y_2z_1 + \cdots + x_3y_3z_3.$$

So to expand the infinite product, we do the same thing. We take every combination of picking $1/p_i^{ks}$, with one $k$ for each $i$. If we pick infinitely many non-$1$ powers, the product will be zero, so we only need to consider terms where there are finitely many primes. The resulting sum will be something like

$$\frac{1}{1^s} + \frac{1}{p_1^s} + \frac{1}{p_2^s} + \frac{1}{\left(p_1^2\right)^s} + \frac{1}{p_3^s} + \frac{1}{\left(p_1p_2\right)^s} + \cdots,$$

where each prime power combination is picked exactly once. However, we know by the Fundamental Theorem of Arithmetic that when you take all combinations of products of primes that you get each positive integer exactly once. So the above sum is just

$$\frac{1}{1^s} + \frac{1}{2^s} + \frac{1}{3^s} + \cdots,$$ which is just $\zeta(s)$ as we defined it above.

In other words,

$$\zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s} = \prod_{\text{$p$ prime}}\frac{1}{1 - \frac{1}{p^s}},$$ for $s = 2, 3, 4, \ldots$. This is known as Euler's product formula for the zeta function. Euler's product formula gives us our first clue as to why the zeta function can give us insights into prime numbers.

Analytic Continuation

In 1859, Bernhard Riemann wrote a short 9 page paper on number theory and the zeta function. It was the only paper Riemann ever wrote on the subject of number theory, but it is undoubtedly one of the most important papers every written on the subject.

In the paper, Riemann considered that the zeta function summation,

$$\zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s},$$

makes sense not just for integers $s = 2, 3, 4, \ldots$, but for any real number $s > 1$ (if $s = 1$, the summation is the harmonic series, which famously diverges). In fact, it is not hard to see that for complex $s$, the summation makes sense so long as $\mathrm{Re}(s) > 1$ (for more about what it even means for $s$ to be complex in that formula, and the basic ideas of analytic continuation, I recommend 3Blue1Brown's video from my YouTube playlist).

Riemann wanted to extend this function to the entire complex plane, not just $\mathrm{Re}(s) > 1$. The process of doing this is called analytic continuation. The theory of complex analysis tells us that if we can find an extension of $\zeta(s)$ to the whole complex plan that remains differentiable, then that extension is unique, and we can reasonably say that that is the definition of the function everywhere.

Riemann used the following approach. Consider what we might call the "completed zeta function"

$$Z(s) = \pi^{-\frac{s}{2}}\Gamma\left(\frac{s}{2}\right)\zeta(s).$$

Using Fourier analysis, Riemann gave a formula for $Z(s)$ that is defined everywhere, allowing us to use it to define $\zeta(s)$ to the left of 1. I won't repeat Riemann's formula for $Z(s)$ as the exact formula isn't important, but from it one could also see

  1. $Z(s)$ is defined everywhere in the complex plane, except for simple poles at 0 and 1.

  2. $Z(s) = Z(1 - s).$ This means if we have a value for $s$ that is right of the line $\mathrm{Re}(z) = \frac{1}{2},$ we can get a value to the left of it by reflecting it over the real-axis and the line at $\frac{1}{2}$ (to see this, note that the average of $s$ and $1 - s$ is $1/2$, so the midpoint of a line connecting the two should always go through the point $1/2$).

Reflection of s and 1 - s

(Reflection of $s$ and $1 - s$. Created with Geogebra)


Looking at $Z(s)$, it is a product of three parts. So the zeros and poles of $Z(s)$ correspond to the zeros and poles of these parts, unless they cancel. $\pi^{-\frac{s}{2}}$ is the easiest: it has no zeros and no poles. The second part is the gamma function. $\Gamma(z)$ has no zeros and has simple poles at nonpositive integers $z=0, -1, -2, \ldots$.

So taking this, along with the fact that $Z(s)$ is entire except for simple poles at 0 and 1, we get from $$\zeta(s) = \frac{Z(s)}{\pi^{-\frac{s}{2}}\Gamma\left(\frac{s}{2}\right)}$$

  1. $Z(s)$ has a simple pole at 1, which means that $\zeta(s)$ does as well. This is not surprising, since we already know the summation formula from above diverges as $s$ approaches 1.
  2. $Z(s)$ has a simple pole at 0. Since $\Gamma\left(\frac{s}{2}\right)$ also has a simple pole at 0, they must cancel and $\zeta(s)$ must have neither a zero nor a pole at 0 (in fact, $\zeta(0) = -1/2$).
  3. Since $\Gamma\left(\frac{s}{2}\right)$ has no zeros, there are no further poles of $\zeta(s)$. Thus, $\zeta(s)$ is entire everywhere except for a simple pole at $s=1$.
  4. $\Gamma\left(\frac{s}{2}\right)$ has poles at the remaining negative even integers. Since $Z(s)$ has no poles there, these must correspond to zeros of $\zeta(s)$. These are the so-called "trivial" zeros of the zeta function, at $s=-2, -4, -6, \ldots$. The term "trivial" here is a relative one. They are trivial to see from the above formula, whereas other zeros of $\zeta(s)$ are much harder to find.
  5. $\zeta(s) \neq 0$ if $\mathrm{Re}(s) > 1$. One way to see this is from the Euler product formula. Since each term in the product is not zero, the function itself cannot be zero (this is a bit hand-wavy, but it can be made rigorous). This implies that $Z(s) \neq 0$ in this region as well. We can reflect $\mathrm{Re}(s) > 1$ over the line at $\frac{1}{2}$ by considering $\zeta(1 - s)$. Using the above formula and the fact that $Z(s) = Z(1 - s)$, we see that $\zeta(s)$ cannot be zero for $\mathrm{Re}(s) < 0$ either, with the exception of the aforementioned trivial zeros at $s=-2, -4, -6, \ldots$.

Thus, any non-trivial zeros of $\zeta(s)$ must have real part between 0 and 1. This is the so-called "critical strip". Riemann hypothesized that these zeros are not only between 0 and 1, but are in fact on the line dividing the strip at real part equal to $1/2$. This line is called the "critical line". This is Riemann's famous hypothesis: that all the non-trivial zeros of $\zeta(s)$ have real part equal to $1/2$.

Computational Verification

Whenever you have a mathematical hypothesis, it is good to check if it is true numerically. Riemann himself used some methods (not the same ones we use here) to numerically estimate the first few non-trivial zeros of $\zeta(s)$, and found that they lied on the critical line, hence the motivation for his hypothesis. Here is an English translation of his original paper if you are interested.

If we verified that all the zeros in the critical strip from, say, $\mathrm{Im}(s) = 0$ to $\mathrm{Im}(s) = N$ are in fact on the critical line for some large $N$, then it would give evidence that the Riemann Hypothesis is true. However, to be sure, this would not constitute a proof. Hardy showed in 1914 that $\zeta(s)$ has infinitely many zeros on the critical strip, so only finding finitely many of them would not suffice as a proof. (Although if we were to find a counter-example, a zero not on the critical line, that WOULD constitute a proof that the Hypothesis is false. However, there are strong reasons to believe that the hypothesis is not false, so this would be unlikely to happen.)

How would we verify that the zeros are all on the line $1/2$. We can find zeros of $\zeta(s)$ numerically, but how would we know if the real part is really exactly 0.5 and not 0.500000000000000000000000000000000001? And more importantly, just because we find some zeros, it doesn't mean that we have all of them. Maybe we can find a bunch of zeros on the critical line, but how would we be sure that there aren't other zeros lurking around elsewhere on the critical strip?

We want to find rigorous answers to these two questions:

  1. How can we count the number of zeros between $\mathrm{Im}(s) = 0$ and $\mathrm{Im}(s) = N$ of $\zeta(s)$?

  2. How can we verify that all those zeros lie on the critical line, that is, they have real part equal to exactly $1/2$?

Counting Zeros Part 1

To answer the first question, we can make use of a powerful theorem from complex analysis called the argument principle. The argument principle says that if $f$ is a meromorphic function on some closed contour $C$, and does not have any zeros or poles on $C$ itself, then

$$\frac{1}{2\pi i}\oint_C \frac{f'(z)}{f(z)}\,dz = \#\left\{\text{zeros of $f$ inside of C}\right\} - \#\left\{\text{poles of $f$ inside of C}\right\},$$ where all zeros and poles are counted with multiplicity.

In other words, the integral on the left-hand side counts the number of zeros of $f$ minus the number of poles of $f$ in a region. The argument principle is quite easy to show given the Cauchy residue theorem (see the above linked Wikipedia article for a proof). The expression $f'(z)/f(z)$ is called the "logarithmic derivative of $f$", because it equals $\frac{d}{dz} \log(f(z))$ (although it makes sense even without defining what "$\log$" means).

One should take a moment to appreciate the beauty of this result. The left-hand side is an integral, something we generally think of as being a continuous quantity. But it is always exactly equal to an integer. Results such as these give us a further glimpse at how analytic functions and complex analysis can produce theorems about number theory, a field which one would naively think can only be studied via discrete means. In fact, these methods are far more powerful than discrete methods. For many results in number theory, we only know how to prove them using complex analytic means. So-called "elementary" proofs for these results, or proofs that only use discrete methods and do not use complex analysis, have not yet been found.

Practically speaking, the fact that the above integral is exactly an integer means that if we compute it numerically and it comes out to something like 0.9999999, we know that it must in fact equal exactly 1. So as long as we get a result that is near an integer, we can round it to the exact answer.

We can integrate a contour along the critical strip up to some $\mathrm{Im}(s) = N$ to count the number of zeros up to $N$ (we have to make sure to account for the poles. I go into more details about this when I actually compute the integral below).

Counting Zeros Part 2

So using the argument principle, we can count the number of zeros in a region. Now how can we verify that they all lie on the critical line? The answer lies in the $Z(s)$ function defined above. By the points outlined in the previous section, we can see that $Z(s)$ is zero exactly where $\zeta(s)$ is zero on the critical strip, and it is not zero anywhere else. In other words,

the zeros of $Z(s)$ are exactly the non-trivial zeros of $\zeta(s)$.

This helps us because $Z(s)$ has a nice property on the critical line. First we note that $Z(s)$ commutes with conjugation, that is $\overline{Z(s)} = Z(\overline{s})$ (this isn't obvious from what I have shown, but it is true). On the critical line $\frac{1}{2} + it$, we have

$$\overline{Z\left(\frac{1}{2} + it\right)} = Z\left(\overline{\frac{1}{2} + it}\right) = Z\left(\frac{1}{2} - it\right).$$

However, $Z(s) = Z(1 - s)$, and $1 - \left(\frac{1}{2} - it\right) = \frac{1}{2} + it$, so

$$\overline{Z\left(\frac{1}{2} + it\right)} = Z\left(\frac{1}{2} + it\right),$$

which means that $Z\left(\frac{1}{2} + it\right)$ is real valued for real $t$.

This simplifies things a lot, because it is much easier to find zeros of a real function. In fact, we don't even care about finding the zeros, only counting them. Since $Z(s)$ is continuous, we can use a simple method: counting sign changes. If a continuous real function changes signs from negative to positive or from positive to negative n times in an interval, then it must have at least n zeros in that interval. It may have more, for instance, if some zeros are clustered close together, or if a zero has a multiplicity greater than 1, but we know that there must be at least n.

So our approach to verifying the Riemann Hypothesis is as such:

  1. Integrate $\frac{1}{2\pi i}\oint_C Z'(s)/Z(s)\,ds$ along a contour $C$ that runs along the critical strip up to some $\mathrm{Im}(s) = N$. The integral will tell us there are exactly $n$ zeros in the contour, counting multiplicity.

  2. Try to find $n$ sign changes of $Z(1/2 + it)$ for $t\in [0, N]$. If we can find $n$ of them, we are done. We have confirmed all the zeros are on the critical line.

Step 2 would fail if the Riemann Hypothesis is false, in which case a zero wouldn't be on the critical line. But it would also fail if a zero has a multiplicity greater than 1, since the integral would count it more times than the sign changes. Fortunately, as it turns out, the Riemann Hypothesis has been verified up to N = 10000000000000, and no one has yet found a zero of the zeta function yet that has a multiplicity greater than 1, so we should not expect that to happen (no one has yet found a counterexample to the Riemann Hypothesis either).

Verification with SymPy and mpmath

We now use SymPy and mpmath to compute the above quantities. We use SymPy to do symbolic manipulation for us, but the heavy work is done by mpmath. mpmath is a pure Python library for arbitrary precision numerics. It is used by SymPy under the hood, but it will be easier for us to use it directly. It can do, among other things, numeric integration. When I first tried to do this, I tried using the scipy.special zeta function, but unfortunately, it does not support complex arguments.

First we do some basic imports

>>> from sympy import *
>>> import mpmath
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> s = symbols('s')

Define the completed zeta function $Z = \pi^{-s/2}\Gamma(s/2)\zeta(s)$.

>>> Z = pi**(-s/2)*gamma(s/2)*zeta(s)

We can verify that Z is indeed real for $s = \frac{1}{2} + it.$

>>> Z.subs(s, 1/2 + 0.5j).evalf()
-1.97702795164031 + 5.49690501450151e-17*I

We get a small imaginary part due to the way floating point arithmetic works. Since it is below 1e-15, we can safely ignore it.

D will be the logarithmic derivative of Z.

>>> D = simplify(Z.diff(s)/Z)
>>> D
polygamma(0, s/2)/2 - log(pi)/2 + Derivative(zeta(s), s)/zeta(s)

This is $$\frac{\operatorname{polygamma}{\left(0,\frac{s}{2} \right)}}{2} - \frac{\log{\left(\pi \right)}}{2} + \frac{ \zeta'\left(s\right)}{\zeta\left(s\right)}.$$

Note that logarithmic derivatives behave similar to logarithms. The logarithmic derivative of a product is the sum of logarithmic derivatives (the $\operatorname{polygamma}$ function is the derivative of $\Gamma$).

We now use lambdify to convert the SymPy expressions Z and D into functions that are evaluated using mpmath. A technical difficulty here is that the derivative of the zeta function $\zeta'(s)$ does not have a closed-form expression. mpmath's zeta can evaluate $\zeta'$ but it doesn't yet work with sympy.lambdify (see SymPy issue 11802). So we have to manually define "Derivative" in lambdify, knowing that it will be the derivative of zeta when it is called. Beware that this is only correct for this specific expression where we know that Derivative will be Derivative(zeta(s), s).

>>> Z_func = lambdify(s, Z, 'mpmath')
>>> D_func = lambdify(s, D, modules=['mpmath',
...     {'Derivative': lambda expr, z: mpmath.zeta(z, derivative=1)}])

Now define a function to use the argument principle to count the number of zeros up to $Ni$. Due to the symmetry $Z(s) = Z(1 - s)$, it is only necessary to count zeros in the top half-plane.

We have to be careful about the poles of $Z(s)$ at 0 and 1. We can either integrate right above them, or expand the contour to include them. I chose to do the former, starting at $0.1i$. It is known that there $\zeta(s)$ has no zeros near the real axis on the critical strip. I could have also expanded the contour to go around 0 and 1, and offset the result by 2 to account for the integral counting those points as poles.

It has also been shown that there are no zeros on the lines $\mathrm{Re}(s) = 0$ or $\mathrm{Re}(s) = 1$, so we do not need to worry about that. If the upper point of our contour happens to have zeros exactly on it, we would be very unlucky, but even if this were to happen we could just adjust it up a little bit.

Our contour

(Our contour with $N=10$. Created with Geogebra)

mpmath.quad can take a list of points to compute a contour. The maxdegree parameter allows us to increase the degree of the quadrature if it becomes necessary to get an accurate result.

>>> def argument_count(func, N, maxdegree=6):
...     return 1/(2*mpmath.pi*1j)*(mpmath.quad(func,
...         [1 + 0.1j, 1 + N*1j, 0 + N*1j, 0 + 0.1j,  1 + 0.1j],
...         maxdegree=maxdegree))

Now let's test it. Lets count the zeros of $$s^2 - s + 1/2$$ in the box bounded by the above rectangle ($N = 10$).

>>> expr = s**2 - s + S(1)/2
>>> argument_count(lambdify(s, expr.diff(s)/expr), 10)
mpc(real='1.0', imag='3.4287545414000525e-24')

The integral is 1. We can confirm there is indeed one zero in this box, at $\frac{1}{2} + \frac{i}{2}$.

>>> solve(s**2 - s + S(1)/2)
[1/2 - I/2, 1/2 + I/2]

Now compute points of $Z$ along the critical line so we can count the sign changes. We also make provisions in case we have to increase the precision of mpmath to get correct results here. dps is the number of digits of precision the values are computed to. The default is 15, but mpmath can compute values to any number of digits. mpmath.chop zeros out values that are close to 0, which removes any numerically insignificant imaginary parts that arise from the floating point evaluation.

>>> def compute_points(Z_func, N, npoints=10000, dps=15):
...     import warnings
...     old_dps =
...     points = np.linspace(0, N, npoints)
...     try:
... = dps
...         L = [mpmath.chop(Z_func(i)) for i in 1/2 + points*1j]
...     finally:
... = old_dps
...     if L[-1] == 0:
...         # mpmath will give 0 if the precision is not high enough, since Z
...         # decays rapidly on the critical line.
...         warnings.warn("You may need to increase the precision")
...     return L

Next define a function to count the number of sign changes in a list of real values.

>>> def sign_changes(L):
...     """
...     Count the number of sign changes in L
...     Values of L should all be real.
...     """
...     changes = 0
...     assert im(L[0]) == 0, L[0]
...     s = sign(L[0])
...     for i in L[1:]:
...         assert im(i) == 0, i
...         s_ = sign(i)
...         if s_ == 0:
...             # Assume these got chopped to 0
...             continue
...         if s_ != s:
...             changes += 1
...         s = s_
...     return changes

For example, for $\sin(s)$ from -10 to 10, there are 7 zeros ($3\pi\approx 9.42$).

>>> sign_changes(lambdify(s, sin(s))(np.linspace(-10, 10)))

Now we can check how many zeros of $Z(s)$ (and hence non-trivial zeros of $\zeta(s)$) we can find. According to Wikipedia, the first few non-trivial zeros of $\zeta(s)$ in the upper half-plane are 14.135, 21.022, and 25.011.

First try up to $N=20$.

>>> argument_count(D_func, 20)
mpc(real='0.99999931531867581', imag='-3.2332902529067346e-24')

Mathematically, the above value must be an integer, so we know it is 1.

Now check the number of sign changes of $Z(s)$ from $\frac{1}{2} + 0i$ to $\frac{1}{2} + 20i$.

>>> L = compute_points(Z_func, 20)
>>> sign_changes(L)

So it checks out. There is one zero between $0$ and $20i$ on the critical strip, and it is in fact on the critical line, as expected!

Now let's verify the other two zeros from Wikipedia.

>>> argument_count(D_func, 25)
mpc(real='1.9961479945577916', imag='-3.2332902529067346e-24')
>>> L = compute_points(Z_func, 25)
>>> sign_changes(L)
>>> argument_count(D_func, 30)
mpc(real='2.9997317058520916', imag='-3.2332902529067346e-24')
>>> L = compute_points(Z_func, 30)
>>> sign_changes(L)

Both check out as well.

Since we are computing the points, we can go ahead and make a plot as well. However, there is a technical difficulty. If you naively try to plot $Z(1/2 + it)$, you will find that it decays rapidly, so fast that you cannot really tell where it crosses 0:

>>> def plot_points_bad(L, N):
...     npoints = len(L)
...     points = np.linspace(0, N, npoints)
...     plt.figure()
...     plt.plot(points, L)
...     plt.plot(points, [0]*npoints, linestyle=':')
>>> plot_points_bad(L, 30)

So instead of plotting $Z(1/2 + it)$, we plot $\log(|Z(1/2 + it)|)$. The logarithm will make the zeros go to $-\infty$, but these will be easy to see.

>>> def plot_points(L, N):
...     npoints = len(L)
...     points = np.linspace(0, N, npoints)
...     p = [mpmath.log(abs(i)) for i in L]
...     plt.figure()
...     plt.plot(points, p)
...     plt.plot(points, [0]*npoints, linestyle=':')
>>> plot_points(L, 30)

The spikes downward are the zeros.

Finally, let's check up to N=100. OEIS A072080 gives the number of zeros of $\zeta(s)$ in upper half-plane up to $10^ni$. According to it, we should get 29 zeros between $0$ and $100i$.

>>> argument_count(D_func, 100)
mpc(real='28.248036536895913', imag='-3.2332902529067346e-24')

This is not near an integer. This means we need to increase the precision of the quadrature (the maxdegree argument).

>>> argument_count(D_func, 100, maxdegree=9)
mpc(real='29.000000005970151', imag='-3.2332902529067346e-24')

And the sign changes...

>>> L = compute_points(Z_func, 100)
__main__:11: UserWarning: You may need to increase the precision

Our guard against the precision being too low was triggered. Try raising it (the default dps is 15).

>>> L = compute_points(Z_func, 100, dps=50)
>>> sign_changes(L)

They both give 29. So we have verified the Riemann Hypothesis up to $100i$!

Here is a plot of these 29 zeros.

>>> plot_points(L, 100)

(remember that the spikes downward are the zeros)


$N=100$ takes a few minutes to compute, and I imagine larger and larger values would require increasing the precision more, slowing it down even further, so I didn't go higher than this. But it is clear that this method works.

This was just me playing around with SymPy and mpmath, but if I wanted to actually verify the Riemann Hypothesis, I would try to find a more efficient method of computing the above quantities. For the sake of simplicity, I used $Z(s)$ for both the argument principle and sign changes computations, but it would have been more efficient to use $\zeta(s)$ for the argument principle integral, since it has a simpler formula. It would also be useful if there were a formula with similar properties to $Z(s)$ (real on the critical line with the same zeros as $\zeta(s)$), but that did not decay as rapidly.

Furthermore, for the argument principle integral, I would like to see precise error estimates for the integral. We saw above with $N=100$ with the default quadrature that we got a value of 28.248, which is not close to an integer. This tipped us off that we should increase the quadrature, which ended up giving us the right answer, but if the original number happened to be close to an integer, we might have been fooled. Ideally, one would like know the exact quadrature degree needed. If you can get error estimates guaranteeing the error for the integral will be less than 0.5, you can always round the answer to the nearest integer. For the sign changes, you don't need to be as rigorous, because simply seeing as many sign changes as you have zeros is sufficient. However, one could certainly be more efficient in computing the values along the interval, rather than just naively computing 10000 points and raising the precision until it works, as I have done.

One would also probably want to use a faster integrator than mpmath (like one written in C), and perhaps also find a faster to evaluate expression than the one I used for $Z(s)$. It is also possible that one could special-case the quadrature algorithm knowing that it will be computed on $\zeta'(s)/\zeta(s)$.

In this post I described the Riemann zeta function and the Riemann Hypothesis, and showed how to computationally verify it. But I didn't really go over the details of why the Riemann Hypothesis matters. I encourage you to watch the videos in my YouTube playlist if you want to know this. Among other things, the truth of the Riemann Hypothesis would give a very precise bound on the distribution of prime numbers. Also, the non-trivial zeros of $\zeta(s)$ are, in some sense, the "spectrum" of the prime numbers, meaning they exactly encode the position of every prime on the number line.

Quansight Labs Work Update for September, 2019

This post has been cross-posted on the Quansight Labs Blog.

As of November, 2018, I have been working at Quansight. Quansight is a new startup founded by the same people who started Anaconda, which aims to connect companies and open source communities, and offers consulting, training, support and mentoring services. I work under the heading of Quansight Labs. Quansight Labs is a public-benefit division of Quansight. It provides a home for a "PyData Core Team" which consists of developers, community managers, designers, and documentation writers who build open-source technology and grow open-source communities around all aspects of the AI and Data Science workflow.

My work at Quansight is split between doing open source consulting for various companies, and working on SymPy. SymPy, for those who do not know, is a symbolic mathematics library written in pure Python. I am the lead maintainer of SymPy.

In this post, I will detail some of the open source work that I have done recently, both as part of my open source consulting, and as part of my work on SymPy for Quansight Labs.

Bounds Checking in Numba

As part of work on a client project, I have been working on contributing code to the numba project. Numba is a just-in-time compiler for Python. It lets you write native Python code and with the use of a simple @jit decorator, the code will be automatically sped up using LLVM. This can result in code that is up to 1000x faster in some cases:

In [1]: import numba

In [2]: import numpy

In [3]: def test(x):
   ...:     A = 0
   ...:     for i in range(len(x)):
   ...:         A += i*x[i]
   ...:     return A

In [4]: @numba.njit
   ...: def test_jit(x):
   ...:     A = 0
   ...:     for i in range(len(x)):
   ...:         A += i*x[i]
   ...:     return A

In [5]: x = numpy.arange(1000)

In [6]: %timeit test(x)
249 µs ± 5.77 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [7]: %timeit test_jit(x)
336 ns ± 0.638 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [8]: 249/.336
Out[8]: 741.0714285714286

Numba only works for a subset of Python code, and primarily targets code that uses NumPy arrays.

Numba, with the help of LLVM, achieves this level of performance through many optimizations. One thing that it does to improve performance is to remove all bounds checking from array indexing. This means that if an array index is out of bounds, instead of receiving an IndexError, you will get garbage, or possibly a segmentation fault.

>>> import numpy as np
>>> from numba import njit
>>> def outtabounds(x):
...     A = 0
...     for i in range(1000):
...         A += x[i]
...     return A
>>> x = np.arange(100)
>>> outtabounds(x) # pure Python/NumPy behavior
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in outtabounds
IndexError: index 100 is out of bounds for axis 0 with size 100
>>> njit(outtabounds)(x) # the default numba behavior

In numba pull request #4432, I am working on adding a flag to @njit that will enable bounds checks for array indexing. This will remain disabled by default for performance purposes. But you will be able to enable it by passing boundscheck=True to @njit, or by setting the NUMBA_BOUNDSCHECK=1 environment variable. This will make it easier to detect out of bounds issues like the one above. It will work like

>>> @njit(boundscheck=True)
... def outtabounds(x):
...     A = 0
...     for i in range(1000):
...         A += x[i]
...     return A
>>> x = np.arange(100)
>>> outtabounds(x) # numba behavior in my pull request #4432
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: index is out of bounds

The pull request is still in progress, and many things such as the quality of the error message reporting will need to be improved. This should make debugging issues easier for people who write numba code once it is merged.


removestar is a new tool I wrote to automatically replace import * in Python modules with explicit imports.

For those who don't know, Python's import statement supports so-called "wildcard" or "star" imports, like

from sympy import *

This will import every public name from the sympy module into the current namespace. This is often useful because it saves on typing every name that is used in the import line. This is especially useful when working interactively, where you just want to import every name and minimize typing.

However, doing from module import * is generally frowned upon in Python. It is considered acceptable when working interactively at a python prompt, or in files (removestar skips files by default).

Some reasons why import * is bad:

  • It hides which names are actually imported.
  • It is difficult both for human readers and static analyzers such as pyflakes to tell where a given name comes from when import * is used. For example, pyflakes cannot detect unused names (for instance, from typos) in the presence of import *.
  • If there are multiple import * statements, it may not be clear which names come from which module. In some cases, both modules may have a given name, but only the second import will end up being used. This can break people's intuition that the order of imports in a Python file generally does not matter.
  • import * often imports more names than you would expect. Unless the module you import defines __all__ or carefully dels unused names at the module level, import * will import every public (doesn't start with an underscore) name defined in the module file. This can often include things like standard library imports or loop variables defined at the top-level of the file. For imports from modules (from, from module import * will include every submodule defined in that module. Using __all__ in modules and files is also good practice, as these things are also often confusing even for interactive use where import * is acceptable.
  • In Python 3, import * is syntactically not allowed inside of a function definition.

Here are some official Python references stating not to use import * in files:

  • The official Python FAQ:

    In general, don’t use from modulename import *. Doing so clutters the importer’s namespace, and makes it much harder for linters to detect undefined names.

  • PEP 8 (the official Python style guide):

    Wildcard imports (from <module> import *) should be avoided, as they make it unclear which names are present in the namespace, confusing both readers and many automated tools.

Unfortunately, if you come across a file in the wild that uses import *, it can be hard to fix it, because you need to find every name in the file that is imported from the * and manually add an import for it. Removestar makes this easy by finding which names come from * imports and replacing the import lines in the file automatically.

As an example, suppose you have a module mymod like



# mymod/
from .b import *

def func(x):
    return x + y


# mymod/
x = 1
y = 2

Then removestar works like:

$ removestar -i mymod/
$ cat mymod/
# mymod/
from .b import y

def func(x):
    return x + y

The -i flag causes it to edit in-place. Without it, it would just print a diff to the terminal.

For implicit star imports and explicit star imports from the same module, removestar works statically, making use of pyflakes. This means none of the code is actually executed. For external imports, it is not possible to work statically as external imports may include C extension modules, so in that case, it imports the names dynamically.

removestar can be installed with pip or conda:

pip install removestar

or if you use conda

conda install -c conda-forge removestar


In SymPy, we make heavy use of LaTeX math in our documentation. For example, in our special functions documentation, most special functions are defined using a LaTeX formula, like The docs for besselj


However, the source for this math in the docstring of the function uses RST syntax:

class besselj(BesselBase):
    Bessel function of the first kind.

    The Bessel `J` function of order `\nu` is defined to be the function
    satisfying Bessel's differential equation

    .. math ::
        z^2 \frac{\mathrm{d}^2 w}{\mathrm{d}z^2}
        + z \frac{\mathrm{d}w}{\mathrm{d}z} + (z^2 - \nu^2) w = 0,

    with Laurent expansion

    .. math ::
        J_\nu(z) = z^\nu \left(\frac{1}{\Gamma(\nu + 1) 2^\nu} + O(z^2) \right),

    if :math:`\nu` is not a negative integer. If :math:`\nu=-n \in \mathbb{Z}_{<0}`
    *is* a negative integer, then the definition is

    .. math ::
        J_{-n}(z) = (-1)^n J_n(z).

Furthermore, in SymPy's documentation we have configured it so that text between `single backticks` is rendered as math. This was originally done for convenience, as the alternative way is to write :math:`\nu` every time you want to use inline math. But this has lead to many people being confused, as they are used to Markdown where `single backticks` produce code.

A better way to write this would be if we could delimit math with dollar signs, like $\nu$. This is how things are done in LaTeX documents, as well as in things like the Jupyter notebook.

With the new sphinx-math-dollar Sphinx extension, this is now possible. Writing $\nu$ produces $\nu$, and the above docstring can now be written as

class besselj(BesselBase):
    Bessel function of the first kind.

    The Bessel $J$ function of order $\nu$ is defined to be the function
    satisfying Bessel's differential equation

    .. math ::
        z^2 \frac{\mathrm{d}^2 w}{\mathrm{d}z^2}
        + z \frac{\mathrm{d}w}{\mathrm{d}z} + (z^2 - \nu^2) w = 0,

    with Laurent expansion

    .. math ::
        J_\nu(z) = z^\nu \left(\frac{1}{\Gamma(\nu + 1) 2^\nu} + O(z^2) \right),

    if $\nu$ is not a negative integer. If $\nu=-n \in \mathbb{Z}_{<0}$
    *is* a negative integer, then the definition is

    .. math ::
        J_{-n}(z) = (-1)^n J_n(z).

We also plan to add support for $$double dollars$$ for display math so that .. math :: is no longer needed either .

For end users, the documentation on will continue to render exactly the same, but for developers, it is much easier to read and write.

This extension can be easily used in any Sphinx project. Simply install it with pip or conda:

pip install sphinx-math-dollar


conda install -c conda-forge sphinx-math-dollar

Then enable it in your

extensions = ['sphinx_math_dollar', 'sphinx.ext.mathjax']

Google Season of Docs

The above work on sphinx-math-dollar is part of work I have been doing to improve the tooling around SymPy's documentation. This has been to assist our technical writer Lauren Glattly, who is working with SymPy for the next three months as part of the new Google Season of Docs program. Lauren's project is to improve the consistency of our docstrings in SymPy. She has already identified many key ways our docstring documentation can be improved, and is currently working on a style guide for writing docstrings. Some of the issues that Lauren has identified require improved tooling around the way the HTML documentation is built to fix. So some other SymPy developers and I have been working on improving this, so that she can focus on the technical writing aspects of our documentation.

Lauren has created a draft style guide for documentation at Please take a moment to look at it and if you have any feedback on it, comment below or write to the SymPy mailing list.