
Thomas Bloom’s erdosproblems.com site hosts nearly a thousand questions that originated with, or were communicated by, Paul Erdős, as well as the current status of these questions (about a third of which are currently solved). The site is now a couple of years old, and has been steadily adding features, the most recent of which has been a discussion forum for each individual question. For instance, a discussion I had with Stijn Cambie and Vjeko Kovac on one of these problems recently led to it being solved (and even formalized in Lean!).

A significantly older site is the On-line Encyclopedia of Integer Sequences (OEIS), which records hundreds of thousands of integer sequences that some mathematician has encountered at some point. It is a highly useful resource, enabling researchers to discover relevant literature for a given problem, so long as they can calculate enough terms of some integer sequence that is “canonically” attached to that problem to search for it in the OEIS.

A large fraction of the problems on the Erdos problem webpage involve (either explicitly or implicitly) some sort of integer sequence – typically the largest or smallest size {f(n)} of some {n}-dependent structure (such as a graph on {n} vertices, or a subset of {\{1,\dots,n\}}) that obeys a certain property. In some cases, the sequence is already in the OEIS, and this is noted on the Erdos problem web page. But in a large number of cases, the sequence either has not yet been entered into the OEIS, or it does appear there but has not yet been noted on the Erdos web page.

Thomas Bloom and I are therefore proposing a crowdsourced project to systematically compute the hundreds of sequences associated to the Erdos problems and cross-check them against the OEIS. We have created a github repository to coordinate this process; as a by-product, this repository will also be tracking other relevant statistics about the Erdos problem website, such as the current status of formalizing the statements of these problems in the Formal Conjectures Repository.

The main feature of our repository is a large table recording the current status of each Erdos problem. For instance, Erdos problem #3 is currently listed as open, and additionally has the status of linkage with the OEIS listed as “possible”. This means that there are one or more sequences attached to this problem which *might* already be in the OEIS, or would be suitable for submission to the OEIS. Specifically, if one reads the commentary for that problem, one finds mention of the functions {r_k(N)} for {k=3,4,\dots}, defined as the size of the largest subset of {\{1,\dots,N\}} without a {k}-term progression. It is likely that several of the sequences {r_3(N)}, {r_4(N)}, etc. are in the OEIS, but it is a matter of locating them, either by searching for key words, or by calculating the first few values of these sequences and then looking for a match. (EDIT: a contributor has noted that the first four sequences appear as A003002, A003003, A003004, and A003005 in the OEIS, and the table has been updated accordingly.)
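For illustration, here is a minimal brute-force sketch in Python (my own throwaway code, not part of the project repository, and far from optimized) that computes the first few values of {r_3(N)}; pasting the printed values into the OEIS search box is then enough to locate the corresponding entry (A003002, per the edit above). The function names are of course my own invention.

```python
# Brute-force computation of r_3(N): the largest size of a subset of {1,...,N}
# containing no three-term arithmetic progression. Exponential in N, so only
# suitable for generating the first dozen or so terms to feed into an OEIS search.
from itertools import combinations

def has_k_term_ap(s, k):
    """Does the set s contain a k-term arithmetic progression (common difference >= 1)?"""
    s = set(s)
    return any(all(a + i * d in s for i in range(k))
               for a in s for d in range(1, max(s)))

def r(k, N):
    """Largest subset of {1,...,N} with no k-term AP (naive search; small N only)."""
    for size in range(N, 0, -1):
        if any(not has_k_term_ap(c, k) for c in combinations(range(1, N + 1), size)):
            return size
    return 0

print([r(3, N) for N in range(1, 11)])   # search these values in the OEIS
```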

We have set things up so that new contributions (such as the addition of an OEIS number to the table) can be made by a Github pull request, specifically to modify this YAML file. Alternatively, one can create a Github issue for such changes, or simply leave a comment either on the appropriate Erdos problem forum page, or here on this blog.

Many of the sequences do not require advanced mathematical training to compute, and so we hope that this will be a good “citizen mathematics” project that can bring in the broader math-adjacent community to contribute to research-level mathematics problems, by providing experimental data, and potentially locating relevant references or connections that would otherwise be overlooked. This may also be a use case for AI assistance in mathematics through generating code to calculate the sequences in question, although of course one should always stay mindful of potential bugs or hallucinations in any AI-generated code, and find ways to independently verify the output. (But if the AI-generated sequence leads to a match with an existing sequence in the OEIS that is clearly relevant to the problem, then the task has been successfully accomplished, and no AI output needs to be directly incorporated into the database in such cases.)

This is an experimental project, and we may need to adjust the workflow as the project progresses, but we hope that it will be successful and lead to further progress on some fraction of these problems. The comment section of this blog can be used as a general discussion forum for the project, while the github issue page and the erdosproblems.com forum pages can be used for more specialized discussions of specific problems.

Traditionally, mathematics research projects are conducted by a small number (typically one to five) of expert mathematicians, each of whom is familiar enough with all aspects of the project that they can verify each other’s contributions. It has been challenging to organize mathematical projects at larger scales, and particularly those that involve contributions from the general public, due to the need to verify all of the contributions; a single error in one component of a mathematical argument could invalidate the entire project. Furthermore, the sophistication of a typical math project is such that it would not be realistic to expect a member of the public, with say an undergraduate level of mathematics education, to contribute in a meaningful way to many such projects.

For related reasons, it is also challenging to incorporate assistance from modern AI tools into a research project, as these tools can “hallucinate” plausible-looking, but nonsensical, arguments, which therefore need additional verification before they can be added to the project.

Proof assistant languages, such as Lean, provide a potential way to overcome these obstacles, and allow for large-scale collaborations involving professional mathematicians, the broader public, and/or AI tools to all contribute to a complex project, provided that it can be broken up in a modular fashion into smaller pieces that can be attacked without necessarily understanding all aspects of the project as a whole. Projects to formalize an existing mathematical result (such as the formalization of the recent proof of the PFR conjecture of Marton, discussed in this previous blog post) are currently the main examples of such large-scale collaborations that are enabled via proof assistants. At present, these formalizations are mostly crowdsourced by human contributors (which include both professional mathematicians and interested members of the general public), but there are also some nascent efforts to incorporate more automated tools (either “good old-fashioned” automated theorem provers, or more modern AI-based tools) to assist with the (still quite tedious) task of formalization.

However, I believe that this sort of paradigm can also be used to explore new mathematics, as opposed to formalizing existing mathematics. The online collaborative “Polymath” projects that several people including myself organized in the past are one example of this; but as they did not incorporate proof assistants into the workflow, the contributions had to be managed and verified by the human moderators of the project, which was quite a time-consuming responsibility, and one which limited the ability to scale these projects up further. But I am hoping that the addition of proof assistants will remove this bottleneck.

I am particularly interested in the possibility of using these modern tools to explore a class of many mathematical problems at once, as opposed to the current approach of focusing on only one or two problems at a time. This seems like an inherently modularizable and repetitive task, which could particularly benefit from both crowdsourcing and automated tools, if given the right platform to rigorously coordinate all the contributions; and it is a type of mathematics that previous methods usually could not scale up to (except perhaps over a period of many years, as individual papers slowly explore the class one data point at a time until a reasonable intuition about the class is attained). Among other things, having a large data set of problems to work on could be helpful for benchmarking various automated tools and comparing the efficacy of different workflows.

One recent example of such a project was the Busy Beaver Challenge, which showed this July that the fifth Busy Beaver number {BB(5)} was equal to {47176870}. Some older crowdsourced computational projects, such as the Great Internet Mersenne Prime Search (GIMPS), are also somewhat similar in spirit to this type of project (though using more traditional proof of work certificates instead of proof assistants). I would be interested in hearing of any other extant examples of crowdsourced projects exploring a mathematical space, and whether there are lessons from those examples that could be relevant for the project I propose here.

More specifically I would like to propose the following (admittedly artificial) project as a pilot to further test out this paradigm, which was inspired by a MathOverflow question from last year, and discussed somewhat further on my Mastodon account shortly afterwards.

The problem is in the field of universal algebra, and concerns the (medium-scale) exploration of simple equational theories for magmas. A magma is nothing more than a set {G} equipped with a binary operation {\circ: G \times G \rightarrow G}. Initially, no additional axioms on this operation {\circ} are imposed, and as such magmas by themselves are somewhat boring objects. Of course, with additional axioms, such as the identity axiom or the associative axiom, one can get more familiar mathematical objects such as groups, semigroups, or monoids. Here we will be interested in (constant-free) equational axioms, which are axioms of equality involving expressions built from the operation {\circ} and one or more indeterminate variables in {G}. Two familiar examples of such axioms are the commutative axiom

\displaystyle  x \circ y = y \circ x

and the associative axiom

\displaystyle  (x \circ y) \circ z = x \circ (y \circ z),

where {x,y,z} are indeterminate variables in the magma {G}. On the other hand, the (left) identity axiom {e \circ x = x} would not be considered an equational axiom here, as it involves a constant {e \in G} (the identity element), and we are restricting attention to constant-free axioms.

To illustrate the project I have in mind, let me first introduce eleven examples of equational axioms for magmas:

  • Equation1: {x=y}
  • Equation2: {x \circ y = z \circ w}
  • Equation3: {x \circ y = x}
  • Equation4: {(x \circ x) \circ y = y \circ x}
  • Equation5: {x \circ (y \circ z) = (w \circ u) \circ v}
  • Equation6: {x \circ y = x \circ z}
  • Equation7: {x \circ y = y \circ x}
  • Equation8: {x \circ (y \circ z) = (x \circ w) \circ u}
  • Equation9: {x \circ (y \circ z) = (x \circ y) \circ w}
  • Equation10: {x \circ (y \circ z) = (x \circ y) \circ z}
  • Equation11: {x = x}
Thus, for instance, Equation7 is the commutative axiom, and Equation10 is the associative axiom. The trivial axiom Equation1 is the strongest, as it forces the magma {G} to have at most one element; at the opposite extreme, the reflexive axiom Equation11 is the weakest, being satisfied by every single magma.

One can then ask which axioms imply which others. For instance, Equation1 implies all the other axioms in this list, which in turn imply Equation11. Equation8 implies Equation9 as a special case, which in turn implies Equation10 as a special case. The full poset of implications can be depicted by the following Hasse diagram:

This in particular answers the MathOverflow question of whether there were equational axioms intermediate between the constant axiom Equation1 and the associative axiom Equation10.

Most of the implications here are quite easy to prove, but there is one non-trivial one, obtained in this answer to a MathOverflow post closely related to the preceding one:

Proposition 1 Equation4 implies Equation7.

Proof: Suppose that {G} obeys Equation4, thus

\displaystyle  (x \circ x) \circ y = y \circ x \ \ \ \ \ (1)

for all {x,y \in G}. Specializing to {y=x \circ x}, we conclude

\displaystyle (x \circ x) \circ (x \circ x) = (x \circ x) \circ x

and hence by another application of (1) we see that {x \circ x} is idempotent:

\displaystyle  (x \circ x) \circ (x \circ x) = x \circ x. \ \ \ \ \ (2)

Now, replacing {x} by {x \circ x} in (1) and then using (2), we see that

\displaystyle  (x \circ x) \circ y = y \circ (x \circ x),

so in particular {x \circ x} commutes with {y \circ y}:

\displaystyle  (x \circ x) \circ (y \circ y) = (y \circ y) \circ (x \circ x). \ \ \ \ \ (3)

Also, from two applications of (1) one has

\displaystyle  (x \circ x) \circ (y \circ y) = (y \circ y) \circ x = x \circ y.

Thus (3) simplifies to {x \circ y = y \circ x}, which is Equation7. \Box

A formalization of the above argument in Lean can be found here.
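For readers curious what such a formalization might look like, here is a short self-contained Lean 4 sketch of my own of the above argument (an illustrative reconstruction, not the linked formalization; the statement is phrased for an arbitrary binary operation `op` on a type `G`, and all names are my own):

```lean
-- Illustrative sketch of Proposition 1: Equation4 implies Equation7.
-- `h` is Equation4: (x ∘ x) ∘ y = y ∘ x, written with `op` in place of ∘.
theorem eq4_implies_eq7 {G : Type} (op : G → G → G)
    (h : ∀ x y : G, op (op x x) y = op y x) :
    ∀ x y : G, op x y = op y x := by
  -- (2): squares are idempotent
  have idem : ∀ x : G, op (op x x) (op x x) = op x x := by
    intro x
    rw [h x (op x x), h x x]
  -- squares commute with everything: (x ∘ x) ∘ y = y ∘ (x ∘ x)
  have sq_comm : ∀ x y : G, op (op x x) y = op y (op x x) := by
    intro x y
    calc op (op x x) y = op (op (op x x) (op x x)) y := by rw [idem x]
      _ = op y (op x x) := h (op x x) y
  -- two applications of (1): (x ∘ x) ∘ (y ∘ y) = x ∘ y
  have key : ∀ x y : G, op (op x x) (op y y) = op x y := by
    intro x y
    rw [h x (op y y), h y x]
  intro x y
  rw [← key x y, sq_comm x (op y y), key y x]
```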

I will remark that the general question of determining whether one set of equational axioms implies another is undecidable; see Theorem 14 of this paper of Perkins. (This is similar in spirit to the more well known undecidability of various word problems.) So, the situation here is somewhat similar to the Busy Beaver Challenge, in that past a certain point of complexity, we would necessarily encounter unsolvable problems; but hopefully there would be interesting problems and phenomena to discover before we reach that threshold.

The above Hasse diagram does not just assert implications between the listed equational axioms; it also asserts non-implications between the axioms. For instance, as seen in the diagram, the commutative axiom Equation7 does not imply the Equation4 axiom

\displaystyle  (x \circ x) \circ y = y \circ x.

To see this, one simply has to produce an example of a magma that obeys the commutative axiom Equation7, but not the Equation4 axiom; in this case one can simply choose (for instance) the natural numbers {{\bf N}} with the addition operation {x \circ y := x+y}, since addition is commutative, but {(x \circ x) \circ y = 2x+y} and {y \circ x = y+x} differ in general (e.g., for {x=1, y=2}). More generally, the diagram asserts the following non-implications, which (together with the indicated implications) completely describe the poset of implications between the eleven axioms:
  • Equation2 does not imply Equation3.
  • Equation3 does not imply Equation5.
  • Equation3 does not imply Equation7.
  • Equation5 does not imply Equation6.
  • Equation5 does not imply Equation7.
  • Equation6 does not imply Equation7.
  • Equation6 does not imply Equation10.
  • Equation7 does not imply Equation6.
  • Equation7 does not imply Equation10.
  • Equation9 does not imply Equation8.
  • Equation10 does not imply Equation9.
  • Equation10 does not imply Equation6.
The reader is invited to come up with counterexamples that demonstrate some of these non-implications. The hardest counterexample to find is the one showing that Equation9 does not imply Equation8: a solution (in Lean) can be found here. Proofs in Lean of all the above implications and anti-implications can be found in this github repository file.
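For those who would rather let a computer do some of the searching, here is a rough Python sketch (again my own illustrative code, not taken from the repository) that brute-forces all magmas on a small finite carrier set, looking for one that obeys one law but not another. Already on a two-element set it finds a witness that Equation7 does not imply Equation4 (for instance, XOR is commutative but fails Equation4), giving a finite alternative to the infinite example {({\bf N},+)} used above.

```python
# Brute-force search over all finite magmas of size <= max_size for a magma
# satisfying Equation7 (commutativity) but not Equation4: (x∘x)∘y = y∘x.
from itertools import product

def all_magmas(n):
    """All binary operations on {0,...,n-1}, as dicts mapping (x, y) to x∘y."""
    pairs = list(product(range(n), repeat=2))
    for values in product(range(n), repeat=len(pairs)):
        yield dict(zip(pairs, values))

def eq7(op, n):  # x∘y = y∘x
    return all(op[x, y] == op[y, x] for x in range(n) for y in range(n))

def eq4(op, n):  # (x∘x)∘y = y∘x
    return all(op[op[x, x], y] == op[y, x] for x in range(n) for y in range(n))

def find_counterexample(max_size=3):
    for n in range(1, max_size + 1):
        for op in all_magmas(n):
            if eq7(op, n) and not eq4(op, n):
                return n, op
    return None

print(find_counterexample())   # e.g. XOR on {0,1}: commutative, but fails Equation4
```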

As one can see, it is already somewhat tedious to compute the Hasse diagram of just eleven equations. The project I propose is to try to expand this Hasse diagram by a couple orders of magnitude, covering a significantly larger set of equations. The set I propose is the set {{\mathcal E}} of equations that use the magma operation {\circ} at most four times, up to relabeling and the reflexive and symmetric axioms of equality; this includes the eleven equations above, but also many more. How many more? Recall that the Catalan number {C_n} is the number of ways one can form an expression out of {n} applications of a binary operation {\circ} (applied to {n+1} placeholder variables); and, given a string of {m} placeholder variables, the Bell number {B_m} is the number of ways (up to relabeling) to assign names to each of these variables, where some of the placeholders are allowed to be assigned the same name. As a consequence, ignoring symmetry, the number of equations that involve at most four operations is

\displaystyle  \sum_{n,m \geq 0: n+m \leq 4} C_n C_m B_{n+m+2} = 9341.

The number of equations in which the left-hand side and right-hand side are identical is

\displaystyle  \sum_{n=0}^2 C_n B_{n+1} = 1 * 1 + 1 * 2 + 2 * 5 = 13;

these are all equivalent to the reflexive axiom (Equation11). The remaining {9328} equations come in pairs by the symmetry of equality, so the total size of {{\mathcal E}} is

\displaystyle  1 + \frac{9328}{2} = 4665.

I have not yet generated the full list of such identities, but presumably this will be straightforward to do in a standard computer language such as Python (I have not tried this, but I imagine some back-and-forth with a modern AI would let one generate most of the required code). [UPDATE, Sep 26: Amir Livne Bar-on has kindly enumerated all the equations, of which there are actually 4694.]
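As a quick sanity check on the counting argument above (not a full enumeration of the equations), here is a short Python sketch of my own that reproduces the figures {9341}, {13}, and {4665} from the preceding paragraphs using the Catalan and Bell numbers; as the update notes, the careful enumeration gives a slightly different final count.

```python
# Reproduce the back-of-the-envelope count of equations with at most four
# occurrences of the magma operation, using Catalan and Bell numbers.
from math import comb

def catalan(n):
    return comb(2 * n, n) // (n + 1)

def bell_numbers(nmax):
    """Return [B_0, B_1, ..., B_nmax] via the Bell triangle."""
    bells = [1, 1]          # B_0, B_1
    row = [1]               # first row of the Bell triangle
    while len(bells) <= nmax:
        new_row = [row[-1]]
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
        bells.append(row[-1])
    return bells[:nmax + 1]

C = [catalan(n) for n in range(5)]
B = bell_numbers(6)

# all equations with at most four operations, ignoring the symmetry of equality
total = sum(C[n] * C[m] * B[n + m + 2]
            for n in range(5) for m in range(5) if n + m <= 4)
# equations whose two sides are literally identical (all equivalent to x = x)
trivial = sum(C[n] * B[n + 1] for n in range(3))
print(total, trivial, 1 + (total - trivial) // 2)   # expect 9341 13 4665
```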

It is not clear to me at all what the geometry of {{\mathcal E}} will look like. Will most equations be incomparable with each other? Will it stratify into layers of “strong” and “weak” axioms? Will there be a lot of equivalent axioms? It might be interesting to record now any speculations as to what the structure of this poset might be, and to compare these predictions with the outcome of the project afterwards.

A brute force computation of the poset {{\mathcal E}} would then require {4665 \times (4665-1) = 21757560} comparisons, which looks rather daunting; but of course due to the axioms of a partial order, one could presumably identify the poset by a much smaller number of comparisons. I am thinking that it should be possible to crowdsource the exploration of this poset in the form of submissions to a central repository (such as the github repository I just created) of proofs in Lean of implications or non-implications between various equations, which could be validated in Lean, and also checked against some file recording the current status (true, false, or open) of all the {21757560} comparisons, to avoid redundant effort. Most submissions could be handled automatically, with relatively little human moderation required; and the status of the poset could be updated after each such submission.

I would imagine that there is some “low-hanging fruit” that could establish a large number of implications (or anti-implications) quite easily. For instance, laws such as Equation2 or Equation3 more or less completely describe the binary operation {\circ}, and it should be quite easy to check which of the {4665} laws are implied by either of these two laws. The poset {{\mathcal E}} has a reflection symmetry associated to replacing the binary operator {\circ} by its reflection {\circ^{\mathrm{op}}: (x,y) \mapsto y \circ x}, which in principle cuts down the total work by a factor of about two. Specific examples of magmas, such as the natural numbers with the addition operation, obey some set of equations in {{\mathcal E}} but not others, and so could be used to generate a large number of anti-implications. Some existing automated proving tools for equational logic, such as Prover9 and Mace4 (for obtaining implications and anti-implications respectively), could then be used to handle most of the remaining “easy” cases (though some work may be needed to convert the outputs of such tools into Lean). The remaining “hard” cases could then be targeted by some combination of human contributors and more advanced AI tools.

Perhaps, in analogy with formalization projects, we could have a semi-formal “blueprint” evolving in parallel with the formal Lean component of the project. This way, the project could accept human-written proofs by contributors who do not necessarily have any proficiency in Lean, as well as contributions from automated tools (such as the aforementioned Prover9 and Mace4), whose output is in some format other than Lean. The task of converting these semi-formal proofs into Lean could then be done by other humans or automated tools; in particular, I imagine modern AI tools could be particularly valuable for this portion of the workflow. I am not quite sure though if existing blueprint software can scale to handle the large number of individual proofs that would be generated by this project; and as this portion would not be formally verified, a significant amount of human moderation might also be needed here, and this also might not scale properly. Perhaps the semi-formal portion of the project could instead be coordinated on a forum such as this blog, in a similar spirit to past Polymath projects.

It would be nice to be able to integrate such a project with some sort of graph visualization software that can take an incomplete determination of the poset {{\mathcal E}} as input (in which each potential comparison {E \implies E'} in {{\mathcal E}} is marked as either true, false, or open), completes the graph as much as possible using the axioms of partial order, and then presents the partially known poset in a visually appealing way. If anyone knows of such a software package, I would be happy to hear of it in the comments.
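Independently of the visualization question, the “completion” step itself is easy to prototype. Here is a rough Python sketch (my own, purely illustrative, and not an existing package) of the propagation logic: reflexivity and transitivity for the known implications, together with the observation that composing a known implication with a purported missing implication would contradict a known non-implication.

```python
# Given partial knowledge of which implications E_i => E_j are true or false,
# propagate consequences using only the axioms of a preorder.
def complete(known_true, known_false, n):
    """known_true / known_false: sets of pairs (i, j) recording that E_i => E_j is
    known to hold / fail, for equations indexed 0..n-1. Returns the enlarged sets."""
    true = set(known_true) | {(i, i) for i in range(n)}    # reflexivity
    false = set(known_false)
    changed = True
    while changed:
        changed = False
        for (i, j) in list(true):                          # transitivity
            for (k, l) in list(true):
                if j == k and (i, l) not in true:
                    true.add((i, l)); changed = True
        for (i, k) in list(false):                         # E_i =/=> E_k, and...
            for (a, b) in true:
                if a == i and (b, k) not in false:         # ...E_i => E_b forces E_b =/=> E_k
                    false.add((b, k)); changed = True
                if b == k and (i, a) not in false:         # ...E_a => E_k forces E_i =/=> E_a
                    false.add((i, a)); changed = True
    return true, false

# Toy example: with E_1 => E_2 and E_2 => E_3 known true and E_1 =/=> E_4 known false,
# the completion also records E_1 => E_3, E_2 =/=> E_4 and E_3 =/=> E_4.
print(complete({(1, 2), (2, 3)}, {(1, 4)}, 5))
```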

Anyway, I would be happy to receive any feedback on this project; in addition to the previous requests, I would be interested in any suggestions for improving the project, as well as gauging whether there is sufficient interest in participating to actually launch it. (I am imagining running it vaguely along the lines of a Polymath project, though perhaps not formally labeled as such.)

UPDATE, Sep 30 2024: The project is up and running (and highly active), with the main page being this Github repository. See also the Lean Zulip chat for some (also very active) discussion on the project.

After some discussion with the applied math research groups here at UCLA (in particular the groups led by Andrea Bertozzi and Deanna Needell), one of the members of these groups, Chris Strohmeier, has produced a proposal for a Polymath project to crowdsource in a single repository (a) a collection of public data sets relating to the COVID-19 pandemic, (b) requests for such data sets, (c) requests for data cleaning of such sets, and (d) submissions of cleaned data sets.  (The proposal can be viewed as a PDF, and is also available on Overleaf).  As mentioned in the proposal, this database would be slightly different in focus than existing data sets such as the COVID-19 data sets hosted on Kaggle, with a focus on producing high quality cleaned data sets.  (Another relevant data set that I am aware of is the SafeGraph aggregated foot traffic data, although this data set, while open, is not quite public as it requires a non-commercial agreement to execute.  Feel free to mention further relevant data sets in the comments.)

This seems like a very interesting and timely proposal to me and I would like to open it up for discussion, for instance by proposing some seed requests for data and data cleaning, and by discussing possible platforms that such a repository could be built on.  In the spirit of “building the plane while flying it”, one could begin by creating a basic github repository as a prototype and use the comments in this blog post to handle requests, and then migrate to a more high quality platform once it becomes clear what direction this project might move in.  (For instance one might eventually move beyond data cleaning to more sophisticated types of data analysis.)

UPDATE, Mar 25: a prototype page for such a clearinghouse is now up at this wiki page.

UPDATE, Mar 27: the data cleaning aspect of this project largely duplicates the existing efforts at the United against COVID-19 project, so we are redirecting requests of this type to that project (and specifically to their data discourse page).  The polymath proposal will now refocus on crowdsourcing a list of public data sets relating to the COVID-19 pandemic.

 

The Polymath15 paper “Effective approximation of heat flow evolution of the Riemann {\xi} function, and a new upper bound for the de Bruijn-Newman constant“, submitted to Research in the Mathematical Sciences, has just been uploaded to the arXiv. This paper records the mix of theoretical and computational work needed to improve the upper bound on the de Bruijn-Newman constant {\Lambda}. This constant can be defined as follows. The function

\displaystyle H_0(z) := \frac{1}{8} \xi\left(\frac{1}{2} + \frac{iz}{2}\right),

where {\xi} is the Riemann {\xi} function

\displaystyle \xi(s) := \frac{s(s-1)}{2} \pi^{-s/2} \Gamma\left(\frac{s}{2}\right) \zeta(s)

has a Fourier representation

\displaystyle H_0(z) = \int_0^\infty \Phi(u) \cos(zu)\ du

where {\Phi} is the super-exponentially decaying function

\displaystyle \Phi(u) := \sum_{n=1}^\infty (2\pi^2 n^4 e^{9u} - 3\pi n^2 e^{5u} ) \exp(-\pi n^2 e^{4u} ).
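As an aside, {H_0} is easy to evaluate numerically directly from this integral representation. The following is a minimal illustrative sketch of my own using Python’s mpmath library (not the code used in the Polymath15 project), which also sanity-checks the value against {\frac{1}{8} \xi(\frac{1}{2} + \frac{iz}{2})}; the truncation parameters are ad hoc but harmless given the super-exponential decay of {\Phi}.

```python
# Evaluate H_0(z) = \int_0^\infty Phi(u) cos(zu) du numerically and compare with
# (1/8) xi(1/2 + iz/2).  Truncations: 50 terms in the sum defining Phi, u in [0,3].
from mpmath import mp, mpf, mpc, exp, cos, pi, gamma, zeta, quad

mp.dps = 30  # working precision in decimal digits

def Phi(u, nmax=50):
    return sum((2 * pi**2 * n**4 * exp(9 * u) - 3 * pi * n**2 * exp(5 * u))
               * exp(-pi * n**2 * exp(4 * u)) for n in range(1, nmax + 1))

def H0(z):
    return quad(lambda u: Phi(u) * cos(z * u), [0, 1, 3])

def xi(s):
    return s * (s - 1) / 2 * pi**(-s / 2) * gamma(s / 2) * zeta(s)

z = mpf(2)
print(H0(z))                                  # these two values should agree
print((xi(mpc(mpf('0.5'), z / 2)) / 8).real)  # imaginary part vanishes up to rounding
```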

The Riemann hypothesis is equivalent to the claim that all the zeroes of {H_0} are real. De Bruijn introduced (in different notation) the deformations

\displaystyle H_t(z) := \int_0^\infty e^{tu^2} \Phi(u) \cos(zu)\ du

of {H_0}; one can view this as the solution to the backwards heat equation {\partial_t H_t = -\partial_{zz} H_t} starting at {H_0}. From the work of de Bruijn and of Newman, it is known that there exists a real number {\Lambda} – the de Bruijn-Newman constant – such that {H_t} has all zeroes real for {t \geq \Lambda} and has at least one non-real zero for {t < \Lambda}. In particular, the Riemann hypothesis is equivalent to the assertion {\Lambda \leq 0}. Prior to this paper, the best known bounds for this constant were

\displaystyle 0 \leq \Lambda < 1/2

with the lower bound due to Rodgers and myself, and the upper bound due to Ki, Kim, and Lee. One of the main results of the paper is to improve the upper bound to

\displaystyle \Lambda \leq 0.22. \ \ \ \ \ (1)

At a purely numerical level this gets “closer” to proving the Riemann hypothesis, but the methods of proof take as input a finite numerical verification of the Riemann hypothesis up to some given height {T} (in our paper we take {T \sim 3 \times 10^{10}}) and convert this (and some other numerical verification) to an upper bound on {\Lambda} that is of order {O(1/\log T)}. As discussed in the final section of the paper, further improvement of the numerical verification of RH would thus lead to modest improvements in the upper bound on {\Lambda}, although it does not seem likely that our methods could for instance improve the bound to below {0.1} without an infeasible amount of computation.

We now discuss the methods of proof. An existing result of de Bruijn shows that if all the zeroes of {H_{t_0}(z)} lie in the strip {\{ x+iy: |y| \leq y_0\}}, then {\Lambda \leq t_0 + \frac{1}{2} y_0^2}; we will verify this hypothesis with {t_0=y_0=0.2}, thus giving (1). Using the symmetries and the known zero-free regions, it suffices to show that

\displaystyle H_{0.2}(x+iy) \neq 0 \ \ \ \ \ (2)

whenever {x \geq 0} and {0.2 \leq y \leq 1}.

For large {x} (specifically, {x \geq 6 \times 10^{10}}), we use effective numerical approximation to {H_t(x+iy)} to establish (2), as discussed in a bit more detail below. For smaller values of {x}, the existing numerical verification of the Riemann hypothesis (we use the results of Platt) shows that

\displaystyle H_0(x+iy) \neq 0

for {0 \leq x \leq 6 \times 10^{10}} and {0.2 \leq y \leq 1}. The problem though is that this result only controls {H_t} at time {t=0} rather than the desired time {t = 0.2}. To bridge the gap we need to erect a “barrier” that, roughly speaking, verifies that

\displaystyle H_t(x+iy) \neq 0 \ \ \ \ \ (3)

for {0 \leq t \leq 0.2}, {x = 6 \times 10^{10} + O(1)}, and {0.2 \leq y \leq 1}; with a little bit of work this barrier shows that zeroes cannot sneak in from the right of the barrier to the left in order to produce counterexamples to (2) for small {x}.

To enforce this barrier, and to verify (2) for large {x}, we need to approximate {H_t(x+iy)} for positive {t}. Our starting point is the Riemann-Siegel formula, which roughly speaking is of the shape

\displaystyle H_0(x+iy) \approx B_0(x+iy) ( \sum_{n=1}^N \frac{1}{n^{\frac{1+y-ix}{2}}} + \gamma_0(x+iy) \sum_{n=1}^N \frac{n^y}{n^{\frac{1+y+ix}{2}}} )

where {N := \sqrt{x/4\pi}}, {B_0(x+iy)} is an explicit “gamma factor” that decays exponentially in {x}, and {\gamma_0(x+iy)} is a ratio of gamma functions that is roughly of size {(x/4\pi)^{-y/2}}. Deforming this by the heat flow gives rise to an approximation roughly of the form

\displaystyle H_t(x+iy) \approx B_t(x+iy) ( \sum_{n=1}^N \frac{b_n^t}{n^{s_*}} + \gamma_t(x+iy) \sum_{n=1}^N \frac{n^y}{n^{\overline{s_*}}} ) \ \ \ \ \ (4)

where {B_t(x+iy)} and {\gamma_t(x+iy)} are variants of {B_0(x+iy)} and {\gamma_0(x+iy)}, {b_n^t := \exp( \frac{t}{4} \log^2 n )}, and {s_*} is an exponent which is roughly {\frac{1+y-ix}{2} + \frac{t}{4} \log \frac{x}{4\pi}}. In particular, for positive values of {t}, {s_*} increases (logarithmically) as {x} increases, and the two sums in the Riemann-Siegel formula become increasingly convergent (even in the face of the slowly increasing coefficients {b_n^t}). For very large values of {x} (in the range {x \geq \exp(C/t)} for a large absolute constant {C}), the {n=1} terms of both sums dominate, and {H_t(x+iy)} begins to behave in a sinusoidal fashion, with the zeroes “freezing” into an approximate arithmetic progression on the real line much like the zeroes of the sine or cosine functions (we give some asymptotic theorems that formalise this “freezing” effect). This lets one verify (2) for extremely large values of {x} (e.g., {x \geq 10^{12}}). For slightly less large values of {x}, we first multiply the Riemann-Siegel formula by an “Euler product mollifier” to reduce some of the oscillation in the sum and make the series converge better; we also use a technical variant of the triangle inequality to improve the bounds slightly. These are sufficient to establish (2) for moderately large {x} (say {x \geq 6 \times 10^{10}}) with only a modest amount of computational effort (a few seconds after all the optimisations; on my own laptop with very crude code I was able to verify all the computations in a matter of minutes).

The most difficult computational task is the verification of the barrier (3), particularly when {t} is close to zero where the series in (4) converge quite slowly. We first use an Euler product heuristic approximation to {H_t(x+iy)} to decide where to place the barrier in order to make our numerical approximation to {H_t(x+iy)} as large in magnitude as possible (so that we can afford to work with a sparser set of mesh points for the numerical verification). In order to efficiently evaluate the sums in (4) for many different values of {x+iy}, we perform a Taylor expansion of the coefficients to factor the sums as combinations of other sums that do not actually depend on {x} and {y} and so can be re-used for multiple choices of {x+iy} after a one-time computation. At the scales we work in, this computation is still quite feasible (a handful of minutes after software and hardware optimisations); if one assumes larger numerical verifications of RH and lowers {t_0} and {y_0} to optimise the value of {\Lambda} accordingly, one could get down to an upper bound of {\Lambda \leq 0.1} assuming an enormous numerical verification of RH (up to height about {4 \times 10^{21}}) and a very large distributed computing project to perform the other numerical verifications.

This post can serve as the (presumably final) thread for the Polymath15 project (continuing this post), to handle any remaining discussion topics for that project.

This is the eleventh research thread of the Polymath15 project to upper bound the de Bruijn-Newman constant {\Lambda}, continuing this post. Discussion of the project of a non-research nature can continue for now in the existing proposal thread. Progress will be summarised at this Polymath wiki page.

There are currently two strands of activity.  One is writing up the paper describing the combination of theoretical and numerical results needed to obtain the new bound \Lambda \leq 0.22.  The latest version of the writeup may be found here, in this directory.  The theoretical side of things has mostly been written up; the main remaining tasks to do right now are

  1. giving a more detailed description and illustration of the two major numerical verifications, namely the barrier verification that establishes the absence of zeroes of H_t(x+iy) for 0 \leq t \leq 0.2, 0.2 \leq y \leq 1, |x - 6 \times 10^{10} - 83952| \leq 0.5, and the Dirichlet series bound that establishes the absence of zeroes for t = 0.2, 0.2 \leq y \leq 1, x \geq 6 \times 10^{10} + 83952; and
  2. giving more detail on the conditional results assuming more numerical verification of RH.

Meanwhile, several of us have been exploring the behaviour of the zeroes of H_t for negative t; this does not directly lead to any new progress on bounding \Lambda (though there is a good chance that it may simplify the proof of \Lambda \geq 0), but there have been some interesting numerical phenomena uncovered, as summarised in this set of slides.  One phenomenon is that for large negative t, many of the complex zeroes begin to organise themselves near the curves

\displaystyle y = -\frac{t}{2} \log \frac{x}{4\pi n(n+1)} - 1.

(An example of the agreement between the zeroes and these curves may be found here.)  We now have a (heuristic) theoretical explanation for this; we should have an approximation

\displaystyle H_t(x+iy) \approx B_t(x+iy) \sum_{n=1}^\infty \frac{b_n^t}{n^{s_*}}

in this region (where B_t, b_n^t, n^{s_*} are defined in equations (11), (15), (17) of the writeup); the above curves arise from (an approximation of) those locations where two adjacent terms \frac{b_n^t}{n^{s_*}}, \frac{b_{n+1}^t}{(n+1)^{s_*}} in this series have equal magnitude, with the other terms being of lower order.

However, we only have a partial explanation at present of the interesting behaviour of the real zeroes at negative t; for instance, the surviving zeroes at extremely negative values of t appear to lie near the curves where the quantity N is close to a half-integer, where

\displaystyle \tilde x := x + \frac{\pi t}{4}, \qquad N := \sqrt{\frac{\tilde x}{4\pi}}.

The remaining zeroes exhibit a pattern in (N,u) coordinates that is approximately 1-periodic in N, where

\displaystyle u := \frac{4\pi |t|}{\tilde x}.

A plot of the zeroes in these coordinates (somewhat truncated due to the numerical range) may be found here.

We do not yet have a total explanation of the phenomena seen in this picture.  It appears that we have an approximation

\displaystyle H_t(x) \approx A_t(x) \sum_{n=1}^\infty \exp( -\frac{|t| \log^2(n/N)}{4(1-\frac{iu}{8\pi})} - \frac{1+i\tilde x}{2} \log(n/N) )

where A_t(x) is the non-zero multiplier

\displaystyle A_t(x) := e^{\pi^2 t/64} M_0(\frac{1+i\tilde x}{2}) N^{-\frac{1+i\tilde x}{2}} \sqrt{\frac{\pi}{1-\frac{iu}{8\pi}}}

and

\displaystyle M_0(s) := \frac{1}{8}\frac{s(s-1)}{2}\pi^{-s/2} \sqrt{2\pi} \exp( (\frac{s}{2}-\frac{1}{2}) \log \frac{s}{2} - \frac{s}{2} )

The derivation of this formula may be found in this wiki page.  However our initial attempts to simplify the above approximation further have proven to be somewhat inaccurate numerically (in particular giving an incorrect prediction for the location of zeroes, as seen in this picture).  We are in the process of using numerics to try to resolve the discrepancies (see this page for some code and discussion).

 

This is the tenth “research” thread of the Polymath15 project to upper bound the de Bruijn-Newman constant {\Lambda}, continuing this post. Discussion of the project of a non-research nature can continue for now in the existing proposal thread. Progress will be summarised at this Polymath wiki page.

Most of the progress since the last thread has been on the numerical side, in which the various techniques to numerically establish zero-free regions to the equation H_t(x+iy)=0 have been streamlined, made faster, and extended to larger heights than were previously possible.  The best bound for \Lambda now depends on the height to which one is willing to assume the Riemann hypothesis.  Using the conservative verification up to height (slightly larger than) 3 \times 10^{10}, which has been confirmed by independent work of Platt et al. and Gourdon-Demichel, the best bound remains at \Lambda \leq 0.22.  Using the verification up to height 2.5 \times 10^{12} claimed by Gourdon-Demichel, this improves slightly to \Lambda \leq 0.19, and if one assumes the Riemann hypothesis up to height 5 \times 10^{19} the bound improves to \Lambda \leq 0.11, contingent on a numerical computation that is still underway.   (See the table below the fold for more data of this form.)  This is broadly consistent with the expectation that the bound on \Lambda should be inversely proportional to the logarithm of the height at which the Riemann hypothesis is verified.

As progress seems to have stabilised, it may be time to transition to the writing phase of the Polymath15 project.  (There are still some interesting research questions to pursue, such as numerically investigating the zeroes of H_t for negative values of t, but the writeup does not necessarily have to contain every single direction pursued in the project; if enough additional interesting findings are unearthed, one could always consider writing a second paper, for instance.)

Below the fold is the detailed progress report on the numerics by Rudolph Dwars and Kalpesh Muchhal.


This is the ninth “research” thread of the Polymath15 project to upper bound the de Bruijn-Newman constant {\Lambda}, continuing this post. Discussion of the project of a non-research nature can continue for now in the existing proposal thread. Progress will be summarised at this Polymath wiki page.

We have now tentatively improved the upper bound of the de Bruijn-Newman constant to {\Lambda \leq 0.22}. Among the technical improvements in our approach, we now are able to use Taylor expansions to efficiently compute the approximation {A+B} to {H_t(x+iy)} for many values of {x,y} in a given region, thus speeding up the computations in the barrier considerably. Also, by using the heuristic that {H_t(x+iy)} behaves somewhat like the partial Euler product {\prod_p (1 - \frac{1}{p^{\frac{1+y-ix}{2}}})^{-1}}, we were able to find a good location to place the barrier in which {H_t(x+iy)} is larger than average, hence easier to keep away from zero.

The main remaining bottleneck is that of computing the Euler mollifier bounds that keep {A+B} bounded away from zero for larger values of {x} beyond the barrier. In going below {0.22} we are beginning to need quite complicated mollifiers with somewhat poor tail behavior; we may be reaching the point where none of our bounds will succeed in keeping {A+B} bounded away from zero, so we may be close to the natural limits of our methods.

Participants are also welcome to add any further summaries of the situation in the comments below.

Just a quick announcement that Dustin Mixon and Aubrey de Grey have just launched the Polymath16 project over at Dustin’s blog.  The main goal of this project is to simplify the recent proof by Aubrey de Grey that the chromatic number of the unit distance graph of the plane is at least 5, thus making progress on the Hadwiger-Nelson problem.  The current proof is computer assisted (in particular it requires one to control the possible 4-colorings of a certain graph with over a thousand vertices), but one of the aims of the project is to reduce the amount of computer assistance needed to verify the proof; already a number of such reductions have been found.  See also this blog post where the polymath project was proposed, as well as the wiki page for the project.  Non-technical discussion of the project will continue at the proposal blog post.

This is the seventh “research” thread of the Polymath15 project to upper bound the de Bruijn-Newman constant {\Lambda}, continuing this post. Discussion of the project of a non-research nature can continue for now in the existing proposal thread. Progress will be summarised at this Polymath wiki page.

The most recent news is that we appear to have completed the verification that {H_t(x+iy)} is free of zeroes when {t=0.4} and {y \geq 0.4}, which implies that {\Lambda \leq 0.48}. For very large {x} (for instance when the quantity {N := \lfloor \sqrt{\frac{x}{4\pi} + \frac{t}{16}} \rfloor} is at least {300}) this can be done analytically; for medium values of {x} (say when {N} is between {11} and {300}) this can be done by numerically evaluating a fast approximation {A^{eff} + B^{eff}} to {H_t} and using the argument principle in a rectangle; and most recently it appears that we can also handle small values of {x}, in part due to some new, and significantly faster, numerical ways to evaluate {H_t} in this range.

One obvious thing to do now is to experiment with lowering the parameters {t} and {y} and see what happens. However there are two other potential ways to bound {\Lambda} which may also be numerically feasible. One approach is based on trying to exclude zeroes of {H_t(x+iy)} in a region of the form {0 \leq t \leq t_0}, {X \leq x \leq X+1} and {y \geq y_0} for some moderately large {X} (this acts as a “barrier” to prevent zeroes from flowing into the region {\{ 0 \leq x \leq X, y \geq y_0 \}} at time {t_0}, assuming that they were not already there at time {0}). This requires significantly less numerical verification in the {x} aspect, but more numerical verification in the {t} aspect, so it is not yet clear whether this is a net win.

Another, rather different approach, is to study the evolution of statistics such as {S(t) = \sum_{H_t(x+iy)=0: x,y>0} y e^{-x/X}} over time. One has fairly good control on such quantities at time zero, and their time derivative looks somewhat manageable, so one may be able to still have good control on this quantity at later times {t_0>0}. However for this approach to work, one needs an effective version of the Riemann-von Mangoldt formula for {H_t}, which at present is only available asymptotically (or at time {t=0}). This approach may be able to avoid almost all numerical computation, except for numerical verification of the Riemann hypothesis, for which we can appeal to existing literature.

Participants are also welcome to add any further summaries of the situation in the comments below.

This is the sixth “research” thread of the Polymath15 project to upper bound the de Bruijn-Newman constant {\Lambda}, continuing this post. Discussion of the project of a non-research nature can continue for now in the existing proposal thread. Progress will be summarised at this Polymath wiki page.

The last two threads have been focused primarily on the test problem of showing that {H_t(x+iy) \neq 0} whenever {t = y = 0.4}. We have been able to prove this for most regimes of {x}, or equivalently for most regimes of the natural number parameter {N := \lfloor \sqrt{\frac{x}{4\pi} + \frac{t}{16}} \rfloor}. In many of these regimes, a certain explicit approximation {A^{eff}+B^{eff}} to {H_t} was used, together with a non-zero normalising factor {B^{eff}_0}; see the wiki for definitions. The explicit upper bound

\displaystyle  |H_t - A^{eff} - B^{eff}| \leq E_1 + E_2 + E_3

has been proven for certain explicit expressions {E_1, E_2, E_3} (see here) depending on {x}. In particular, if {x} satisfies the inequality

\displaystyle  |\frac{A^{eff}+B^{eff}}{B^{eff}_0}| > \frac{E_1}{|B^{eff}_0|} + \frac{E_2}{|B^{eff}_0|} + \frac{E_3}{|B^{eff}_0|}

then {H_t(x+iy)} is non-vanishing thanks to the triangle inequality. (In principle we have an even more accurate approximation {A^{eff}+B^{eff}-C^{eff}} available, but it is looking like we will not need it for this test problem at least.)

We have explicit upper bounds on {\frac{E_1}{|B^{eff}_0|}}, {\frac{E_2}{|B^{eff}_0|}}, {\frac{E_3}{|B^{eff}_0|}}; see this wiki page for details. They are tabulated in the range {3 \leq N \leq 2000} here. For {N \geq 2000}, the upper bound {\frac{E_3^*}{|B^{eff}_0|}} for {\frac{E_3}{|B^{eff}_0|}} is monotone decreasing, and is in particular bounded by {1.53 \times 10^{-5}}, while {\frac{E_2}{|B^{eff}_0|}} and {\frac{E_1}{|B^{eff}_0|}} are known to be bounded by {2.9 \times 10^{-7}} and {2.8 \times 10^{-8}} respectively (see here).

Meanwhile, the quantity {|\frac{A^{eff}+B^{eff}}{B^{eff}_0}|} can be lower bounded by

\displaystyle  |\sum_{n=1}^N \frac{b_n}{n^s}| - |\sum_{n=1}^N \frac{a_n}{n^s}|

for certain explicit coefficients {a_n,b_n} and an explicit complex number {s = \sigma + i\tau}. Using the triangle inequality to lower bound this by

\displaystyle  |b_1| - \sum_{n=2}^N \frac{|b_n|}{n^\sigma} - \sum_{n=1}^N \frac{|a_n|}{n^\sigma}

we can obtain a lower bound of {0.18} for {N \geq 2000}, which settles the test problem in this regime. One can get more efficient lower bounds by multiplying both Dirichlet series by a suitable Euler product mollifier; we have found {\prod_{p \leq P} (1 - \frac{b_p}{p^s})} for {P=2,3,5,7} to be good choices to get a variety of further lower bounds depending only on {N}, see this table and this wiki page. Comparing this against our tabulated upper bounds for the error terms we can handle the range {300 \leq N \leq 2000}.

In the range {11 \leq N \leq 300}, we have been able to obtain a suitable lower bound {|\frac{A^{eff}+B^{eff}}{B^{eff}_0}| \geq c} (where {c} exceeds the upper bound for {\frac{E_1}{|B^{eff}_0|} + \frac{E_2}{|B^{eff}_0|} + \frac{E_3}{|B^{eff}_0|}}) by numerically evaluating {|\frac{A^{eff}+B^{eff}}{B^{eff}_0}|} at a mesh of points for each choice of {N}, with the mesh spacing being adaptive and determined by {c} and an upper bound for the derivative of {|\frac{A^{eff}+B^{eff}}{B^{eff}_0}|}; the data is available here.

This leaves the final range {N \leq 10} (roughly corresponding to {x \leq 1600}). Here we can numerically evaluate {H_t(x+iy)} to high accuracy at a fine mesh (see the data here), but to fill in the mesh we need good upper bounds on {H'_t(x+iy)}. It seems that we can get reasonable estimates using some contour shifting from the original definition of {H_t} (see here). We are close to finishing off this remaining region and thus solving the toy problem.

Beyond this, we need to figure out how to show that {H_t(x+iy) \neq 0} for {y > 0.4} as well. General theory lets one do this for {y \geq \sqrt{1-2t} = 0.447\dots}, leaving the region {0.4 < y < 0.448}. The analytic theory that handles {N \geq 2000} and {300 \leq N \leq 2000} should also handle this region; for {N \leq 300} presumably the argument principle will become relevant.

The full argument also needs to be streamlined and organised; right now it sprawls over many wiki pages and github code files. (A very preliminary writeup attempt has begun here). We should also see if there is much hope of extending the methods to push much beyond the bound of {\Lambda \leq 0.48} that we would get from the above calculations. This would also be a good time to start discussing whether to move to the writing phase of the project, or whether there are still fruitful research directions for the project to explore.

Participants are also welcome to add any further summaries of the situation in the comments below.
