
tao

@tao@mathstodon.xyz

Professor of #Mathematics at the University of California, Los Angeles #UCLA (he/him)


tao, to random

There has been a remarkable breakthrough towards the Riemann hypothesis (though still very far from fully resolving this conjecture) by Guth and Maynard making the first substantial improvement to a classical 1940 bound of Ingham regarding the zeroes of the Riemann zeta function (and more generally, controlling the large values of various Dirichlet series): https://arxiv.org/abs/2405.20552

Let 𝑁(σ,𝑇) denote the number of zeroes of the Riemann zeta function with real part at least σ and imaginary part at most 𝑇 in magnitude. The Riemann hypothesis tells us that 𝑁(σ,𝑇) vanishes for any σ>1/2. We of course can't prove this unconditionally. But as the next best thing, we can prove zero density estimates, which are non-trivial upper bounds on 𝑁(σ,𝑇). It turns out that σ=3/4 is a key value. In 1940, Ingham obtained the bound \(N(3/4,T) \ll T^{3/5+o(1)}\). Over the next eighty years, the only improvement to this bound has been small refinements to the 𝑜(1) error. This has held back many applications in analytic number theory: for instance, to get a good prime number theorem in almost all short intervals of the form \((x,x+x^\theta)\), we have long been limited to the range \(\theta>1/6\), with the main obstacle being the lack of improvement to the Ingham bound. (1/3)

tao,

Guth and Maynard have managed to finally improve the Ingham bound, from 3/5=0.6 to 13/25=0.52. This propagates to many corresponding improvements in analytic number theory; for instance, the range for which we can prove a prime number theorem in almost all short intervals now improves from \(\theta>1/6=0.166\dots\) to \(\theta>2/15=0.133\dots\). (The Riemann hypothesis would imply that we can cover the full range \(\theta>0\).)
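For reference, here are the two sets of bounds quoted in this thread side by side (the short-interval ranges follow from the respective zero density estimates via standard, if nontrivial, arguments):

\[ \text{Ingham (1940):} \quad N(3/4,T) \ll T^{3/5+o(1)} \quad\rightsquigarrow\quad \theta > 1/6 = 0.166\dots \]
\[ \text{Guth-Maynard (2024):} \quad N(3/4,T) \ll T^{13/25+o(1)} \quad\rightsquigarrow\quad \theta > 2/15 = 0.133\dots \]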

The arguments are largely Fourier analytic in nature. The first few steps are standard, and many analytic number theorists, including myself, who have attempted to break the Ingham bound, will recognize them; but they do a number of clever and unexpected maneuvers, including controlling a key matrix of phases \(n^{it} = e^{it\log n}\) by raising it to the sixth (!) power (which on the surface makes it significantly more complicated and intractable); refusing to simplify a certain complicated Fourier integral using stationary phase, and thus conceding a significant amount in the exponents, in order to retain a certain factorized form that ultimately turns out to be more useful than the stationary phase approximation; and dividing into cases depending on whether the locations where the large values of a Dirichlet series occur have small, medium, or large additive energy, and treating each case by a somewhat different argument. Here, the precise form of the phase function \(t \log n\) that is implicit in a Dirichlet series becomes incredibly important; this is an unexpected way to exploit the special features of the exponential sums arising from analytic number theory, as opposed to the more general exponential sums that one may encounter in harmonic analysis. (2/3)

tao, to random

In math research papers (particularly the "good" ones) one often observes a negative correlation between the conceptual difficulty of a component of an argument, and its technical difficulty: the parts that are conceptually routine or straightforward may take many pages of technical computation, whereas the parts that are conceptually interesting (and novel) are actually relatively brief, once all the more routine auxiliary steps (e.g., treatment of various lower order error terms) are stripped away.

I theorize that this is an instance of Berkson's paradox. I found the graphic at https://brilliant.org/wiki/berksons-paradox to be a good illustration of this paradox. In this (oversimplified) example, a negative correlation is seen between SAT scores and GPA in students admitted to a typical university, even though a positive correlation exists in the broader population, because students with too low a combined SAT score and GPA will get rejected from the university, whilst students with too high a score would typically go to a more prestigious school.
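One can see the selection effect in a quick simulation (a toy sketch with invented numbers, not tied to any real admissions data):

  import numpy as np

  rng = np.random.default_rng(0)
  n = 100_000
  # a latent "ability" drives both scores, so they are positively correlated overall
  ability = rng.normal(size=n)
  sat = ability + rng.normal(size=n)
  gpa = ability + rng.normal(size=n)

  # hypothetical admission band: combined score high enough to be admitted,
  # but not so high that the student goes to a more prestigious school instead
  combined = sat + gpa
  admitted = (combined > 1.0) & (combined < 3.0)

  print(np.corrcoef(sat, gpa)[0, 1])                      # full population: positive
  print(np.corrcoef(sat[admitted], gpa[admitted])[0, 1])  # admitted subpopulation: negative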

Similarly, mathematicians tend to write their best papers where the combined conceptual and technical difficulty of the steps of the argument is close to the upper bound of what they can handle. So steps that are conceptually and technically easy don't occupy much space in the paper, whereas steps that are both conceptually and technically hard would not have been discovered by the mathematician in the first place. This creates the aforementioned negative correlation.

Often the key to reading a lengthy paper is to first filter out all the technically complicated steps and identify the (often much shorter) conceptual core.

tao,

An addendum: good mathematical writers will try to compensate for this negative correlation by trying to highlight the conceptual core of the argument as much as possible, for instance by providing heuristic discussion in the introduction, moving technical computations to appendices (or at least containing them within specific lemmas), or introducing good notation designed to accentuate the conceptual ideas of the paper, and conceal the less interesting technical details as much as possible.

tao, to random

Not sure how to feel about elementary arithmetic being classified as journalism: https://www.cnbc.com/2024/05/25/gamestop-surges-after-fetching-933-million-from-stock-sale.html

From the article:

"GameStop made nearly $933.4 million by selling 45 million shares, the struggling videogame retailer said on Friday ... GameStop did not disclose the price at which it sold the shares, but based on Reuters calculations, they were sold at an average price of $20.74 each."

tao, to random

Jim Simons, who was a noted differential geometer (for instance being one of the discoverers of Chern-Simons theory), then a successful hedge fund manager, and finally a major philanthropist to mathematics and the sciences, died today, aged 86: https://www.nytimes.com/2024/05/10/business/dealbook/jim-simons-dead.html .

One can debate whether the economic model of first encouraging the concentration of wealth by billionaires, and then relying on such billionaires for philanthropy, is the most effective or just mechanism for creating public goods, but certainly Jim made some very important investments in the modern infrastructure of mathematics and science, with his foundation being a significant funder of the arXiv for instance, and of major institutes such as the SLMath institute (formerly MSRI). I was also fortunate to be supported by a Simons Investigator Award for over a decade, which in turn supported a large number of research activities of my group here at UCLA.

I only interacted with Simons a few times (we were both on the SLMath Board of Trustees, but our interactions were almost entirely via Zoom), but he came across as sincere in his support of the sciences, and I hope the Simons Foundation will continue that support in the future.

tao, to random

Kevin Buzzard (@xenaproject) has just launched his five-year project to formalize the proof of Fermat's Last Theorem #FLT in #Lean4: see his blog post at https://leanprover-community.github.io/blog/posts/FLT-announcement/ and the blueprint at https://imperialcollegelondon.github.io/FLT/blueprint/index.html . As discussed in the blog post, the target is to be able to reduce the proof of FLT to "claims which were known to mathematicians by the end of the 1980s". Hopefully this project will develop many of the foundational theories of modern number theory, and also provide real-world lessons about how to organize a genuinely large-scale formalization project, in particular whether the project is sufficiently modular that many people can make meaningful contributions to the project without having to master all the mathematical prerequisites needed to understand the proof of FLT.

tao, to random

A couple months ago, another mathematician contacted me and two of my co-authors (Green and Manners) regarding a minor mathematical misprint in one of our papers. Normally this is quite a routine occurrence, but it caused a surprising amount of existential panic on my part because I thought it involved the #PFR paper that I had run a #Lean formalization project in. As it turned out, though, the misprint involved a previous paper, in a portion that was not formalized in Lean. So all was well; we thanked the mathematician and updated the paper accordingly.

But yesterday, we received referee reports for the PFR paper that was formalized in Lean, and one of the referees did actually spot a genuine mathematical typo (specifically, the expression H[A]-H[B] appearing in (A.22) of https://arxiv.org/abs/2311.05762 should be H[A]+H[B]). So this again created a moment of panic - how could Lean have missed this?

After reviewing the Github history for the blueprint, I found that when I transcribed the proof from the paper to blueprint form, I had unwittingly corrected this typo (see Equation (9) of https://teorth.github.io/pfr/blueprint/sect0003.html in the proof of Lemma 3.23) without noticing that the typo was present in the original source paper. This lemma was then formalized by other contributors without difficulty. I don't remember my actual thought process during the transcription, but I imagine it is similar to how when reading in one's native language, one can often autocorrect spelling and grammar errors in the text without even noticing that one is doing so. Still, the experience gives me just a little pause regarding how confident one can be in the 100% correctness of a paper that was formalized...

tao, to ai

@TaliaRinger has helped put together a useful list of resources for #AI in #Mathematics, that was initiated during the National Academies workshop on "AI in mathematical reasoning" last year. This list is now publicly available at https://docs.google.com/document/d/1kD7H4E28656ua8jOGZ934nbH2HcBLyxcRgFDduH5iQ0/edit

tao, to random

A friend of mine was lucky enough to take this remarkable photo of the recent solar eclipse, in which a plane had left a contrail below the eclipse. She was wondering though why the portion of the contrail that was in the path of the eclipse appeared curved. My initial guess was that it was some sort of refractive effect, but why would the refractive index of the air be so much different inside the shadow of the eclipse than outside of it? With only a few minutes of eclipse I doubt that there would be a significant temperature change. Anyway, any theories would be welcome!

tao, to random

Congratulations to Avi Wigderson for receiving the 2023 #TuringAward! https://www.ias.edu/news/avi-wigderson-2023-acm-am-turing-award Avi has made many deep contributions to complexity theory, in no small part by linking it to a staggeringly large number of fields of mathematics (probability, combinatorics, algebraic geometry, linear algebra, operator algebras, etc. ...), and also has a knack of giving extremely accessible lectures (with the slides in his trademark Comic Sans font, which has become almost synonymous with complexity theory presentations in some circles).

As it turns out, Avi gave a talk here in Cambridge, UK (at a conference honoring Timothy Gowers) shortly after the news broke, which gave the chair (Ben Green) a nice opportunity to announce the prize. I believe the video from this talk (which was also excellent, by the way) will be made available shortly at https://www.newton.ac.uk/event/ooew04/ .

tao, to random

The most recent issue of the Bulletin of the American Mathematical Society, available at https://www.ams.org/journals/bull/2024-61-02/ , is largely devoted to the topic "will machines change mathematics?", with a wide variety of viewpoints from mathematicians (I understand that the following issue will also contain contributions from other disciplines that can inform the sociology of mathematics). I found the articles to be an interesting read (though it does make my own forthcoming Notices article at https://terrytao.files.wordpress.com/2024/03/machine-assisted-proof-notices.pdf somewhat redundant, as much of what I write is covered by the union of these articles). It is also notable that many of these articles were submitted nearly a year ago and are thus somewhat overtaken by events - this space is developing quickly!

tao, to mathematics

A new (diamond open access) journal devoted to #FormalMathematics has just launched: "Annals of Formalized Mathematics", https://afm.episciences.org/ . (I am not directly involved with the journal, though I am on the #mathematics "epi-committee" of the broader #episciences platform, https://www.episciences.org/ ). There has traditionally not been a natural forum for publishing research-level work on formalizing mathematics, and hopefully this journal will be successful in providing one.

tao, to random

Congratulations to Michel #Talagrand for receiving the 2024 #AbelPrize "for his groundbreaking contributions to probability theory and functional analysis, with outstanding applications in mathematical physics and statistics": https://abelprize.no/article/2024/michel-talagrand-awarded-2024-abel-prize

Michel is perhaps less well known outside of probability than he ought to be. I consider myself a user of probability rather than an expert in the subject, but I have always been impressed by the powerful, deep, general, and non-obvious probabilistic tools that he has developed, particularly his concentration inequality https://en.wikipedia.org/wiki/Talagrand%27s_concentration_inequality (which provides concentration of measure estimates in very general settings, without explicitly requiring otherwise standard assumptions such as Gaussian distribution, martingale structure, or Lipschitz dependence), or his majorizing measures theorem https://projecteuclid.org/journals/annals-of-probability/volume-24/issue-3/Majorizing-measures-the-generic-chaining/10.1214/aop/1065725175.full , which gives a remarkably precise (but highly unintuitive) answer to the question of what the expected size of the supremum of a Gaussian process is, in terms of the geometry of that process.
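(For the curious: one common formulation of the concentration inequality is the convex distance inequality. If \(A\) is a measurable subset of a product probability space \(\Omega = \prod_{i=1}^n \Omega_i\), then for every \(t > 0\)
\[ \mathbf{P}(A)\,\mathbf{P}\big( \{ x : d_T(x,A) \geq t \} \big) \leq e^{-t^2/4}, \]
where \(d_T(x,A) := \sup_{\alpha \geq 0,\ \|\alpha\|_2 \leq 1} \inf_{y \in A} \sum_{i : x_i \neq y_i} \alpha_i\) is Talagrand's convex distance from \(x\) to \(A\); see the Wikipedia article linked above for further discussion.)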

tao, to random

I've written in the past (see https://terrytao.wordpress.com/advice-on-writing-papers/implicit-notational-conventions/ ) about the useful implicit notational conventions in mathematics, for instance how in analysis the symbol ε is by convention understood to represent a small positive number, which can greatly assist in comprehending analysis arguments that involve manipulating such small constants (and conversely how an argument that uses ε to represent a negative or very large number will cause a lot of unwanted confusion).

In fact, even plain old numerals have a lot of implicit conventions attached to them. Suppose for instance I use the numeral 100 in an estimate, e.g., \(f(n) \ll n^{100}\). Just by virtue of 100 being a medium-sized round number, I am signaling that the exponent here is some generic constant whose exact value is not worth optimizing for the current argument, or even giving an explicit symbol name such as 𝐶₀; the numeral 100 is intended as a placeholder, and related numbers that show up shortly afterwards, such as 101, 200, or 50, will most likely be coming from the choice of the original exponent in the obvious manner. If in contrast I used a non-round number, e.g., \(f(n) \ll n^{76}\), the connotations are quite different: readers may assume that the specific value of the exponent is important for the application, and that some attempt has been made at optimizing its value.

It took me a while to realize that this convention is not universally understood, particularly outside the West; I've had students from non-western countries inquire as to where constants such as 100 came from in an argument I wrote, as they could not see how it was related to previous calculations!

tao,

@johncarlosbaez I know of one example: Maynard's result that there are infinitely many primes with one missing digit in their decimal expansion. https://arxiv.org/abs/1604.01041

The result works in base 10 and higher. It is open in bases 3 to 9 (and in base 2 it is trivially false if 1 is excluded, and the infamous Mersenne prime conjecture if 0 is excluded).
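(To spell out the base 2 case: a prime missing the digit 1 in binary is impossible, since such a number could only be 0, while a prime missing the digit 0 must consist entirely of 1s in binary, i.e., be of the form \(\underbrace{11\cdots1}_{n\ \text{ones}} = 2^n - 1\), which is precisely a Mersenne prime.)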

tao, to random

#PCAST has released its report on Strategy for Cyber-Physical Resilience, which focuses on moving beyond traditional cybersecurity (preventing cyber attacks from getting through) to also address cyber-physical resilience (limiting the impact on a physical system, e.g., a power grid, if a cyber attack does get through). For instance, we recommend that critical infrastructure operators set minimum delivery objectives for the services they can still provide even after being attacked, and that a national critical infrastructure observatory be created to proactively map out high-impact infrastructure vulnerabilities.

Press release: https://www.whitehouse.gov/pcast/briefing-room/2024/02/27/pcast-releases-report-on-strategy-for-cyber-physical-resilience/

Report: https://www.whitehouse.gov/wp-content/uploads/2024/02/PCAST_Cyber-Physical-Resilience-Report_Feb2024.pdf

tao, to random

Suppose an agent (such as an individual, an organization, or an AI) needs to choose between two options A and B. One can try to influence this choice by offering incentives ("carrots") or disincentives ("sticks") to nudge that agent towards one's preferred choice. For instance, if one prefers that the agent choose A over B, one can offer praise if A is chosen, and criticism if B is chosen. Assuming the agent does not possess contrarian motivations and acts rationally, the probability of the agent selecting your preferred outcome is then monotone in the incentives in the natural fashion: the more one praises the selection of A, or the more one condemns the selection of B, the more likely the agent would be to select A over B.

However, this monotonicity can break down if there are three or more options: trying to influence an agent to select a desirable (to you) option A by criticizing option B may end up causing the agent to select an even less desirable option C instead. For instance, suppose one wants to improve safety conditions for workers in some industry. One natural approach is to criticise any company in the industry that makes the news due to a workplace accident. This is disincentivizing option B ("Allow workplace accidents to make the news") in the hope of encouraging option A ("Implement policies to make the workplace safer"). However, if this criticism is driven solely by media coverage, it can perversely incentivize companies to pursue option C ("Conceal workplace accidents from making the news"), which many would consider an even worse outcome than option B. (1/3)
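To make this failure mode concrete with a toy numerical sketch (the payoff numbers below are purely invented for illustration): suppose the agent picks whichever option has the highest total payoff, namely its own intrinsic payoff plus any external praise or criticism.

  # invented intrinsic payoffs: concealment is cheaper for the company than real safety work
  intrinsic = {"A: improve safety": 1.0,
               "B: let accidents make the news": 3.0,
               "C: conceal accidents": 2.0}

  def choice(external):
      # a rational, non-contrarian agent maximizes intrinsic payoff plus external incentive
      return max(intrinsic, key=lambda k: intrinsic[k] + external.get(k, 0.0))

  print(choice({}))                                        # no incentives: picks B
  print(choice({"B: let accidents make the news": -2.5}))  # criticize B only: picks C, not A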

tao,

In many cases, what one wants to maximize is not the amount of criticism that one applies to an undesirable choice B (or the amount of praise one applies to a desirable choice A), but rather the gradient of praise or criticism as a function of the desirability of the choice. For instance, rather than maximizing the magnitude of criticism directed at B, one wants to maximize the increase in criticism that is applied whenever the agent switches to a more undesirable choice B-, as well as the amount by which criticism is reduced when the agent switches to a more desirable choice B+, where one takes into account all possible alternate choices B-, B+, etc., not just the ideal choice A that one ultimately wishes the agent to move towards. Thus, for instance, a company with a history of concealing all of its workplace accidents, which is considering a policy change to be more transparent about disclosing these accidents as a first step towards resolving them, should actually be praised to some extent for taking this first step despite it causing more media coverage of its accidents; of course, the praise here should not be greater than the praise for actually reducing the accident rate, which in turn should not be greater than the praise for eliminating accidents altogether. Furthermore, one has to ensure that the direction of the gradient does not point in a direction that is orthogonal, or even opposite, to the desired goal (e.g., pointing in the direction of preventing media coverage rather than improving workplace safety). (2/3)
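Continuing the toy sketch from the previous post (again with invented numbers): if the external incentive is instead made to decrease as the option becomes less desirable (A best, then B, then C), so that its gradient points towards the actual goal rather than towards avoiding media coverage, the same rational agent lands on A.

  # criticism now grows with how undesirable the option is, not with how much coverage it attracts
  graded = {"A: improve safety": 0.0,
            "B: let accidents make the news": -2.5,
            "C: conceal accidents": -5.0}

  print(choice(graded))  # picks A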

tao,

In short, arguments of the form "B is undesirable; therefore we should increase punishment for B" and "A is desirable; therefore we should increase rewards for A", while simple, intuitive, and emotionally appealing, in fact rely on the logical fallacy that the effect of an incentive is monotone in its magnitude (as opposed to the magnitude and direction of the gradient of the incentive). I wonder if there is an existing name for this fallacy: the law of unintended consequences (or the concept of perverse incentives, or Goodhart's law) is certainly related to it (in that it explains why it is a fallacy), but it would be good to have a standard way of referring to the fallacy itself.

In any case, I think political discourse contains too much discussion of magnitudes of incentives, and nowhere near enough discussion of gradients of incentives. Which raises the meta-question: what incentive structures could one offer to change this situation? (3/3)

tao,

@ftranschel Yes, the incentive gradient is not the only important factor here; the geometry of the global incentive structure is really the thing one wants to discuss and shape. But shifting the discourse to gradients is at least a step in the right direction compared to the status quo of focusing on magnitudes (and, in the spirit of this discussion, it is incremental improvements in the right direction that should be relatively rewarded, even if they still fall short of the perfect outcome).

From this viewpoint, capitalist market economies can be seen as systems that are solely devoted to optimising incentives through gradient descent, with price discovery mechanisms basically being a means to compute the gradient. They still contain flaws: they can get stuck at local minima, and there is a perverse incentive to sacrifice non-monetary goods (such as ethical conduct or reputation) in favor of monetized goods. While this is still preferable to a purely emotionally driven incentive system, ultimately one wants to transition to a system where capitalist economics plays some role (e.g., to compute gradients via market mechanisms), but policy is also guided by more global, long-term considerations.

tao, to random

Another wonderfully entertaining lecture from my former classmate from graduate school, Tadashi Tokieda, this time on how one can use the laws of physics to derive many well known results in pure mathematics, such as the Pythagorean theorem, the infinitude of primes, and many other examples. https://www.youtube.com/watch?v=vNzmj6ryulI

tao, to random

A new #Lean formalization project led by Alex Kontorovich and myself has just been announced to formalize the proof of the prime number theorem, as well as much of the attendant supporting machinery in complex analysis and analytic number theory, with the plan to then go onward and establish further results such as the Chebotarev density theorem. The repository for the project (including the blueprint) is at https://github.com/AlexKontorovich/PrimeNumberTheoremAnd , and discussion will take place at this Zulip stream: https://leanprover.zulipchat.com/#narrow/stream/423402-PrimeNumberTheorem.2B

tao, to random

One can tie a digital identity to a pseudonym rather than to one's real-life identity, and use it either temporarily or on a longer-term basis, though in the latter case there will indeed be a risk that this identity can be connected back to you personally. One can of course refuse to use any of these technologies and only create anonymous, unauthenticated content, but it will be increasingly hard to distinguish such content from AI-generated spam in the future.

tao, to ai

The ability of #AI tools to readily generate highly convincing "#deepfake" text, audio, images, and (soon) video is, arguably, one of the greatest near-term concerns about this emerging technology. Fundamental to any proposal to address this issue is the ability to accurately distinguish "deepfake" content from "genuine" content. Broadly speaking, there are two sides to this ability:

  • Reducing false positives. That is, reducing the number of times someone mistakes a deepfake for the genuine article. Technologies to do so include watermarking of human and AI content, and digital forensics.

  • Reducing false negatives. That is, reducing the number of times one mistakes genuinely authentic content for a deepfake. There are cryptographic protocols to help achieve this, such as digital signatures and other provenance authentication technology.

Much of the current debate about deepfakes has focused on the first aim (reducing false positives), where the technology is quite weak (AI, by design, is very good at training itself to pass any given metric of inauthenticity, as per Goodhart's law); also, measures to address the first aim often come at the expense of the second. However, the second aim is at least as important, and arguably much more technically and socially feasible, with the adoption of cryptographically secure provenance standards. One such promising standard is the C2PA standard https://c2pa.org/ that is already adopted by several major media and technology companies (though, crucially, social media companies will also need to buy into such a standard and implement it by default to users for it to be truly effective).
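To make the provenance idea a little more concrete, here is a minimal sketch of the underlying sign/verify step, using the Ed25519 primitives from the Python cryptography package as one possible choice of signature scheme (the actual C2PA specification wraps this in signed manifests and certificate chains, which are not modeled here):

  from cryptography.exceptions import InvalidSignature
  from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

  # the publisher signs the content bytes with a private key that only they hold
  private_key = Ed25519PrivateKey.generate()
  content = b"original photo bytes"
  signature = private_key.sign(content)

  # anyone with the corresponding public key can check that what they received
  # is exactly what the publisher signed; any alteration breaks the signature
  public_key = private_key.public_key()
  try:
      public_key.verify(signature, b"tampered photo bytes")
  except InvalidSignature:
      print("content does not match the publisher's signature")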

tao,

@dpwiz Badly designed cryptosystems can be broken in a number of ways, but well designed ones, particularly ones with a transparent implementation and selection process, are orders of magnitude more secure. Breaking SHA-2 for instance - which the C2PA protocol uses currently - would not simply require state-level computational resources, but a genuine mathematical breakthrough in cryptography.

Perhaps ironically, reaching the conclusion "all cryptosystems can be easily broken" from historical examples of weak cryptosystems falling to attacks is another example of eliminating false positives (trusting a cryptosystem that is weak) at the expense of increasing false negatives (distrusting a cryptosystem that is strong).
