not-that-verifiable

November 21st, 2025

Verifiable games look easy to learn

In recent years, there has been a growing consensus that certain tasks can be verified in a closed training loop, notably:

• Programming tasks: if an LLM writes a program, it can be interpreted and hence yield a verifiable output

• Mathematics tasks: if an LLM writes the proof of a statement in e.g. Lean, that proof can be mechanically checked by the proof verifier

Of course, the general isomorphism between programs and proofs (the Curry-Howard correspondence) makes these two examples in some sense the same one (though the relevant tasks in the two fields are vastly different)
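As a small illustration of that isomorphism, here is a sketch in Lean 4 (the names swapAnd and swapPair are made up for this example): the same term shape is once a checked proof and once an ordinary program

```lean
-- Read as logic: a proof that A ∧ B implies B ∧ A
theorem swapAnd (A B : Prop) : A ∧ B → B ∧ A :=
  fun ⟨a, b⟩ => ⟨b, a⟩

-- Read as programming: a function that swaps the two components of a pair
def swapPair (α β : Type) : α × β → β × α :=
  fun (a, b) => (b, a)
```

In both cases the checker only confirms that the term has the stated type; it says nothing about whether the statement was worth proving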

Verifiable -> Powerful Training Loop

Programmatic verifiability of course brings a lot of things in line, and it sidesteps many problems associated with the scarcity and the quality of data about the rest of the world: with a bit of creativity/cleverness, we can run an endless training loop that keeps producing new verified examples and reliably see models improve in measurable ways
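To make the notion of a verifiable signal concrete, here is a minimal sketch in Python of the kind of reward such a loop could use for programming tasks. Everything in it is an assumption for illustration: the name verifiable_reward, the (args, expected) test-case format, and the use of a bare exec (a real loop would run untrusted code in a sandbox)

```python
from typing import Any, List, Tuple

Case = Tuple[tuple, Any]  # (positional arguments, expected return value)


def verifiable_reward(candidate_source: str, func_name: str, cases: List[Case]) -> float:
    """Return the fraction of test cases that the generated code passes."""
    if not cases:
        return 0.0
    namespace: dict = {}
    try:
        # ASSUMPTION: no sandboxing here, for brevity only.
        exec(candidate_source, namespace)
        func = namespace[func_name]
    except Exception:
        # Code that does not even load (or lacks the function) earns nothing.
        return 0.0
    passed = 0
    for args, expected in cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crash on one case counts as a failure on that case
    return passed / len(cases)


cases = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]
good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b):\n    return a - b\n"
broken = "def add(a, b:\n"

good_score = verifiable_reward(good, "add", cases)
bad_score = verifiable_reward(bad, "add", cases)
broken_score = verifiable_reward(broken, "add", cases)
print(good_score, bad_score, broken_score)
```

Note that this reward only sees behavior on the test cases: two candidates with the same score are indistinguishable to the loop, however different they are to read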

As a math professor I knew would say (with a heavy Swiss-German accent) about the plethora of average-quality math results: "yeah, yeah, and there are also two very large numbers that have never yet been multiplied by each other!"

There is a deeply satisfying, if superficial, sense that "this problem is perfectly circumscribed" and that machines can become super-human at it, just as they have become super-human at games like go/chess

This essay is about the fact that there is a substantial difference between math/programming and go/chess, which makes any naive training loop based on reinforcement learning bound to give ultimately disappointing results

Loop -> super-human coders/mathematicians?

The reason is hidden objectives: even though the only goal of go/chess play is victory, the job of a mathematician is not only to provide valid mathematical proofs and the job of a (good) programmer is not only to write code that executes correctly (note that guaranteeing this alone is much harder than verifying a theorem's proof)

That these are not really the jobs of mathematicians/programmers is obvious to the practitioners (I am not sure how people who don't practice these jobs view them)... yet it may not be obvious what the objectives are, or whether they can be implemented in a closed RL loop

For instance, one could say that a mathematician's job is to do work that gets their colleagues excited, or that other scientists find useful or insightful

Similarly, one could say that the goal of good code is to be maintainable, reusable, and didactic, for other humans or for other models

It would be fair to say that at a minimum mathematicians should produce valid (formal or informal) proofs of mathematical statements, and that programmers should produce code that runs properly, but that within these constraints, there are numerous additional nontrivial objectives

Commonly Accepted Additional Math Objectives

For mathematics, a commonly accepted and well-defined objective (one that is currently crucial to a mathematician's career) is to find the solution to an important unsolved problem that has already been stated (where important often means old and famous), or the solution to a problem that seems in line with such 'famous problems'

This naturally limits triviality, and gives a meaning to the concept of progress, though the meta-problem of "important problem" creation is sidestepped

Commonly Accepted Additional Code Objectives

In mathematics, this objective can still lead to fairly undesirable outcomes, with extremely long and technical solutions to problems that no one reads, leaving us to wonder who has read the proof and got something from it (a fairly widespread perception in math); it also only depletes the pool of important problems without feeding anything back into it

It is clear that if some programming project comes with documentation, tests, and use-cases, and is easy to read and maintainable, then it is good code (relative to the problem that one is trying to solve), and this can in principle be modeled if one knows what modifications the code is likely to need

In essence, a piece of software should be useful for a given task that some entity may care about (it makes a difference for the world that this piece of software exists)

Often the importance of important code lies in the formulation of the problem that is being solved (e.g. the code of the Bitcoin protocol is perhaps not very surprising given a framing of the problem, but the truly nontrivial part of it is the problem being solved)

This brings us to the question of 'important code'; maintainability should be evaluated in that light: how much would a piece of code need to be modified if we slightly changed its purpose towards a different important purpose?
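One crude way to put a number on "how much would the code need to be modified" is a line-level diff between the original code and its adaptation to the new purpose. The sketch below uses Python's standard difflib; the name modification_cost and the toy load/summarize example are assumptions made up for illustration, and a line diff is only a rough proxy for the effort a human or a model would actually spend

```python
import difflib


def modification_cost(original: str, adapted: str) -> float:
    """Rough share of lines that changed: 0.0 means identical, 1.0 means rewritten."""
    a = original.splitlines()
    b = adapted.splitlines()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    # ratio() is 2 * (matching lines) / (total lines in both versions)
    return 1.0 - matcher.ratio()


# Original purpose: report the mean of the values in a file.
original = """def load(path):
    with open(path) as f:
        return [float(line) for line in f]

def summarize(values):
    return sum(values) / len(values)
"""

# Slightly different purpose: report (roughly) the median instead.
adapted = """def load(path):
    with open(path) as f:
        return [float(line) for line in f]

def summarize(values):
    return sorted(values)[len(values) // 2]
"""

cost = modification_cost(original, adapted)
print(f"modification cost: {cost:.3f}")
```

Under this proxy, code whose structure isolates the part that depends on the purpose (here, summarize) gets a low cost for nearby purposes, which is one reading of maintainability in the sense above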

So, it all comes down to people, doesn't it?

In some sense, I think a first instinct about all of these things is that closed RL loops without a connection to the real world are doomed to only give subpar results

And it is only within that framework that the quality of a piece of code can be appreciated: how easily the code can be made to do some other useful task

Tools like Cursor generate fine-looking and working code, but they create technical debt and fall well short of what really good humans can produce for other humans: a recent example is Karpathy's nanochat, which is insightful, elegantly written, and pedagogical in a way that an LLM could not have produced, even though an LLM can (probably) generate code that executes the same way

In some sense, what we see now with tools like GPT-5 Pro is that they can solve math problems, but rarely bring us results that we are excited about

A (boring?) simple and straightforward answer is that humans decide what is important to them and that this is the additional metric that cannot be included in a closed RL loop

... then there is the question of which humans matter, and then it becomes a question of social validation, which means a community (and so to some extent consensus and money)

It all comes down to consensus, doesn't it?

In some sense, social consensus (or money, to simplify) is a closed-loop answer in itself (at the scale of the economy), one that is self-sufficient on its own terms (though some people would find this a distasteful take)... which is a shame because the economy is notoriously impossible to understand mathematically, so that no closed RL loop seems able to reproduce it

This is probably true in the end (at least until the day the economy becomes dominated by AI agents), but thinking that no RL loop can do better at approximating what we would expect to gain social consensus is too pessimistic

For instance, we could imagine that an agent writes a piece of code so that another agent can reuse this code with minimal modifications to make something else, and gamify this

Or one could ask an agent to create a concise piece of math that allows (counterfactually) an agent who has seen it to solve some natural problems that a version of the agent who has not seen it cannot solve
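The counterfactual idea can be sketched as a reward in a few lines of Python. This is only a toy under stated assumptions: toy_solver is a hypothetical stand-in for a real agent, and the names counterfactual_uplift, concise_uplift and the length_penalty value are invented for illustration

```python
from typing import Callable, Optional, Sequence

# A solver takes a problem and an optional note, and reports success or failure.
Solver = Callable[[str, Optional[str]], bool]


def counterfactual_uplift(solver: Solver, problems: Sequence[str], note: str) -> float:
    """Share of problems solved with the note minus the share solved without it."""
    with_note = sum(solver(p, note) for p in problems)
    without_note = sum(solver(p, None) for p in problems)
    return (with_note - without_note) / len(problems)


def concise_uplift(solver: Solver, problems: Sequence[str], note: str,
                   length_penalty: float = 0.001) -> float:
    """Same uplift, minus a small per-character penalty to favor concise notes."""
    return counterfactual_uplift(solver, problems, note) - length_penalty * len(note)


def toy_solver(problem: str, note: Optional[str]) -> bool:
    # HYPOTHETICAL agent: "easy" problems are always solved, the others
    # only when the note it has seen mentions the problem by name.
    if problem.startswith("easy"):
        return True
    return note is not None and problem in note


problems = ["easy-1", "hard-a", "hard-b", "hard-c"]
note = "lemma covering hard-a and hard-b"

uplift = counterfactual_uplift(toy_solver, problems, note)
penalized = concise_uplift(toy_solver, problems, note)
print(uplift, penalized)
```

The subtraction is the important design choice: a note gets no credit for problems the agent could already solve, and the length penalty keeps the agent from simply pasting in full solutions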

So... this is how we get much closer to xent games, but more about this in a future post!

The lists below are by no means exhaustive, but they are meant to give an idea of the challenges

In pure mathematics, it is often the case that the context and the time when a problem was raised really matter for "importance": for instance, if the problem naturally arises within the solution to an important problem, or is asked by someone who solved an important problem, it becomes more "important"

These are examples of pieces of code and math I would love to see!