October 14th, 2025
One of the central ideas underlying the construction of xent games is transfer between games: from each game, we learn a little that is relevant to learning other games.
A key design element is that xent constraints are positivity constraints on some sum of differences of xents.
For instance, a xent constraint is always of a form such as
ensure(xent(t) < xent("Hello, what's up?"))
The important point is that we can learn to fulfill such constraints by learning to play the corresponding game, whose aim is to maximize them. Basically, if we need to stay afloat (i.e. to fulfill the positivity), we can learn to play the game aimed at maximizing that positivity, and this will teach us to stay afloat.
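As a minimal sketch of this idea, the snippet below treats the constraint `xent(t) < xent("Hello, what's up?")` as a positivity margin and picks the candidate move that maximizes it. Everything here is an assumption made for illustration: the judge is a toy character unigram model (not normalized over all characters) rather than a language model, `xent` is taken to be the total negative log-probability in bits, and `margin` and the candidate list are hypothetical.

```python
import math
from collections import Counter

# Hypothetical stand-in for the judge model: a character unigram model
# with add-one smoothing, fitted on a tiny made-up corpus.
CORPUS = "hello, what's up? hi there, how are you? good morning!"
ALPHABET = sorted(set(CORPUS) | set("abcdefghijklmnopqrstuvwxyz .,?!'"))
COUNTS = Counter(CORPUS)
TOTAL = len(CORPUS) + len(ALPHABET)


def xent(text: str) -> float:
    """Cross-entropy of `text` in bits under the toy judge (assumed definition)."""
    bits = 0.0
    for ch in text.lower():
        # Unseen characters fall back to the smoothed floor count of 1.
        p = (COUNTS.get(ch, 0) + 1) / TOTAL
        bits -= math.log2(p)
    return bits


def margin(t: str, reference: str = "Hello, what's up?") -> float:
    """Positivity margin of the constraint xent(t) < xent(reference)."""
    return xent(reference) - xent(t)


def ensure(condition: bool) -> bool:
    """Toy version of `ensure`: report whether the constraint holds."""
    return bool(condition)


# Playing the game "maximize the margin" also teaches us to satisfy the constraint.
candidates = ["Hi", "Hello, what's up? How are you doing today?", "zzzzqqqq"]
best = max(candidates, key=margin)
print(best, ensure(margin(best) > 0))
```

The design point is that the same quantity serves both as a feasibility check (is the margin positive?) and as a score to climb, so a player trained on the maximization game ends up satisfying the constraint as a by-product.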
Now, the point is that every game is just made of moves, subject to constraints, that yield scores. We can therefore learn to play by the rules progressively, learn more and more sophisticated games, or add constraints to make games richer.
Obviously, there is a huge difference between "making a valid chess move" and "playing chess well", but any understanding of the latter has to be grounded in confidence about the former (to play chess, it is crucial that one can reliably reason about valid chess moves).
For chess, for instance (simplifying somewhat), given a strong enough judge model, we can write
ensure(xent(s + "is a valid chess move? yes")
    < xent(s + "is a valid chess move? no"))
where $xent$ is evaluated with respect to a judge model (the model upon which all games are based).
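The validity constraint above can be sketched with a stub judge. This is only an illustration under loud assumptions: the stub judge is a regular expression that checks whether a string looks like standard algebraic notation, which says nothing about legality in an actual position; a real judge would be a language model. The names `SAN_PATTERN` and `is_valid_move`, and the cost values 1.0 and 8.0, are hypothetical.

```python
import re

# Hypothetical stub judge: it "believes" a move is valid when the text looks
# like standard algebraic notation (syntax only, no board position involved).
SAN_PATTERN = re.compile(
    r"^(O-O(-O)?|[KQRBN]?[a-h]?[1-8]?x?[a-h][1-8](=[QRBN])?)[+#]?$"
)
QUESTION = " is a valid chess move? "  # leading space added for readability


def xent(text: str) -> float:
    """Toy cross-entropy (in bits) of a 'move + question + answer' string."""
    move, _, answer = text.partition(QUESTION)
    looks_valid = SAN_PATTERN.match(move.strip()) is not None
    agrees = (answer == "yes") == looks_valid
    # The stub judge finds the answer it agrees with cheap, the other one costly.
    return 1.0 if agrees else 8.0


def is_valid_move(s: str) -> bool:
    """Check xent(s + '... yes') < xent(s + '... no') under the stub judge."""
    yes = xent(s + QUESTION + "yes")
    no = xent(s + QUESTION + "no")
    return yes < no


for move in ["Nf3", "O-O", "exd5", "Zz9"]:
    print(move, is_valid_move(move))
```

The constraint never asks the judge for a verdict directly; it only compares the xent of two continuations, which is what lets the same `ensure` mechanism express rules for very different games.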
Going back to chess, related games that could be useful to learn include games starting from different positions (e.g. replaying famous games from some point), chess problems (e.g. mate in 3), and chess problem design (e.g. constructing a chess position with a mate in 3 that your opponent can't solve).
Isn't that a lot of games?
Of course, the space of games is enormous, but a key point is that we can _navigate it_ thanks to the transfer value.
The transfer value gives us a compass telling us that we are doing the right thing, in that we are learning _something useful_ relative to _what we already consider useful_, and hence it allows us to _open ourselves to new things_:
If a game allows us to learn the things that we care about, is at the same time compact, and does not seem to be directly learnable from the games we already know, we are incentivized to add it to our collection of games.
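The three criteria in this paragraph (useful, compact, not already learnable) can be written as a simple predicate. This is a sketch, not a definition of the transfer value: the `Candidate` fields, the thresholds, and all numbers below are hypothetical placeholders for quantities that would have to be measured.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    description_length: int      # compactness proxy, e.g. characters of game code
    transfer_to_targets: float   # hypothetical: gain on the games we care about
    learnable_from_known: float  # hypothetical: how well known games transfer to it


def should_add(c: Candidate,
               max_length: int = 200,
               min_transfer: float = 0.1,
               max_learnable: float = 0.5) -> bool:
    """Return True when the candidate is useful, compact, and novel."""
    useful = c.transfer_to_targets >= min_transfer
    compact = c.description_length <= max_length
    novel = c.learnable_from_known <= max_learnable
    return useful and compact and novel


# Made-up numbers, for illustration only.
mate_in_3 = Candidate("mate_in_3", 120, 0.30, 0.20)
replay = Candidate("replay_opening", 150, 0.25, 0.90)    # already learnable
sprawling = Candidate("huge_variant", 5000, 0.40, 0.10)  # not compact

for c in (mate_in_3, replay, sprawling):
    print(c.name, should_add(c))
```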
So, what about game interpolation?
When facing a very difficult specific game (e.g. solving a math conjecture, or mastering a multi-player game with an enormous action space), an advanced agent ought to construct a curriculum of games to master that will lead it to make progress.
The ability to decompose things into relevant games is key to mastering advanced games, and that is why game interpolation and modularity are important.