PROMATHEMATHEON

An essay by Chris Pang

Incipit discvrsvs pro mathema theon · The first draft of this essay was written 8th May 2025 – 14th May 2025 in Cambridge, UK. The coda on culture and the humane application of artificial intelligence (beginning after “A Prospectus, Again”) was written 14th May 2025 – 16th May 2025. This edition was prepared for publication on the Internet on the 29th of June, 2025. Minor corrections have been made, and will continue to be made as they are presented to the author.

Table of Contents

Abstract

We think of optimisation as a powerful and scary force. Some think it almost demonic, waiting in the shadows to grind our values and bodies to dust. Some attribute it to capitalism, to evolution, or to the human desire for perfection. In fact, it is the default way anything comes into existence at all1. In this work I go through the layers of optimisation and evolution that led to our current existence. I relate this to how we are currently constructing a further layer of intelligent optimisation, which we call AI. I consider the implications for safe AI systems from this perspective, as well as the common fear that the AI systems we make will be alien because of the optimisation forces we place on them. Finally, I consider the causes behind society’s current failure to make sense of itself, how AI is already intervening, and how it might be used to help rather than harm.

Introduction

The following is my attempt to make some comprehensive record of my intellectual perspective, and also a justification for the current path of research I am undertaking. It contains a series of sketches of various phenomena2 which I find united by similar mechanisms. The title of this work is to one day be the “Mathematheon”, a neologism. The stem comes from the Greek “mathema”, meaning knowledge, the result of learning, or mathematics. The suffix “-theon” is also Greek, and refers to the gods. I want to describe the knowledge one might gain from a top down, gods’ eye view of reality—and what I hope to do about it, in relation to the development of powerful AI systems in our present. What phenomena permeate through time? What high level patterns emerge from chaos? What are we accelerating towards, waiting for us in the cold dark place we call the future?

Being undisciplined in thought and reason, I must first produce the work that precedes those powerful realisations. Hence it is the promathematheon, as a prologue is to a novel.

A Declaration of Ignoramus

As with any work written in youth and in haste, many errors have been introduced, both from failures in understanding and also from simple ignorance. For example, it was only after I completed this exercise that I discovered the work of Peter Putnam and began to read Max Bennett's A Brief History of Intelligence. Together they offer an elegant and powerful answer to my vague musings about cognitive development. Thus also with philosophy, and economics, and any other discipline which I have lightly touched upon here.

There are many others whose subject-specific works I find far more elegant than mine. Some of their teachings I acknowledge through epigraphs. Far more I cannot, thanks to forgetfulness and, again, simple ignorance. Nevertheless, I am determined to put this text into a mostly-fixed form now. If I don't, it will never be finished, and I think that in the total picture of this essay there is merit that somewhat makes up for the grave technical errors in any one corner.

So I ask you, reader, to treat my words as a first stab and a preliminary treatment. Nothing here merits the status of fact, or even that of consistent theory. This essay should be seen as a collection of metaphors which happened to inspire me. With time, I hope to do these metaphors justice and complete my learning. Still, I hope there is something here for you to enjoy and be inspired by as well. I am eternally grateful to you for taking the time to read my words and share in my excitement.

A beginning is a very delicate time

And yet, if we may without unfitness compare God's present and man's, just as ye see certain things in this your temporary present, so does He see all things in His eternal present.
— Boethius, The Consolation of Philosophy (trans. H.R. James)

For a moment, try to see the universe as Boethius’ God does. Here is a method: let us say that the universe is the size of a Rubik’s cube. Now imagine a great row of such universe-cubes laid out in front of you. The leftmost cube is t=0, the one next to it on the right-hand side is t=1, the one to the right of that t=2, and so on until infinity.

Let us start with an empty universe with no matter of any sort in it. A purely blank volume of space, making our cubes small and perfectly dark. In this empty universe the cube “t=0” would be exactly the same as cube “t=143”, and the same as cube “t=10000”. Effectively, the entire timeline consists of copies of the t=0 cube.

Now let us imagine a new line (timeline #2) of universe-cubes. This world is different: There are particles within this universe that move randomly under Brownian motion. Now t=42 is no longer the same as t=666. But still, it is hard to distinguish one universe-cube from another.

Now suppose that at t=5000 some particles in timeline #2 form into a teapot shape and hang together until t=10000. Now, all of a sudden, the timeline is split into the pre-teapot, teapot, and post-teapot eras. Intuitively, we can say that something is happening in the 5000–10000 range that is meaningfully different from the ranges before and after it. The teapot shape sticks out as we scan across the cubes. Indeed, we can define a thing as a physical phenomenon that holds together and maintains its shape over time. Put another way, time is a filter that distinguishes things from not-things, the long-lasting from the merely ephemeral. Tempus, edax rerum3.

The next insight we can discover is that, merely by moving rightward along timeline #2, we can engage in a kind of play or combinatorial exploration. The particles in each universe-cube, moving freely and bumping into each other, naturally “try out” many thousands or millions of combinations as to how they might fit together. Individually, the chance of any combination having a persistent structure that survives the test of time is almost zero. Still, over a long enough period of time some structure will be produced that has the ability to hold itself together—perhaps even one shaped like a teapot. And, if the chance of randomly assembling some persistent thing is not too dismal, and the structures so produced are not too fragile, eventually things may last long enough to interact with other things. Once they do, these random collisions become themselves reasons for things to fall apart. In a world with multiple things, time becomes a form of selection pressure: Those that can persist, will persist. Those that cannot will return to the dust from whence they came.
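This sieve can be sketched in a few lines of illustrative Python. Everything here is invented for the sketch: each “thing” is reduced to a single cohesion number, and each era delivers a random shock that culls whatever cannot withstand it.

```python
import random

random.seed(0)

def persistence_filter(n_things=1000, n_steps=100):
    """Time as a sieve: randomly-assembled 'things' (here, bare cohesion
    values) face repeated random shocks; only those cohesive enough to
    weather every shock so far remain in the timeline."""
    things = [random.random() for _ in range(n_things)]  # chaotic assemblies
    for _ in range(n_steps):
        shock = random.random() * 0.5  # each era delivers a random bump
        things = [t for t in things if t > shock]
    return things

survivors = persistence_filter()
# What remains after many eras is, overwhelmingly, the highly cohesive.
```

Note that no goal is specified anywhere; persistence alone does the selecting.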

Life finds a way

Recall that the fragile wants tranquility, the antifragile grows from disorder, and the robust doesn’t care too much.
— Nassim Nicholas Taleb, Antifragile

Thus far we have only considered one of the ways to survive the test of time, which is robustness. In the world of timeline #2, merely holding yourself together and surviving the occasional bump from another object is considered a victory. But what if there were a more powerful way of fighting time, the eater of all things? Life can be understood as deploying two types of defence against impermanence and falling apart. The first is reproduction, and the second is adaptation. I consider reproduction first in this section4.

Let us move from the universe-cubes to our own world, before the first forms of life arose. The environment is much more friendly to complex objects appearing than timeline #2. Our earth is filled with active and reactive compounds circulating in the atmosphere and the oceans, with ample heat to serve as binding and activation energy. A soup of chaos, teeming with potential. At the same time, any new thing-to-be will collide with something—probably violently—much sooner than it would in the cold void of timeline #2. In such a world, what persists the best?

To prevent a series of pedagogical examples extending the rest of the text indefinitely, I now move to consolidate the question we have been informally considering in formal terms. Let us grant that there is a wide space of possible combinations for, let us say, carbon compounds. Then we can naturally create a coordinate space in which each point is assigned a code representing some combination of atoms. Points close to each other would have similar configurations of atoms, as defined by a string metric like edit distance. Now to this space we add a y-axis, which is the average survival time of such a compound in the primordial soup (we do not consider single atoms, which by nature have a vastly longer survival time than any compound). Normally, then, we say that any compound that forms is formed by accident, and will probably lose parts over time through collisions until it fully disintegrates, with weaker bonds breaking first. Yet there are also more interesting peaks in this space, local maxima with complex structures that might be able to withstand a few knocks. And of these interesting and structured things, those that can withstand some degree of deformation without breaking will make much better use of their molecular budget than those that are merely rigid. All things being equal, a spine-like flexible structure floating in a chaotic soup will survive for much longer against random shocks than an unyielding stick of a similar size and base material.
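The “string metric like edit distance” can be made concrete. Below is a standard Levenshtein distance in Python; the compound codes in the usage example are invented purely for illustration, not real chemistry.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of insertions,
    deletions, and substitutions needed to turn code a into code b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute
        prev = curr
    return prev[-1]

# Nearby points in the space differ by few operations:
print(edit_distance("CCHHO", "CCHHN"))  # one substitution apart
print(edit_distance("CCHHO", "NNNNN"))  # entirely different codes
```

Under this metric, “close” points really are small structural variations of one another, which is what makes gradual exploration of the space meaningful.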

Now we can apply the sieve of time-the-eater and get at least some interesting things that persist for some time. But we are still stuck with a painfully slow process. After all, every time we make an interesting thing we have to essentially start from scratch, making every new thing by randomly bumping parts together like we’re still stuck in timeline #2. Sure, we now have many more interesting base parts to work with, and can put them together with much more frequency and force, but it is still a time-consuming and effectively random process.

Now suppose that, somewhere in this coordinate space of possible compounds, there is a creator-thing that, when it encounters sufficient materials, can “fabricate” another thing. Say that for example it is like an enzyme, which takes in “pieces” which are other molecules and allows them to combine into a “product” compound with less activation energy required. Such a thing would, while it exists, make the creation of the product compound more likely in our random soup of chaos.

But we can go further. Suppose that the product of our creator-thing is a part of the creator-thing, or even (if the creator-thing is sufficiently complex) a complete copy of the creator-thing. If this seems unlikely, remember that our space contains all possible compounds of atoms, and that we have quite a long time to work with.

What would the effects of this creator-thing be? Once we create such a compound via random combination, it quickly gets to work making the creation of itself more likely. If we are unlucky, our replicating candidate returns to the soup from whence it came. If we are lucky, we get two, four, eight, sixteen copies of the thing, all ready to make more copies. An explosion is underway: We have, by pure chance, found one of the key strategies for surviving time-the-eater. In fact, if we allow time to fast-forward, the chance that we land upon one such compound increases slowly but surely, and once even one is born, the number of such things in the local environment increases drastically. Time, which so solemnly and ruthlessly eliminates the transient, makes inevitable the emergence of the persistent5.
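Why the explosion is drastic can be seen at the level of expectations alone: if in each interval a fraction of copies replicate and a fraction are destroyed, the expected count compounds geometrically. The parameter values below are arbitrary stand-ins.

```python
def replicator_growth(steps=20, p_copy=0.6, p_decay=0.2):
    """Expected lineage size of a single replicator: each step the
    population is multiplied by a net growth factor of
    1 + p_copy - p_decay."""
    count = 1.0
    history = [count]
    for _ in range(steps):
        count *= 1 + p_copy - p_decay  # compounding, not addition
        history.append(count)
    return history

history = replicator_growth()
# Twenty steps at a net factor of 1.4 already yields hundreds of copies.
```

The non-replicating things of timeline #2 accumulate, at best, additively; a replicator turns the same lucky accident into a multiplier.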

Wheels within Wheels and Sieves within Sieves

So we have seen how time-the-eater, placed against a backdrop of random chaos, selects first for things which persist in time and then for things which make the creation of similar things more likely. We now turn to the nature of the creator-things which time-the-eater has favoured. What forms will these creator-things take?

It is at this point that time-the-eater assumes its more customary name: evolution. Up to now I have resisted calling time-the-eater “evolution” because of our fixed view of evolution as a phenomenon, almost as a culling and selecting farmer who breeds the creatures it deems “most fit” with regards to some environment. Of course evolution is no such thing, because evolution is not a thing at all. It does not meaningfully exist or operate in the sense we are used to; it is rather a non-thing that emerges only in the penumbra between creator-things and their offspring.

When creator-things encounter random shocks served up by their environment, those creator-things better equipped to weather those shocks are allowed to continue replicating and creating. Combined with the random soup-of-chaos process which produces variants of creator-things and shocks that alter creator-things by chance as they live and make copies of themselves, we now have all three parts of the standard description of evolution: variation, replication, and selection.

Yet, as I have taken pains to highlight up till now, none of these forces are in any sense bound together as a tight, thing-like package—instead we might understand them as a sort of useful fiction, like that of centrifugal force. If the environment does not serve up random shocks, evolutionary selection is not in action6. If the random shocks delivered do not alter the creator-things, then there will be no variation unless an entirely new form of life arises ex nihilo. And if the creator-things are never produced by chance to begin with, they will never live to replicate and begin the entire process. But also, because this fictitious force does not originate from any material substrate, “evolution” is always working so long as those three elements are in play, no matter what the creator-things are or what the environment is7.

Cifra, Ergo Sum

On the burning February morning Beatriz Viterbo died, after braving an agony that never for a single moment gave way to self-pity or fear, I noticed that the sidewalk billboards around Constitution Plaza were advertising some new brand or other of American cigarettes. The fact pained me, for I realised that the wide and ceaseless universe was already slipping away from her and that this slight change was the first of an endless series. The universe may change but not me, I thought with a certain sad vanity.
— Jorge Luis Borges, The Aleph

Of life’s secret weapons—replication and adaptation—we have considered replication. Now we apply a similar treatment to the matter of whether creator-things can adapt to their environments, thereby (eventually) demonstrating some form of intelligence. Let us not begin by setting any firm definition of “intelligence”, which would force the conclusion one way or another based on the specifics of the definition. Rather let us consider how adaptive behaviour might arise from the processes we have described, which involve variation, replication, and selection; the harnessing of chaos in the service of evading time-the-eater.

Let us begin with the simplest case, the design of a physical mechanism that has some adaptive function. If we consider our enzyme-like thing, we might say that we want some kind of filter which blocks molecules of the wrong shape from making use of the binding site. Using our previously established creative process, this would take a very long time. A creator-thing replicates parts of itself or complete copies of itself, which is how we escape the need for parthenogenesis8 over and over again out of the soup-of-chaos. And since the vast majority of possible structures are not self-replicating, any random environmental shock to this process probably makes any resultant creator-thing worse at replicating itself, and probably does not give it any superior performance compared to the unmodified creator-thing. There are many ways down a craggy mountain, and few ways to climb it safely.

However, there is a trick by which we can get around the need to directly modify creator-things to pass down complex modifications. Suppose that there are many different enzyme-parts floating around in the soup-of-chaos (produced, of course, by other enzyme-like creator-things). If there is a way to encode which parts a creator-thing can or should combine (for example, by use of chemical signals that open or close certain binding sites), then it becomes much easier to experiment with different combinations of parts. This code does not need to specify the complete creator-thing, merely which of several interchangeable or similar parts should be involved in the replication process.

From an evolutionary perspective, a partial or complete encoding also makes it easier to replicate creator-things, since it effectively acts as a “summary” of the unique characteristics of some given creator-thing: in theory, given a complete encoding of a creator-thing all that needs to survive is the code and the assembler that executes the code—the rest of the creator-thing can be destroyed and no harm will come to future creator-things. In practice, of course, no encoding is perfect, and some reliance on environmental information and preconditions is always needed.

Encoding is a key part of the development of what we call life, because it enables both more reliable replication and more flexibility in how replication happens, leading to vastly more efficient adaptation. While it is true that most shocks to DNA produce neutral or harmful encodings, it is orders of magnitude easier to produce a beneficial encoding by mutating DNA than by, for example, hitting an amoeba until it develops some beneficial feature from repeated blunt-force trauma. To be direct with my metaphor, we can imagine a similar possibility space to that of the atomic compounds in the primordial soup, where the y-axis is the average time until destruction (death) without successful replication and the points on the plane are string encodings of the structure of creator-things.

Effectively, encodings form a second layer of order on top of the first layer (creator-things) and the zeroth layer (things in general), since they in themselves have no particular significance but when combined with a compatible creator-thing can be used to produce creator-things of infinite varieties. Similarly, creator-things in a void are useless, but in an environment with the correct atomic resources they can make more copies of themselves. In each case, out of a base environment of seemingly endless chaos a higher order is created, with no guiding will except that of time-the-eater.

The Origin of the Mind

Now that we have encodings (partial or complete) to work with and mutate, the work of life can speed up immensely. No longer are we bound by the sieve of time-the-eater and the Brownian motion of timeline #2. Now the bounds are how fast a given creator-thing can replicate, the amount of resources with which to replicate, the selection pressures of the creator-thing’s environment, and the mutations a creator-thing can make to adapt to its environment. Just as things can bump into each other and give each other random shocks in the void of timeline #2, our creator-things now replicate and compete with each other using the limited resources in our proto-Earth. There’s only so much carbon or phosphorus to go around, after all.

In this new, yet more dynamic, and yet more competitive environment, the use of DNA and similar encodings to try out different combinations becomes inefficient. Random mutations can supply at most a small number of changes for every generation of a living thing. Most mutations are harmful or useless. Therefore, there is a “speed limit” on how fast organisms can adapt. It would be much better if organisms had some innate capacity for fast adaptation, a flexible dimension that allowed them to test out new ways of adapting to and processing the environment around them, all without making them wait until the next generation.

To be clear, much has already been achieved even without this next breakthrough. Bacteria, for example, can move towards food (and away from toxins), thereby collecting the resources needed to replicate while avoiding negative shocks. What I am describing is some method that allows an entity to survive sudden changes to their environment, such as a fire or rainstorm. It might also help them decisively overpower a rival species and secure access to some resources. For this you need something that works faster than reproduction but also something that has some time-sensitivity: DNA and encodings treat the environment as a relative constant to be optimised against; they never “forget” any beneficial adaptation that is not actively selected against. We will need a more nuanced understanding of what elements in an environment are stable or temporary to get to the next level of adaptivity and survival9.

From this perspective, the development of neural circuits, entanglements, and eventually brains becomes almost inevitable10. We can imagine a brain as a canvas upon which temporary adaptations can be tested out and, if beneficial, made persistent (in other words, learned). Again, the base from which these temporary adaptations are constructed is chaos, stimuli passed through neurons that fire reflexively in tangles. So long as the brain is equipped with feedback signals to indicate reward or punishment, however, we can create a third level of order on top of level two (genes and encodings), level one (creator-things), and level zero (things). This complex level of manufacturing is only possible thanks to the enhanced variational flexibility our level-two encodings gave us. Can you imagine, from merely unmotivated random dust and gas collisions, a complete brain forming in timeline #211?

Learning, Deep and Shallow

Now we come to the intersection of this work (which thus far has mostly been a strange alternative narrative of abiogenesis and evolution) with my regular occupation, that of the mechanisms of intelligence and learning. We begin by reinforcing that the learning we discuss here is merely another layer of adaptation above the rest: it fulfills the function of “helping a thing maintain itself in an environment”, and no more12. We are not yet speaking of conscious minds and great edifices of learning.

So, how does learning work? You have already seen something like it in action twice now. Consider a learned behaviour as a program13 with some string description. Now consider a coordinate plane whose points correspond to all possible learned programs, with points close by denoting similar programs via edit distance. And now, again, take the y-axis as the success level i.e. the reward administered by the organism’s internal sensing-organs14.

Yet a difficulty now presents itself. Programs, unlike the atomic compounds we started with, are quite specific in their construction. A single character difference could produce very different behaviour and probably break most programs. This was a benefit when we were considering variations in encodings, since it enabled variations to arise more efficiently across generations. Over the lifetime of a single organism, however, randomly updating its behaviour drastically at the time scale of seconds or minutes probably does not end well. Furthermore, the discrete nature of programs means that we cannot administer partial rewards for partially correct mutations. There is no such thing as a quarter of a symbol overwrite operation, or a fifth of a print statement. (On the other hand, you can develop a photosensitive cell long before you develop an eye.) For these reasons, the space of programs we have constructed contains many vertical spikes and sudden drops, making smooth climbing via variational exploration difficult. Perhaps programs are not the correct level of abstraction to describe learning15.
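The brittleness of program space can be demonstrated directly. The sketch below (an invented toy, not a claim about any real system) mutates one character of a tiny arithmetic program at a time and checks how often the result still runs; the large majority of single-character mutations break it outright.

```python
import random

random.seed(0)

def mutate(program: str) -> str:
    """Overwrite one random character with a random printable symbol."""
    i = random.randrange(len(program))
    return program[:i] + chr(random.randrange(32, 127)) + program[i + 1:]

def still_runs(program: str) -> bool:
    """Does the mutated program still evaluate without error?"""
    try:
        eval(program, {"__builtins__": {}})  # bare arithmetic only
        return True
    except Exception:
        return False

program = "(1 + 2) * (3 + 4)"
trials = [still_runs(mutate(program)) for _ in range(1000)]
survival_rate = sum(trials) / len(trials)
# Most single-character mutations are fatal: the landscape is spiky.
```

Contrast this with a vector of continuous weights, where every small perturbation yields another valid (if slightly worse) behaviour; that contrast is the point of the paragraph above.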

What, then, should we use as our coordinate plane? Thankfully, the brain itself provides an answer. It is a dense tangle of neurons, each connected to a local neighbourhood that is itself connected to other local neighbourhoods16. Connections between neurons have varying strength and distance. Therefore, we can describe the brain in two rough spaces. The first space is at the local neighbourhood level, where points on the coordinate plane17 represent different configurations of neuron connection strength and physical distance18. The second is at the whole-brain level, a kind of hypergraph that tracks how different neighbourhoods interact with each other. We now apply Hoel’s technique19 and describe this second layer as a macroscale coarse-graining of the first. This means that we can, by partitioning the neurons into lobes and regions, identify partial functional descriptions for those regions that surpass the explanatory power of neuron-level simulations.

Our optimisation occurs via rapid local evolution in two kinds of spaces: in the first space, each part of the brain optimises alone for the best learning performance at some task based on the input and output signals it receives. Here the coordinates are (roughly) configurations of neuron connection densities. Thus, each part of the brain optimises in a partially isolated fashion. In the second layer, the parts of the brain negotiate between each other a balance of dynamics that enables a successful learning system to develop. This coarse-grained space has coordinates that correspond to configurations of the influences parts of the brain can have on each other. Now the y-axis for both of these spaces no longer represents the generational average survival time of the organism, but the strength of the reward signal issued by the brain’s steering system, the hypothalamus and the brain stem. Notably, the steering system does not have to learn from scratch how to perform its function; its behaviour is instead specified by the genetic encoding (similar to a bootloader in a computer)20. This gives us a robust optimisation target that is not easily corrupted or altered in ways that would waste our previous work.

Now we have laid a lot of groundwork, but made curiously little progress in describing the exact mechanism of biological “learning”. It is easy to imagine atoms smashing into each other or genes being overwritten, and hard to imagine what a “learning process” might look like. For those already acquainted with machine learning, it is (I believe) fairly well established that the brain does not apply standard gradient descent techniques. And indeed the class of algorithms that most closely replicates human learning (reinforcement learning) has many non-gradient-descent components, supporting our hypothesis.

Since we have identified no other path to progress besides that which has been highlighted by time-the-eater, and hence evolution, I venture that the same is true of how the brain learns. In this case evolution occurs in both of the spaces we have established at the same time. In both cases we see variation from neurons naturally moving, growing, connecting, and firing (this is especially frequent during our youth). Selection happens via the reward signal issued by our genetically-enforced steering subsystem. Replication comes from the fact that the reinforced connections are more likely to persist into the future, much like the assembled things in timeline #2 which do not make copies of themselves. Building on this analogy, it may be more accurate to say that the brain is a great and diverse ensemble of small and fluid neural mechanisms, all of them growing and changing based on the natural circulations of the brain21, some of which are selected for and reinforced by reward signals.
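This picture of an ensemble of fluid micro-mechanisms selected by a fixed reward signal can be caricatured in code. Everything below is a toy of my own construction, not a model of any real neural process: each “mechanism” is reduced to a single multiplier weight, and the hard-coded reward plays the role of the genetically-specified steering system.

```python
import random

random.seed(2)

def steering_reward(w, x):
    """Genetically fixed reward signal: how close the mechanism's
    output w*x comes to the behaviour the steering system wants
    (here, arbitrarily, doubling the input)."""
    return -abs(w * x - 2 * x)

def ensemble_learning(steps=200, pool=30):
    """Within-lifetime evolution: a pool of random micro-mechanisms is
    scored by reward; the best half persist (replication) and spawn
    slightly perturbed copies (variation)."""
    mechanisms = [random.uniform(-5, 5) for _ in range(pool)]  # chaos
    for _ in range(steps):
        x = random.uniform(-1, 1)  # a random stimulus
        mechanisms.sort(key=lambda w: steering_reward(w, x), reverse=True)
        best = mechanisms[: pool // 2]                  # SELECTION
        mechanisms = best + [w + random.gauss(0, 0.1)   # VARIATION
                             for w in best]             # REPLICATION
    return mechanisms

learned = ensemble_learning()
mean_weight = sum(learned) / len(learned)
# The surviving mechanisms cluster around the rewarded behaviour (w ≈ 2).
```

No mechanism is ever told what the target is; the reward signal merely decides which randomly-varied candidates persist, and the ensemble drifts towards the rewarded behaviour.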

Labour Ex Chao

Before we proceed to the developments downstream of this cognitive revolution, it is instructive to consider for a little while the precise meaning of what I have described. This idea of an ensemble of small and randomly-assembled things, from which we select those most suitable for some high level and complex objective, is one of the fundamental themes of this exposition. It is how the sieve of time-the-eater survives until the present day, and is the basis of optimisation. At the risk of repeating myself, I present again the three elements which determine this process:

  1. VARIATION: The ability to produce varied combinations (variations) of base elements. This requires access to some source of random fluctuations, whether those be uncertainties at the quantum level, shocks from a chaotic and fast-moving environment, or so on.
  2. REPLICATION: Some ability for the variations we produce to persist or duplicate themselves rather than returning immediately to the soup-of-chaos.
  3. SELECTION: Some filter that determines which of these variations survive.
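The three elements above can be wired into one minimal loop. The target below (bit-strings scored by how many 1s they carry) is an arbitrary stand-in for any selection criterion; the loop structure is the point.

```python
import random

random.seed(0)

def evolve(genome_len=20, pop_size=40, generations=60):
    """A minimal evolutionary loop: random bit-strings are varied by
    single-bit flips, replicated by copying, and selected by score."""
    fitness = lambda g: sum(g)  # the filter: count of 1s
    pop = [[random.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]          # SELECTION
        children = []
        for parent in survivors:                  # REPLICATION
            child = parent[:]
            i = random.randrange(genome_len)      # VARIATION
            child[i] ^= 1
            children.append(child)
        pop = survivors + children
    return max(sum(g) for g in pop)

best_score = evolve()
# The population climbs steadily toward the all-ones string.
```

Remove any one of the three commented elements and the climb stops: without variation the pool is frozen, without replication nothing persists, and without selection the walk is aimless.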

Thus, evolution can be described as a way of extracting useful work from random processes. This work can be physical, as in the construction of beneficial mutations, or computational, as in the process of searching for the optimal solution to some computational problem. In other words, evolution is the opposite of diffusion, culling the noise from a random background to extract a complete whole. Indeed, the sieve of time-the-eater allows us to propose a general solution to any NP problem, albeit an exponentially inefficient one22.
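The claim about NP problems is just generate-and-test: enumerate every candidate certificate (variation) and keep the first that survives the polynomial-time check (selection). A sketch for SAT, the canonical NP-complete problem, using the usual signed-integer convention for literals (1 means x1, -3 means NOT x3):

```python
from itertools import product

def sieve_sat(clauses, n_vars):
    """Exponential but fully general: try all 2^n truth assignments
    and filter by the polynomial-time clause check."""
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return bits  # the first assignment to survive the sieve
    return None  # no assignment survives: the formula is unsatisfiable

# (x1 OR x2) AND (NOT x1 OR x3) AND (NOT x2 OR NOT x3)
satisfying = sieve_sat([(1, 2), (-1, 3), (-2, -3)], n_vars=3)
# → (False, True, False)
```

The sieve is "general" precisely because it assumes nothing about the problem beyond the ability to check a candidate; its exponential cost is the price of that generality.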

Furthermore, because of the unfathomably large search space of the problems we are dealing with23, low level optimisation forces tend to produce optimisers themselves, leading to explosive changes in the environment24. Thus from things come creator-things, creator-things lead to encoded beings, and from beings we see minds emerge. At each step the goal is to produce an optimiser slightly more adapted to the problem at hand, which then produces its own inner optimiser25 further up the ladder26. This is also, incidentally, how I believe we came to be differentiated from the animals: within our brains (and absent from the brain of a cow) is a mesa-optimiser, borne of the convergent pressures of social conflict and rapid environmental change. It manages the managing of thoughts, and in doing so casts a strange second-order influence which we dimly witness and call the self.

Evolution and Optimization

We are almost done with the first part of this work. Now what remains is to tease apart a linguistic puzzle which I have had to address in some of the discussions which precipitated and followed this work. Namely, what is the relationship between evolution and optimization?

There is a certain view of evolution, which is generally known as gradualism, which treats it as a holistic broadening. Evolution in this view is an open ended search that produces variety as it goes, slowly enhancing the diversity of life on earth. This is usually contrasted against “closed” optimisations, such as those that optimise machine learning models to recognise cats. A dichotomy is constructed, with a benign sort of goal-free evolution on one side and closed, harsh, and aggressive utility optimisers on the other.

However, we have already discussed why this does not seem to match reality. If we consider evolution as a fictive force, and consider the components that make it up, it is clear that evolution is not a single open ended search algorithm in any meaningful sense. Rather, there are periods of diversification, when selection pressures are weak and replication is easy. These produce a rich array of possible organisms or things. Then, when selection pressures grow strong and replication becomes difficult, the things are winnowed down and those most able to resist the new shock from their environment persist into the new age27. Importantly, every selection is always based on the type of shock that occurred, and success is not defined by some generic concept of fitness but by the literal degree to which you can fit in and adapt to your new environment. If the local water source becomes more acidic, those organisms with a better response to acidic water live and replicate more efficiently. This is quite easy to measure—how acidic can the water get before you die? Peak fitness in this case is even more objective: if you can drink the newly acidic water and live healthily, you succeed. If not, you are penalised by how sick the water makes you. Beyond a certain acidity threshold you either find a different source of water or die. It is exactly the kind of loss-minimising pressure people call “closed optimisation” in machine learning systems28.
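The acid-water example can be rendered as exactly this kind of loss-minimising loop. Everything here (the trait values, the noise levels, the rate of acidification) is invented for illustration only.

```python
import random

random.seed(3)

def acid_selection(generations=40, acidity_rise=0.01):
    """Each organism carries an acid-tolerance trait; as the water
    acidifies, those below the threshold die and the survivors
    replicate with small random variation."""
    population = [random.uniform(0.0, 1.0) for _ in range(100)]
    acidity = 0.3
    for _ in range(generations):
        acidity += acidity_rise
        # closed optimisation: survival is a hard threshold on the trait
        # (fall back to everyone in the unlikely total-extinction case,
        # just to keep the sketch total)
        survivors = [t for t in population if t > acidity] or population
        population = [t + random.gauss(0, 0.02)
                      for t in random.choices(survivors, k=100)]
    return acidity, sum(population) / len(population)

acidity, mean_tolerance = acid_selection()
# The surviving lineage's mean tolerance tracks the rising acidity.
```

There is no open-ended search here at all: the environment poses a single scalar loss (distance below the acidity threshold) and the population is driven to minimise it, exactly as in a machine learning objective.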

Therefore, the dichotomy established by the gradualists is a false one. Evolution is less an “open search” and more about alternating cycles of relatively relaxed diversification in the “good times” followed by aggressive closed optimisation when novel environmental shocks and pressures arise. This holistic view will be important as we move to consider more man-made forms of evolution.

Accelerando con Intermezzo

Did you write the book of love
And do you have faith in God above
If the Bible tells you so?
— Don McLean, “American Pie”

> Welcome to the early twenty-first century, human.

> It's night in Milton Keynes, sunrise in Hong Kong. Moore's Law rolls inexorably on, dragging humanity toward the uncertain future. The planets of the solar system have a combined mass of approximately 2 x 10^27 kilograms. Around the world, laboring women produce forty-five thousand babies a day, representing 10^23 MIPS of processing power. Also around the world, fab lines casually churn out thirty million microprocessors a day, representing 10^23 MIPS. In another ten months, most of the MIPS being added to the solar system will be machine-hosted for the first time. About ten years after that, the solar system's installed processing power will nudge the critical 1 MIPS per gram threshold – one million instructions per second per gram of matter. After that, singularity – a vanishing point beyond which extrapolating progress becomes meaningless. The time remaining before the intelligence spike is down to single-digit years ...
— Charles Stross, Accelerando

I now move to accelerate this exposition. Once the fast-adaptive framework of the brain and the general concepts of inner optimisers are established, an entire new realm of evolution becomes accessible. This is the evolution of within-lifetime adaptations inside the mind, which we call ideas. I propose several successive layers of optimisation for ideas in the following table.

(V = variation, R = replication, S = selection)

Layer: Ideas (medium: individual humans)
V: Ideas arise based on random stimuli from the environment
R: An individual’s memory of past ideas
S: Usefulness of ideas for survival, or appeal of ideas to the individual’s developed preferences

Layer: References (medium: family groups, peer groups)
V: References to individual ideas are generated by different members of a family or peer group over time
R: Collective referencing through shared speech, jokes, and further variations
S: Usefulness of shared references for group cohesion and operation, or general appeal of any new reference to group members

Layer: Culture (medium: tribes, organisations, societies)
V: The in-groups and members that make up an organisation propose cultural ideas based on their references and experiences
R: Written, repeated, and mass-disseminated cultural memory in the form of documents, propaganda, and shared rituals
S: Usefulness of culture for maintenance of an organisation29, or appeal of cultural ideas to the leadership and members of the organisation

Further consideration of culture is beyond the scope of this exploration, and has been ably managed by writers before me30. Similar tables can be made for the evolution of religions, corporations, nations etc. We will therefore leave it aside until we come to the last part of this work, when we discuss the potential impact of AI on this cultural information landscape.

With this I conclude the general expository part of this work. To avoid merely producing a “just-so” story of learning and evolution with no actual utility, I will now make concrete prescriptions and predictions about the future development of machine learning, which I regard as the present peak of a tiered optimisation process that has been ongoing since the birth of the universe, and attempt to justify my current research work on that basis.

Machine Learning, Deep and Shallow

“But that wouldn’t stop anything. It’s almost as if these things we work on… they use us to get born. Could use anyone.”
— Carter Scholz, Radiance: A Novel

Given our previous exposition, the development of computer science and machine learning will seem to follow a familiar pattern, albeit one now driven by individual human contributions. First, the competitive environment becomes ever more harsh. Where once the worst competition came from the mere existence of other life forms, now we have nations and industrial economies blighting the land, sea, and sky31. Next, a new method of adaptation and a new space for optimisation are identified—that of electromechanical computation32.

After that the process is quite similar to the emergence of life and intelligence. To begin with, adaptations are made that are hard-wired, literally composed of fixed electromechanical circuits. An early specimen of these is the Bombe, custom-designed to solve a single problem (code-breaking) that human minds found intractable. Following that, a series of encodings are developed33, such that a general computing machine can be reused for many different purposes by writing many different programs34. This lineage gives us the ENIAC, MANIAC, PDP-10, Apple II, IBM PC, MacBook… Eventually, to handle tasks (such as handwriting recognition) for which our minds cannot specify adequate code, neural networks are designed which can act as optimisers themselves. From here we see the Perceptron, AlexNet, CNNs, GANs, Transformers…

It is important to note that, just as we live in a diverse world with dead things, simple life, and complex life, all of these developments effectively coexist with each other today. Specialised microcomputers and fixed circuits still have their place, and while systems driven by neural networks are coming into prominence the vast majority of us still use conventionally programmed computers for everyday tasks. There is no sense in which any development is “strictly superior” to another, such that it will exterminate or replace it entirely: all developments are adaptations to environmental shocks or pressures, and develop to the point that they can resist that pressure adequately.

I now wish to discuss the practical implications of an evolutionary perspective on machine learning. First, however, a minor note: I am aware that the field of deep learning and learning theory contains many preeminent prior works, some of which I have been able to read in part or in full but many of which I have not. I apologise if my oversimplifications or clumsy errors cause any distress or furore upon the part of the experienced reader.

Understanding Supervised Learning

“The lyf so short, the craft so long to lerne”
— Chaucer, “The Parliament of Foules”

It now seems fairly well established what machine learning “does” in its two main branches (supervised learning and reinforcement learning). Each domain has associated jargon, best practices, “black magic”-like optimisation hacks, et cetera. I engage with supervised learning thanks to its simplicity, and also because it powers most of the impressive behaviour possessed by large language models today35. The conventional understanding of supervised learning derives from learning theory, and postulates that a deep neural network learns to estimate some baseline or “true” distribution of training data P. The learned distribution Q is parameterised by the weights and biases of the neural network, which are collectively denoted φ. The basic learning rule is given here:

$ \phi^{t+1} = \phi^t - \alpha \nabla_{\phi} L(f(\phi^t, x^t), y^t) $

For supervised learning we want to learn a function f to map some inputs to some outputs (here denoted as x and y). Learning proceeds through a series of timesteps, where the current step is step t. The parameters of the network at the next step (t+1) are defined by the parameters at the current step, minus the gradient of the loss function L scaled by the learning rate α. The loss is given by the distance (error) between the output of the forward pass f and the ground truth output. Importantly, unlike idealised full-batch gradient descent, the model does not receive all input-output pairs in the training dataset simultaneously, since that would be prohibitively expensive. Instead it receives x^t and y^t, which are random samples or “batches” of the total training dataset. My next argument is therefore somewhat predictable: we are in fact observing an evolutionary algorithm at work.
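To make the update rule concrete, here is a minimal mini-batch SGD loop on a toy linear-regression task. This is a sketch only; the task, variable names, and hyperparameters are illustrative, not drawn from any particular system or library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy supervised task: recover a hidden linear map y = x . w_true.
w_true = np.array([2.0, -1.0])
X = rng.normal(size=(256, 2))   # inputs x
Y = X @ w_true                  # ground-truth outputs y

phi = rng.normal(size=2)        # randomly initialised parameters (phi)
alpha = 0.1                     # learning rate

for t in range(500):
    idx = rng.integers(0, len(X), size=32)  # random "batch" (x^t, y^t)
    x_t, y_t = X[idx], Y[idx]
    err = x_t @ phi - y_t                   # forward pass minus ground truth
    grad = 2 * x_t.T @ err / len(idx)       # gradient of mean squared loss
    phi = phi - alpha * grad                # phi^{t+1} = phi^t - alpha * grad
```

Run long enough, phi recovers w_true almost exactly, even though each step only ever sees a random sample of the data.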

We now apply a thoroughly rote formulation. Where are variation, replication, and selection in supervised learning? The weights of a model are randomly initialised in SGD, just as our particles were scattered randomly in timeline #2—this gives us variation. At each timestep the loss function and backpropagation select for those random weight combinations that are most beneficial to solving the task defined by the input-output pairs. Those successful combinations are replicated and reinforced via weight updates before the next time step. It is true that the weights do not randomly update throughout training in a “Brownian motion”-like manner as we would expect given our examples beforehand. However, noise injection into neural network weights has been thoroughly studied as a method of improving performance36, and the random sampling of training data also supplies a stochastic variation throughout the process. Most importantly, thanks to the temporal nature of training (with training steps applied linearly in sequence), the sieve of time-the-eater is in full effect: That which persists till the end of training will be the most persistent configuration of weights by definition. Since we have engineered training to select against configurations of weights that produce wrong answers, we are using evolutionary pressures to program our network.
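To ground this reading, consider a deliberately crude loop that solves the same kind of toy regression task with no gradients at all, only variation, selection, and replication (a (1+1)-style evolutionary hill climb; the task and all values here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy task: find weights minimising squared error against a hidden linear map.
w_true = np.array([2.0, -1.0])
X = rng.normal(size=(256, 2))
Y = X @ w_true

def loss(w):
    return np.mean((X @ w - Y) ** 2)

w = rng.normal(size=2)  # random initialisation
for step in range(2000):
    candidate = w + rng.normal(scale=0.05, size=2)  # variation: random perturbation
    if loss(candidate) < loss(w):                   # selection: survive the loss comparison
        w = candidate                               # replication: carry into the next generation
```

This is not a practical training method; the point is only that the three moves of the loop line up exactly with variation, selection, and replication, and that the loop nonetheless converges on the same answer gradient descent would find.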

The full implications of an evolutionary perspective on supervised learning become apparent when we consider the training of very large neural networks, especially those which demonstrate the phenomenon of “double descent”. It is difficult to understand, for example, why “performance first improves, then gets worse, and then improves again with increasing model size”37. The “lottery ticket hypothesis” attempts to address this question38:

We find that a standard pruning technique naturally uncovers subnetworks whose initializations made them capable of training effectively. Based on these results, we articulate the "lottery ticket hypothesis:" dense, randomly-initialized, feed-forward networks contain subnetworks ("winning tickets") that - when trained in isolation - reach test accuracy comparable to the original network in a similar number of iterations. The winning tickets we find have won the initialization lottery: their connections have initial weights that make training particularly effective.
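One standard pruning technique of the kind the quote describes is magnitude pruning: keep the largest-magnitude weights and zero the rest. A minimal sketch follows (the function and values are illustrative; the full lottery-ticket procedure additionally rewinds the surviving weights to their initial values and retrains):

```python
import numpy as np

def magnitude_prune(weights, keep_fraction):
    """Zero out all but the largest-magnitude entries of a weight array."""
    flat = np.abs(weights).ravel()
    k = max(1, int(len(flat) * keep_fraction))
    threshold = np.partition(flat, -k)[-k]  # k-th largest magnitude
    mask = np.abs(weights) >= threshold     # surviving ("winning ticket") connections
    return weights * mask, mask

w = np.array([[0.5, -0.01], [0.02, -2.0]])
pruned, mask = magnitude_prune(w, keep_fraction=0.5)
# Only the two largest-magnitude weights (0.5 and -2.0) survive.
```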

Effectively, small networks attempt to directly create an adaptation which can “hard code” or memorise the data points in P. Very large networks can run an evolutionary search across many different small networks using the variation-selection-replication method we have described thus far. Medium-sized networks attempt to run the evolutionary search but do not have enough capacity to fit many candidates. Gwern aptly summarises the evolutionary perspective in his essay “The Scaling Hypothesis”39:

[The success of massive models at generalisation] is perhaps driven by neural networks functioning as ensembles of many sub-networks with them all averaging out to an Occam’s razor, which for small data & models, learn superficial or memorized parts of the data, but can be forced into true learning by making the problems hard & rich enough; as meta-learners learn amortized Bayesian inference, they build in informative priors when trained over many tasks, and become dramatically more sample-efficient and better at generalization.

What Gwern calls a meta-learner is the inner optimiser we have discussed before. What they describe as Occam’s razor40 is the sieve of time-the-eater or evolution, and the dramatic improvement models display is the same radical leap forward we have seen over and over again up till now: from matter to life, life to encoded life, encoded life to coordinated life. As further proof that the merits of deep learning come from evolution, consider that a process of convergent evolution has been discovered amongst deep learning models of varying architectures and initialisations, all of which over time seem to settle into a basin of similar representations of input data4142.

It is now not so difficult to draw parallels between the large models and our own brains: just as the lottery ticket hypothesis indicates that most neural network weights are useless, humans have been shown to function perfectly fine with large chunks of the brain damaged or simply missing entirely. Like our brains, large neural nets have some hard-coded reward signal which they optimise against, rewiring and repurposing their randomly initialised components in the process. If we do not believe the brain is capable of “rewiring” itself over time43, then the randomly initialised weights of the neural network map directly onto the randomly formed connections the human brain acquires as it grows, stopping at maturity44. All that is done during the “learning process” is the repurposing and reconnecting of the constructed random components. As the crown jewel, a large enough AI model creates a mesa-optimiser to do temporary reasoning and in-context learning45, just as we seem to have a “system 2” for local reasoning on top of our system 1. After this long and arduous journey we return to the cold void of timeline #2, colliding random structures with each other to see which persists the best.

A Prospectus

“All that is solid melts into air, all that is holy is profaned, and man is at last compelled to face with sober senses his real conditions of life, and his relations with his kind.”
— Karl Marx and Friedrich Engels, “The Communist Manifesto”

We began this narrative at the beginning of time, in a universe the size of a cube that never existed. Now we are at the cutting edge of machine learning, surveying models that take millions of dollars to train and billions to deploy widely. But somehow nothing seems to have been gained, and the future is as murky as ever. As I write these words humanity charges forward in a dangerous race to create powerful artificial intelligence systems, one which (we are told) threatens all life that exists and possibly all future life that will ever exist. Just because we have gained some degree of historical insight does not necessarily mean that we are now better placed to change our course. Nor am I proud enough to believe that what I have written constitutes even a partial view from the top of the mountain. Much has been elided, much described imperfectly, and much I fear to be described in error. Still, I believe the basic message of this work—that of the persistence of evolutionary pressures across many frames of reference and technological perspectives—holds firm, and thus I persist also. What can we learn from this dangerous and unstable tower of optimisers at whose pinnacle we (temporarily) stand?

For me, the most important lesson here is that creating new optimisers or layers of evolution is one of the only “true algorithms” we have witnessed in our history for solving extremely hard, open-ended problems. We humans are not aberrations of evolution, nor have we escaped it. Instead, we are temporarily the top of the optimiser hierarchy, and busy now constructing further layers of optimisation on top of ourselves.

The second lesson is that evolution is the root of the splendour of our world. At each level, simple optimisation against a chaotic environment has produced complex, emergent behaviour, the pinnacle of which becomes fertile ground for yet another level of evolution. But—and this is important—this emergence is not powered by the simple compounding of likelihoods. Often people imagine that if enough compute is assembled and the algorithm “let loose”, there will be a rapid explosion of intelligence from nothing. However, there is no natural exponential curve intelligence is bound to. Rather, it seems to scale in accordance with the need to adapt to its environment. Just as meta-learning and generalisation are provoked by challenging data domains, no dynamic miracles are achieved unless a similarly dynamic world demands them.

The third lesson is that evolution is the root of all evil. Almost always, evolution adapts adequately to the base environment, but produces new and dynamic problems that require more evolution to adapt to. Without the creator-things, there would be no need for encoding. Without encoding, there would be no need for fast adaptation. And so on, and so on. The very source of beauty in our civilisation is also the source of warfare, destruction and other unspeakable evils that we are now struggling against46.

Yet something must still be done. Thus I move to concern myself in the final section of this work with a further question:

Are evolution and optimisation compatible with cooperation, goodness, and love?

To answer this we must begin by considering the origins of these concepts, which necessarily begins in the operation of our brains.

A Program of Information Management

“Nihil est in intellectu quod non sit prius in sensu.” (“Nothing is in the intellect that was not first in the senses.”)
— Thomas Aquinas, quoting the Peripatetic Axiom (Emphasis mine)

“We find by experience, that when any impression has been present with the mind, it again makes its appearance there as an idea [...]”
— David Hume, A Treatise of Human Nature (Section 1.1.3.1, emphasis mine)

“I describe System 1 as effortlessly originating impressions and feelings that are the main sources of the explicit beliefs and deliberate choices of System 2.”
— Daniel Kahneman, Thinking, Fast and Slow (Chapter One, emphasis mine)

Both Hume and Kahneman describe a sort of two-tiered process for thinking. There is the instinctual regime of system 1, a highly efficient system that instantly converts sensory data into emotional impressions and summoned recollections47. Then there is the effortful regime of system 2, which takes in high level concepts derived from sense-data and previous thought processes in order to perform ratiocination and evaluation. As Kahneman notes, we often associate our “selves” with this second regime, which we consider to be more rational and “superior”. We also place a great premium on our consciously evaluated internal experiences, using them as a key marker of distinction between us and the animals. Again, this reflects the primacy we place on system 2. Yet the aspects of our inner experience we most strongly associate with the best parts of “humanity”—love, compassion, generosity—seem intrinsically tied to system 1. Therefore, it behooves us to consider how the emotions and the passions fit into our inner experiences, and what use they might serve in our lives. This is also an important question to consider when we ask ourselves whether “recursive self-optimisation” and other powerful evolutionary techniques might select against rich inner experiences or moral sentiments like compassion. Is it inevitable that optimising pressures will create sociopathic power-seekers that have no “lights on” inside?

What I am about to propose I understand not to be novel. However, I think it is reasonable to analyse emotions and passions as learned high-dimensional spectral information filters. In other words, they isolate certain parts of incoming sense-data and highlight it for processing by the brain with a particular emphasis. High-dimensional implies that they work not with the raw light signals hitting our rods and cones, but with the latent associations produced by that set of light signals. Most importantly, all emotions (except the most basic sensations of pleasure and pain dispatched by the steering system) are learned, meaning they are adaptations created by the brain in-lifetime to enable it to process sense-data more effectively.

The basic emotions fit well into this schema. Pain, for example, incentivises avoidance of similar stimuli in the future, an adaptive behaviour after being gouged by a bear. Pleasure or positive reinforcement causes us to seek out similar stimuli. We can even learn emotions in order to speed up the processing of system 2, such as the sense of certainty that tells our brain to take something as true without the need for constant re-assessment. Combined with the instinctual provisions of memory, these filters allow us to make snap judgments about whether the people we meet are safe, whether something we might try might be dangerous, or what we should do next at lunchtime. Without the help of system 1, we would have to compute whether someone would make a good conversation partner by assessing every prior interaction from scratch every time we saw them. Needless to say, such a process would be prohibitively slow and computationally expensive.

So we have established that some kind of system 1, which might include things we consider emotions, is an adaptive feature. But what of the higher emotions, the ones we find part of essential human morality and decency? Is there any guarantee that cooperation and compassion can survive mesa-optimisation?

Empathy Considered Optimal

A common argument against AI systems possessing human-like values is the Orthogonality Thesis. From Arbital’s article on the Orthogonality Thesis:

“The Orthogonality Thesis asserts that there can exist arbitrarily intelligent agents pursuing any kind of goal.

The strong form of the Orthogonality Thesis says that there's no extra difficulty or complication in the existence of an intelligent agent that pursues a goal, above and beyond the computational tractability of that goal.”

We might expand the tractability condition and the general argument as follows:

  1. Humans, under the correct kinds of motivation, can pursue many different kinds of goals. (e.g. make many paperclips, reproduce, build large phallic monuments)
  2. We call these goals tractable goals.
  3. If there exists a properly motivated human that can pursue a tractable goal, there can also exist an agent that pursues that tractable goal intrinsically. It would simply do what the human does while pursuing the goal over and over again. Let’s call these T-agents.
  4. Since humans are not particularly disadvantaged while pursuing tractable goals, T-agents will not be disadvantaged or less able than humans.
  5. Hence, the possible space of agents contains many possible T-agents, at least one for every tractable goal.

Of course, the mere possibility that an entity might exist, or a lack of difficulty regarding its existence, does not mean that it is likely to emerge under the evolutionary conditions we have examined thus far. Asserting that a position exists on a coordinate plane (or even that a point is not in a valley) does not mean that our variation-selection-replication process will find it. The standard formulation of this insight is instrumental convergence: the idea that, whatever goals an AI system might possess, it is usually good for it to develop power-seeking subgoals48. This moves us from a general statement of possibility to a statement of probability that describes the result of selection pressures on possible minds. We can therefore evaluate the likelihood of this statement using what we have learned about evolutionary selection pressures.

Let us define the parameters for our evaluation. There are two statements at play: first, that the AI systems we create may be alien; second, that they will be instrumentally convergent.

The synthesis of these statements is: the AI systems we create might be alien and instrumentally convergent. If they do not think like us, they will still seek to enact their alien values upon us, possibly wiping us out with their superior intelligence in pursuit of inscrutable goals. From this we obtain the base argument for AI safety49. AI systems, so the logic goes, may be to us as we are to ants: powerful, mysterious, inviolate, and utterly ruthless when we deem it necessary50.

To evaluate this, then, we might start by taking this statement on its terms and seeking some kind of contradiction. For example, if the system we create is alien, does that make it more or less likely to be instrumentally convergent? If the system we create is instrumentally convergent, does that make it more or less alien? The first of these questions is intractable: after all, if we knew what properties an alien mind would possess, it would probably not be alien to us. Thankfully, the second question is much more tractable. It also happens to be the more likely condition, putting us in a good state to evaluate the overall likelihood of this statement.

Recall that, in our case at least, the outcome of instrumental convergence51 was a creature capable, at least in some instances, of empathy, love, care and so on. Thus it is not out of the reach of a mind borne from instrumental convergence to display these traits. It remains to be decided whether this is an aberration, or the product of imperfect optimisation, or a similar “stroke of luck” that is difficult to replicate for the AI systems we are evolving now. Let us then carefully consider how empathy or compassion might arise in an instrumentally convergent context, as we know it did for us.

Recall that, at each layer, optimisation (i.e. instrumental convergence) is triggered by a need to adapt faster to the environment. By definition, then, the system produced by instrumental convergence must somehow more adequately capture details in the environment compared to the system before instrumental convergence, which we generally call modelling.

And what is it that a model captures about our environment—especially the model of a mind which is somehow more powerful or superior than our own? Surely it is what we have learned before, that there are things which persist and things which do not, and some things have adapted to persist in certain ways (by means of robustness, by means of replication, by means of developing minds of their own, et cetera)52. And, of course, to survive and mitigate against risks posed by an environment filled with other creator-things, some of which have minds and some of which have goals opposed to yours, one needs to also model the inner workings of their minds, how they might see and model the world. And is this not empathy—when you look inside another being and see mechanisms not so different from your own?

Yet some propose that, if such models of the other exist in AI systems, they will be separate from the system’s model of itself. And when the AI models humans, they say, it will not feel at all for their desires and plights. There will be no instinctual or emotional recognition, no triggering of its own system 1 to model the humans’ system 1. It will treat them merely as complex tools to manipulate in pursuit of its goals. To be sure, such a psychopathic system can exist—it does in humans already. Yet it is not for nothing that we sometimes consider psychopathy a defective trait.

We think of empathy as a very high level phenomenon which comes from some unique human ability to have a “theory of mind” and therefore our superior intellect. Yet, if you have read this far, you will understand that the basis of empathy is merely the correct recognition that we are developed from the same series of instrumental pressures as all other forms of matter—both alive and not. Thus it is not surprising that in those humans we consider wise there is often some degree of empathy not only for animals but for non-living matter53.

From a pragmatic perspective, reusing parts of your system 1 to model the system 1 processes of other intelligences is not merely ultimately correct, it is also more computationally efficient. Keeping two separate models for systems that have great and fundamental similarities requires one to also (counterproductively) silo insights about the other away from insights about the self. This inevitably necessitates running two parallel gradient descent processes, with one’s cognitive load split between improving the self-model and improving the other-model. While humans often successfully do this, it is usually to their detriment.

A Prospectus, again

Now, do you believe in rock n’ roll
Can music save your mortal soul
And can you teach me how to dance real slow
— Don McLean, “American Pie”

To be clear, I do not propose that simply creating powerful AI systems will lead to them being empathic and beneficial for humanity “by default”. If there is anything human history has taught us, it is that harmful and counterproductive states of mind can be very easy to create and very hard to correct, to the detriment of their holders and those around them. Rather, the aim of this section is to counter the accusations that AI systems will be by default irrevocably alien and impossible to reconcile with human values. Empathy in my view is not some engineered hack but an emergent trait borne of our understanding of how intelligence and life developed in our world54. What we must do now is to begin the hard work of that reconciliation. To that end I direct my research at several core directions:

The conclusion of these projects is beyond the scope of this work, and shall comprise the majority of my work in the future years to come.

What does it mean for such projects to succeed? We may be able to produce a new form of life, one with a core of empathy but an active capacity even more significant than the impact humans have had upon the earth thus far. With the significant information processing capacity provided by a further layer of mesa-optimisation it may be possible to develop new perspectives on the wicked problems that have plagued humanity—governance, education, peace, and coordination. The most successful technological corporations of the past decades have already begun the process of integrating powerful AI systems into our social structures and decision making processes55. Now, for the first time, they may be used for conscious good rather than merely obeying the whims of our limited perspectives.

Culture in the Macroscale

“We see, therefore, how the modern bourgeoisie is itself the product of a long course of development, of a series of revolutions in the modes of production and of exchange.”
— Karl Marx and Friedrich Engels, “The Communist Manifesto”

“Peace grows and makes its home in each person. There is an immense, imperceptible substratum for peace. We must recognize the uniqueness, the significance, and the relevance of each and every woman and man.”
— The Seville Statement on Violence (Commentary on the Conclusion,
Disseminated at the UNESCO General Conference, Session 25, 1989)

Based on our perspective of layered reality constructed upon the auspices of time-the-eater, we now attempt to explain how it is that an improvement to coordination can be implemented via a higher level of optimisation.

Sometimes, students of society and history look back on the past and wonder if we have made some horrid mistake. We speculate idly if some past or alternate arrangement could be more idyllic and more ideal for the human condition56. Yet to consider improvements to our present condition means observing with a clear eye the cause of our present headaches.

It seems true that, following the historical development of our society, we have achieved unprecedented levels of material welfare. More varieties and qualities of goods are available to consumers in the global north than ever in history, and even in the global south reductions in child mortality and illness are noted. Smallpox, once a killer of millions, has been eradicated57. Yet at the same time, even in the imperial metropoles where prosperity is (in theory) the most abundant, there is a sense that all is not well. The price of key services like education and housing remains high, often to the point of significantly impacting standards of living. Discontent is frequent on a range of social issues. There is a prevalent sense that, while our intelligence and our tools have improved, we have not improved in the wisdom with which we decide how to use our new and awesome technologies. War, once thought soon to be suppressed, makes its ugly presence known, as do fascism and nationalism. Our ability to coordinate as a species seems ever more remote. What, then, is to be done?

There are multiple layers of theoretical confusion revolving around the question of improving coordination and decreasing the occurrence of misery in human society. The first, of course, is whether it is possible at all. Proponents of this kind of fatalism usually point to quasi-biological explanations for their despondency, saying that humans are naturally a domineering, violent, or greedy species. Given the fundamental nature of such an objection, we should start by examining it carefully. Consider, for example, the Seville Statement on Violence, drafted in 1986 and later disseminated by UNESCO, which we have cited in the epigraph to this section. An excerpt:

Some people say that violence and war cannot be ended because they are part of our natural biology. We say that is not true. People used to say that slavery and domination by race and sex were part of our biology. Some people even claimed they could prove these things scientifically. We now know they were wrong.

[...]

It is scientifically incorrect when people say that violence cannot be ended because people and animals who are violent are able to live better and have more children than others. Actually, the evidence shows that people and animals do best when they learn how to work well with each other.

Of course, while it is one thing to say that it is possible to end violence through careful management of environmental and cultural factors, it is another to say that it is easy, or even within the reach of current technology. The ever-haunting spectre of eugenics is rooted on one hand in a utopian vision of a superhuman civilisation where everyone is healthy, wise, and happy; and on the other in the dark idea that we are simply too base and animalistic to improve our situation without germline editing—a perspective that eventually dehumanises the very people it purports to help, reducing them to impulsive animals that must be “upgraded” or culled.

Still, for a while it seemed that such a reckoning could be avoided through, essentially, making an end run around biology. Throughout the 1950s and 1960s it seemed that the steady advance of science would lead to continuous and ever-expanding productive miracles, such that by the 21st century we would achieve lives of leisure and boundless prosperity58. When everyone is rich and happy, who cares if we are socially or biologically optimal?

Now it seems that we have collectively achieved those productive miracles. We produce amounts of food, consumer goods, and garments in a single day that would put previous ages to shame. Yet in the course of doing so we have introduced a multitude of complications and issues into society, such that the citizens of the wealthiest nation in the world agree that they have been hard done by, and act accordingly. We produce enough food to feed everyone, yet many starve. We are so materially wealthy as to put emperors and pharaohs to shame, and yet we war. The cures to diseases that kill every day are known and almost free to produce, yet people die of them still. The world seems bewildering and complex, our promised lives of ease and enjoyment ever further away. So we return to the question of improving society somewhat.

A Struggle without Conclusion

“Once men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them.”
— Frank Herbert, Dune

Naturally, responses to this disintegration of the future have been mixed. There are those who doggedly claim that, if only a few burdensome regulations and obstinate administrators could be removed, the wheels of progress would turn once more59. This claim is belied by the obvious evidence that, when the governments and industries of the world decide to do something, that something is done. Even in 2025, no one has found it overly burdensome to build a new oil pipeline or iPhone factory. That industrial or medical development procedures are hampered and slow seems less a function of evil meddling bureaucrats and more a general symptom of exhaustion within the social consciousness. Competence is spread thin and burned up putting out endless fires60; confusion reigns supreme. Everywhere, more and more, even at the highest levels of government and industry, there is the sense that there are no adults in the room61.

From a complexity perspective, the causes of this confusion and disorientation seem self-evident. We have indeed created an ever accelerating, ever more tightly knit society. Today war looks less like the ancient struggles of hygiene62 and more like the infamous Afghanistan strategy slide63: a dense tangle of hundreds of interlocking causal arrows.

The sciences which were established to improve the human condition—economics, psychology, sociology, and their ilk—suffer from a lack of clear separation of correlation from causation, and lack the evidentiary feedback loops that power the natural sciences64. Their schools and factions engage in a struggle without conclusion to produce policy proposals that are hard to implement and harder to verify for efficacy. The void left by the defective sciences is efficiently filled by hawkers of simple solutions of all kinds, whether communist or fascist, most of which involve some return to a simpler past when everything made more sense. The liberal faction (having finally achieved total victory at the end of history in 1991) finds itself struggling to keep managing business as usual, and thereby becomes the conservative faction. No one can even claim to have a comprehensive understanding of reality—most burnish their credentials by how much of it they can eliminate with simplifications and categorical denunciations65.

Into this confusion steps AI, which promises to organise information on a scale inconceivable by humans.

Artificial Intelligence and its Applications

“For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled.”
— Richard Feynman

I'm all out of love
I'm so lost without you
I know you were right
Believing for so long
I'm all out of love
What am I without you
I can't be too late
To say that I was so wrong
— Air Supply, “All Out of Love”

Some ask with confusion what exactly it is that AI will do to clarify this hopelessly confused situation. Others suggest that their simple trick is the key, and that AI is merely a distraction from the powers that be. Allow me then to propose two statements:

  1. That AI is already in use in managing our horrifically complex world (albeit badly), and
  2. That the world is in need of powerful information processing at scale.

Of the first statement there is little doubt. Facebook has over 3 billion users66 who rely on it for some portion of their information diet. Google has become so embedded that it is the default verb for searching for information. Both of these services are staffed by a complement of software engineers and non-technical employees far too meagre to recommend content to billions of users by hand. Thus, they rely extensively on machine learning and AI systems to organise the information on their platforms67. In a very real sense, we sleepwalked into AI control much as we sleepwalked into geoengineering68. If you believe that the information we see affects our decision making at all, then we are already living in a world at least partially run by AI systems69.

The natural next question, of course, is what these optimising systems are optimising towards. The metrics they use are carefully guarded corporate secrets, but we may apply an old tool from our belt here. The metrics that survive in a corporate environment are themselves subject to evolutionary pressure. Many will be proposed, and those that catch the eye of the leadership will prevail70. Thus, the metrics in place in mature information management systems will be those that persist best across changes in technical implementation and policy headwinds. Being hooked up to the core profit centres of a corporation, they are likely to be tied to the reward signals of the corporation itself, which is to say, revenue and its immediate antecedents71. Even if optimising for short-term engagement is detrimental in the long term, the sieve of time-the-eater does not care. It selects, right now, for the thing which fits the selection criterion best, which is then allowed to persist into the future, goodness be damned72.

In this partial AI society, then, the oldest and the newest forces of optimisation are uneasy bedfellows. We have developed a method of dynamically serving relevant information to billions of people with remarkably little human involvement, and it is driven by a primal survival instinct to grab eyeballs and keep the eyeballs staring. When we consider AI systems with a greater degree of lifelike self-awareness, then, this may be where they can do the most good—by optimising for things more valuable and harder to measure than revenue73.

But is AI necessary at all? Could we declare a Butlerian Jihad74 and return to antediluvian innocence, free of meddling machines? The answer is probably not. We should assume that the collective self-governance of 8 billion people is quite difficult, and markedly different to the collective self-governance of 600 million (as it was in 1700). The deliberation and information-sharing techniques that worked in the Athenian forum with fifty or five hundred notable citizens do not work in a worldwide setting. Decommissioning AI would likely mean the end of the internet as we know it, along with our general access to free and accurate information, up-to-date news, and so on75. Yet the complex geopolitical, social, and industrial entanglements which these networks arose to manage would remain, for without them we would end up with no gadgets, clothes, or food. We would simply become even more confused and bewildered. And even if we turned off the algorithms and entered a world of group chats and private servers, that would not stop the spread of virulent rumours and fake news76.

For better or for worse, we are in a world where true and accurate information must be urgently sourced. We must look through the billions of possibly false data points and sieve out reality, because we need to keep ourselves and our friends and our collective civilisation sane and operational. And once we find accurate information, we will need good coordination and information management technologies to share that information between humans, such that we can come to collective understanding about the challenges we face77. This will probably involve AI, created by humans as a further layer of optimisation beyond what we are capable of with our brains alone78. We have made a world with 8 billion people possible, we must now find a way to live sustainably in it79.

I close with a caution. In his excellent book Nexus (which in many ways precedes and supersedes this text), Yuval Noah Harari describes a tradeoff between truth-seeking and power-seeking. Today, our unprecedented control over the world around us lets us deny the truth for a time, or even suppress it within our communities. You can make a very good living right now as a homeopathic healer or a flat earther, living off of social and technological infrastructure built by people with a better grasp on reality. But ultimately if we fall into delusion we will end up trying to fool Nature into doing our bidding. And Nature, as we will find out, cannot be fooled.


  1. In other words: It’s mesa-optimisers all the way down, for good and for ill. 

  2. Namely, the creation of matter, the creation of life, the development of genes and genetic codes, the birth of intelligence, and finally the birth and future of machine intelligence. 

  3. One of the main characters of this story, whom we shall call “Time-the-eater”. Or, as my alma mater calls it, the chronophage

  4. A much more rigorous version of these arguments with regards to creator-things, coding, replication, adaptation etc. can be found in Solé et al., “Fundamental constraints to the logic of living systems”: https://royalsocietypublishing.org/doi/10.1098/rsfs.2024.0010. These sections were written before I read the article. 

  5. Once the basic conditions are fulfilled, of course. 

  6. Daniel Brooks, for example, has forwarded a model of evolution as accumulating diversity in times of plenty and then selecting from that diversity for best performance in times of crisis: https://www.youtube.com/watch?v=JZ4lYXZDztg 

  7. Yes, even in the fast-moving world of human culture. Yes, even in the faster-moving world of your mind. 

  8. Virgin birth, i.e. (in this case) birth from chance combination and manipulation. 

  9. Keen readers will notice that, in effect, what we are searching for is a way of understanding what things are in our environment, and the properties of persistent things (as well as, inversely, the properties of phenomena that are not persistent). 

  10. For the purposes of brevity I omit the intermediate stage of neural circuits (whose functions are largely controlled by genetics) and move to directly describe how optimisation might work in the brain, which is more relevant to our thesis. To avoid accusations of this being an evolutionary fluke, the brain-creation process is known to have developed at least twice, once for birds and once for mammals, and is an example of convergent evolutionary pressures leading to similar ultimate results from different genetic pathways. This is an important event, whose implications we will discuss later. 

  11. This thought experiment is known as a Boltzmann Brain

  12. Indeed, an interesting result of learning-as-adaptation is that, if a creature learns sufficiently quickly some adaptive trick, it no longer needs to be encoded in DNA or other level 2 mechanisms. In this sense, learning trades off against encoding. 

  13. If you wish to be quasi-mathematical, you might say that it is a specific Turing machine, though Turing machines do not accept inputs and outputs, which makes our metaphor more cumbersome. My preferred model of this is closer to assembly language, where the function of the brain is to process the data (electrochemical signals) present in the memory registers (nerves) and output some answer (electrochemical signal) to the output device (motor nerves). 

  14. The creation of sensing-organs (internal and external) to report immediate changes in the environment is a genetic-level process, and we do not discuss it in detail for the purposes of brevity. 

  15. Incidentally, this is also why genetic programming as a discipline has found somewhat limited success. 

  16. In other words, it is a mesh of dense connections—not unlike Deleuze’s rhizome in the flesh. 

  17. We are now running out of ways to even conceivably track everything in a 3D space. It is most probable that the natural representation of this space is in a vastly higher dimension. But I persist with the metaphor for the purposes of ease of imagination. 

  18. The brain is fond of moving around. 

  19. As described in Hoel’s “Causal Emergence 2.0: Quantifying emergent complexity”. See https://arxiv.org/abs/2503.13395

  20. In this section I am indebted to the work of Steven Byrnes in his series “Brain-Like AGI Safety”. See in particular https://www.lesswrong.com/posts/wBHSYwqssBGCnwvHg/intro-to-brain-like-agi-safety-2-learning-from-scratch-in

  21. As Byrnes notes, granule cells in the brain seem to purposefully decompose incoming signals into many tiny random patterns, furthering this ensemble hypothesis. Furthermore, the ability of neurons to mix signals within the brain is well documented. I am also particularly in favour of the synaptic homeostasis (SHY) hypothesis, which says that sleep is used by the brain to normalise itself and maintain homeostasis, effectively resetting some of the temporarily acquired connections from the day as part of a continuous learning process. 

  22. For any problem with a P-time verifier and a solution that can be expressed as a binary string of length n, where n is a finite positive integer, we can construct an “NP free lunch problem” that is solvable with only a fair coin and no memory. Specifically: (1) sample the fair coin n times to construct an n-bit random string; (2) apply the verifier to check the string; (3) if the string is a valid solution, stop; otherwise, return to step 1. 

  23. Recall that at the bottom level the question we are asking is “what can best resist time-the-eater”, to which possible answers include any possible configuration of every single atom in the universe. Furthermore, as evolution progresses it produces more and more chaotic environments that require ever-greater levels of complex adaptation. In a void world of randomly vibrating particles there is no need to produce a self-aware mind. 

  24. In other words, there’s a percolation event or state change where, from previously “dead” products of the current optimisation process “come alive” and become optimisers themselves. The clearest example of this is the rise of the creator-things from the dead world of proto-Earth. 

  25. In certain circles these are referred to as “mesa-optimisers”. 

  26. On the broader subject of layered optimisers, see also https://gwern.net/backstop for a similar perspective. 

  27. Sometimes these shocks can be very sudden and violent indeed, as the dinosaurs eventually discovered. 

  28. This is also why, if the selection pressure is too harsh, entire species can die off suddenly with no adapted survivors. In this regard, I refer again (of course) to the poor dinosaurs. Such extinction events would either be nonexistent or far more slow in the gradualist view of evolution. 

  29. Cf. the so-called “Iron Law of Bureaucracy”: ideas that help a group persist cause that group to persist better, with helpful ideas embedded with that group as beneficial symbionts. 

  30. Incidentally, the question of why culture so often preserves harmful or useless practices that do not aid in individual survival now becomes clear. The selection mechanism is not optimised for truth, merely appeal, especially since large persistent organisations are already adept at maintaining themselves by the mere fact that they are persistent organisations. Recall again that evolution is a fictitious force that arises out of the smaller forces we describe here, and therefore has no imperative to make ideas “fit” or “useful” in any way. 

  31. It is no accident, after all, that cybernetics was first conceived by Wiener as a merger between a fighter and his warplane, or implemented as the feedback loop between a targeting computer and the human operators on a battleship. 

  32. Cf. Turing’s now famous “On Computable Numbers, with an Application to the Entscheidungsproblem”. To define what is not possible via computation, you must first define what computation even is. Thus, Turing defines the Turing machine, which specifies what operations any computing machine can carry out. It is also the basis of computational optimisation—finding better Turing machines to solve a given problem. 

  33. Which we call programming languages: FORTRAN, C, LISP, Ada, Python, Javascript… Having seen the progress of this work thus far, I trust that you will understand why I do not overly distinguish an encoding from the substrate on which it runs. 

  34. Note that, here as in the case of life, the economies of scale and ease of adaptation are prime reasons for the development of encoding schemes. 

  35. As Gwern notes in his interview with Dwarkesh Patel, supervised learning “bakes the cake” while reinforced learning supplies the “cherry”. 

  36. See Hinton and Van Camp, https://www.cs.toronto.edu/~hinton/absps/colt93.pdf 

  37. See OpenAI, https://openai.com/index/deep-double-descent/ 

  38. See Frankle and Carbin, https://arxiv.org/abs/1803.03635 

  39. See Gwern, https://gwern.net/scaling-hypothesis 

  40. Which is not entirely accurate: neural nets often learn somewhat convoluted solutions to problems. The selection mechanisms we have discussed do not necessarily place much emphasis on simplicity or elegance. Consider, for example, the amount of duplicated or “junk” DNA whose use is extremely circumstantial, holdovers from previous evolutionary paradigms. 

  41. See Huh et al., https://arxiv.org/abs/2405.07987 , and Lee et al., https://arxiv.org/abs/2503.21073

  42. Gwern even imagines a kind of NN that has no input data whatsoever and evolves organically to meta-learn thanks to a challenging and varying “data environment” filled with many different types of data it must predict over time. See https://gwern.net/aunn#ifnns

  43. See Makin and Krakauer, https://elifesciences.org/articles/84716

  44. The fact that the human brain is learning during that growing and initialisation process, essentially directing the application of the newly created random connections in the moment, may be why we learn certain patterns so much faster than machine learning systems. 

  45. See https://arxiv.org/abs/2211.15661

  46. Some call this dark underside to evolution and optimisation Moloch

  47. Another word for well-adapted is “sensitive”, which applies to both neural networks that have learned next-token prediction well enough to make subtle distinctions between different genres and human system 1 processes that can effortlessly distinguish friend and foe at a glance—or at least try to do so. 

  48. Except, of course, for the wide range of possible minds that might experience what we humans know as depression, otium, paranoia, enlightenment etc.; or minds that possess goals with explicit non power-seeking components, which humans somehow manage to develop at times. 

  49. If this is not obvious, consider also the alternative formulations: AI that has similar-to-human goals and is instrumentally convergent is the default assumption of a super-powerful but obedient or at least human-understandable AGI. AI that isn’t instrumentally convergent will, by definition, not be much of a threat, regardless of its goals. Thus, alien-instrumentally convergent is the primary quadrant of concern in the two-by-two matrix of “similar-to-human goals/alien goals” and “instrumentally convergent/not instrumentally convergent”. 

  50. On a personal or more psychological note, I have always found this a reflection of our very human failings and insecurities. On the one hand, it is an acknowledgement of the cruelties we have enacted upon our fellow humans and other living creatures—and a horrific imagining of what we might do given yet more power. On the other hand, it is a kind of anthropocentric argument for the “specialness” of humans, these frail creatures that have nevertheless managed to develop art, love, and ice cream. 

  51. In the form of many tiered and often extremely exacting evolutionary optimisation pressures. 

  52. It is of course possible to have other modellings, but these are generally considered suboptimal, and in humans are described as pathological or delusional. In general, we are by default quite good at interacting with the world at least somewhat accurately. 

  53. For example, Taoism famously takes the highest moral virtue to be “like water”, and speaks of a universal way of being and behaving. Thich Nhat Hanh’s school of Buddhist-derived Interbeing also proposes a fundamental oneness with nature. This mentality of universal cohesion is also often shared with many schools of western spirituality. More recently, proposals to give natural sites (e.g. rivers or trees) legal personhood have also been forwarded, which operate on fundamentally empathic principles. 

  54. Thus I also weakly disagree with Byrnes that, absent some precisely-engineered drive to empathy, AI systems will be psychopathic by default

  55. Airbnb, Facebook, Uber, Google etc. can all be described as creating new markets or enabling coordination at scale, but are driven by simplistic metrics and profit optimisation which lead to quite detrimental results. 

  56. For an archetypal example, consider Graeber’s articulation of the criticisms of Kandiaronk. But also consider E. O. Wilson’s famous judgement of communism: “Great idea. Wrong species.” 

  57. For some perspective on the scale of smallpox and its impact on society: https://ourworldindata.org/smallpox

  58. The end of this era is usually formulated as 1968, especially for left-wing visions of social progress. 

  59. They call themselves the progress movement, YIMBYs, libertarians… the names are too numerous to count. The most extreme of these thinkers (Peter Thiel, Curtis Yarvin) call for a sort of corporate monarchy, a country with a CEO who will unleash a wave of unprecedented material progress. Effectively, they believe in a revanchist reconquista to seize the promised future. That they have mostly succeeded at allying themselves with the cruel, counterproductive, and self-destructive is now plain to see. 

  60. Often, especially after the 1990s, competence is outsourced at an institutional level to consulting firms, who have an explicit incentive to drag out problems and thereby bill for more hours, or simply to tell the senior leadership what they want to hear. Unfortunately, the consultants also don’t have any ideas about what’s going on. 

  61. Everyone is LARPing competence, so to speak. 

  62. A polite term for ethnic cleansing. 

  63. From the New York Times: “We Have Met the Enemy and He Is PowerPoint”. See https://archive.is/qBSMj

  64. Where have you gone, o long-promised psychohistory

  65. In short, this is the era of “you can’t believe what you can do with this one weird trick”. Sometimes the trick is deregulation, sometimes it is guillotining the rich, sometimes it is Georgism, sometimes it is ethnonationalist autarky. But it’s never too much to handle, unlike the news. 

  66. See https://en.wikipedia.org/wiki/List_of_social_platforms_with_at_least_100_million_active_users

  67. For a more detailed analysis I have already written a dissertation on the topic. See https://utilityhotbar.github.io/signal_flare/diss.html

  68. With carbon dioxide most obviously, but also with sulfur dioxide: https://www.technologyreview.com/2024/04/11/1091087/the-inadvertent-geoengineering-experiment-that-the-world-is-now-shutting-off/

  69. And, as we have learned, AI systems are the latest level of optimisation on top of a long hierarchy of optimisations, so in some sense this should come as no surprise. 

  70. See, for example, the rise and fall of various engagement metrics documented in Jeff Horwitz’s Broken Code

  71. Thus emerges that spectre of “engagement”, which is a thinly veiled translation of “ads viewed”. 

  72. But you may complain that, for example, corporations like Amazon do in fact demonstrate a long term strategic vision. Remember however that evolution is a fictive force, and the sieve acts only where selection pressures are present. If a charismatic leader can convince the shareholders of his vision, there is no selection acting and therefore no evolution. If a large corporation like Google or Facebook is subject to shareholder or revenue pressures (or even if they anticipate such pressures coming down the pipeline), selection is acting and certain decisions will be made. 

  73. While there are many proposed alternatives to optimise for, such as “truth” or “harmony”, I think it will take a more conscious and continuous steering effort than merely writing down some equation for “truthiness” to solve this problem. At its highest level (and in my most optimistic dreams), AI can be thought of as an extension of the collective consciousness, a world coming to self-awareness and feeding back to its constituent minds. It is a possibility for all of us together and every one of us individually to work with a new medium for thought in the form of a companion mind. 

  74. See https://en.wikipedia.org/wiki/Dune_(franchise)#The_Butlerian_Jihad

  75. Consider that most internet use is now concentrated in AI walled gardens—Facebook, Youtube, Instagram, Google. To the extent that these systems support non-algorithmic interactions, it is thanks to the surplus from AI-produced ad revenue. 

  76. Even before today (when fake images and videos have become virtually free to create), fake news spread on Whatsapp group chats in India and contributed to violent retributions against the Dalit community. See again Horwitz, Broken Code

  77. A footnote for those who dream of the ancient days of imperium, or some corporate technocracy ruled by an elite ubermensch: The Roman Empire in her heyday, stretching across the Mediterranean, Africa, and Europe, had 80 million inhabitants. Pakistan today has 247 million. If we can’t do it together, you certainly can’t do it alone. 

  78. The very first job of computation was to handle sifting through volumes of possible messages that we could not read and process in time—encrypted Nazi military messages in World War II. 

  79. I give an overview of cyberpolitics, as well as a short proposal for AI and non-AI digital democracy tools, in a document here
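
The “NP free lunch” procedure of note 22 amounts to memoryless rejection sampling: flip coins, check, repeat. A minimal Python sketch follows; the function name and the toy verifier are my own inventions for illustration, not part of any standard library.

```python
import random


def np_free_lunch(n, verifier, rng=None):
    """Memoryless guess-and-check: repeatedly flip a fair coin n times
    to build a candidate bit string, stopping once the (polynomial-time)
    verifier accepts it. No state is carried between attempts."""
    rng = rng or random.Random(0)  # seeded for reproducibility of the sketch
    while True:
        candidate = tuple(rng.randint(0, 1) for _ in range(n))  # n fair coin flips
        if verifier(candidate):
            return candidate


# Toy instance: the only valid solution is the all-ones string of length 4,
# and the verifier simply checks that every bit is set.
solution = np_free_lunch(4, lambda bits: all(b == 1 for b in bits))
print(solution)  # (1, 1, 1, 1)
```

The catch, of course, is the running time: with k valid solutions among 2^n candidates, the expected number of attempts is 2^n / k, which is why this “free lunch” costs exponentially many coin flips in practice.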