The Value of Art in the Age of Generative AI (2025-12-26)

"To see something as art requires something the eye cannot descry—an atmosphere of artistic theory, a knowledge of the history of art."
— Arthur Danto

Why do we find certain works of art valuable? The usual answers (beauty, originality, emotional impact) are true, but incomplete. Beneath them lies a quieter mechanism, one that often operates unconsciously. When I look at an artwork, I imagine myself in the place of its creator and ask a simple question: what if I had to make this? How much effort would it take? How many years of training, how many failed attempts, how many deliberate choices?

We value art by running the thought experiment of creating it ourselves. This is not about the artist’s actual labor, but about perceived difficulty. Value emerges from a counterfactual simulation of how costly it would have been for me to produce. There is a second ingredient: art feels valuable when it seems unlikely, when it feels like something that could not easily have existed. Given the tools and resources available, what is the probability that this particular artifact would come into being? This probability depends critically on the tools available at the time of creation. Some pieces hang in museums because they were unexpected and hard to produce a thousand years ago, when the available tools were primitive. The same piece, if made a hundred years ago with more advanced tools, would carry far less artistic value. Value comes, in short, from how hard it feels to make and how unlikely it seems to exist given the technological context of its creation.

These intuitions can be made explicit in a rough model of artistic value. Let $A$ denote an artwork, $T$ the available toolset (paintbrushes, cameras, or AI models), $L$ the perceived labor or effort if human-made, and $N_T$ the expected number of artworks producible by $T$.

The model has two components:

\[V(A) = \lambda f(L) + \mu \big[-\log P(A \mid T) - \log N_T\big]\]

The Effort Component: $f(L)$ captures how we value perceived difficulty. This resonates with labor theory (Marx, 1867), but it is fundamentally subjective: a cognitive simulation of the creator experience, rather than actual labor cost (Kant, 1790). When we encounter an artwork, we mentally reconstruct the steps required to produce it: the years of training, the failed attempts, the deliberate choices. The function $f$ might scale as $L^\alpha$ for some $\alpha$, reflecting how human valuation responds to perceived complexity and mastery. This counterfactual effort estimation is what gives art its perceived value, independent of the artist’s actual labor.

The Rarity Component: $-\log P(A \mid T) - \log N_T$ borrows from information theory, but connects more deeply to algorithmic information theory. The term $-\log P(A \mid T)$ measures creative surprise: how improbable the artwork is given the tools. In algorithmic complexity terms, this relates to Kolmogorov complexity. Low-probability outputs, those requiring high complexity to generate given the model, stand out as surprising and thus valuable. But this is diminished by $-\log N_T$, which accounts for the abundance of possible outputs. When everyone can generate endless art, uniqueness is lost. This is what Walter Benjamin called the aura: that quality of uniqueness and embeddedness in time that diminishes when reproduction becomes trivial (Benjamin, 1935). Even if technique or content is aesthetically compelling, the aura collapses when the space of possible outputs becomes effectively infinite.

Here is where AI infrastructure investment matters. Consider the massive capital poured into training models: data centers, compute clusters, engineering teams, and energy costs. This investment $I_T$ is amortized across all possible outputs. If a toolset can produce $N_T$ artworks, the per-artifact infrastructure cost approaches $I_T / N_T$. For generative AI, $N_T$ is effectively infinite, since the model can produce countless variations. As $N_T \to \infty$, the amortized cost per artifact $\to 0$, and $-\log N_T \to -\infty$, driving the rarity component toward negative infinity.
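To see the collapse numerically, here is a minimal Python sketch of the two-component model. The functional form $f(L) = L^\alpha$ and every constant below are illustrative assumptions rather than part of the argument; the point is only how the $-\log N_T$ term behaves as the output space explodes.

import math

def artistic_value(L, p_given_T, N_T, alpha=0.5, lam=1.0, mu=1.0):
    """Two-component value model: effort term plus rarity term.

    L          -- perceived labor if the piece were human-made
    p_given_T  -- probability of this artifact given the toolset T
    N_T        -- number of artworks the toolset can plausibly produce
    """
    effort = lam * (L ** alpha)                          # f(L) = L^alpha (assumed form)
    rarity = mu * (-math.log(p_given_T) - math.log(N_T))
    return effort + rarity

# A hand-made painting: years of labor, improbable, few comparable outputs.
print(artistic_value(L=5000, p_given_T=1e-6, N_T=1e3))   # roughly 77.6

# A prompted image: minutes of labor, and the tool can emit endless variants,
# so the -log N_T term drags the rarity component far below zero.
print(artistic_value(L=0.1, p_given_T=1e-6, N_T=1e12))   # roughly -13.5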

This framework brings rigor to artistic valuation through counterfactual effort and generative rarity. Philosophical accounts acknowledge that artistic value isn’t reducible to monetary or labor inputs; it also involves aesthetic, moral, and expressive dimensions. But by formalizing the cognitive mechanisms of valuation, we can see why certain works feel valuable and others do not. The framework explains much of traditional art valuation, but it breaks when confronted with generative AI. The perceived effort collapses to near zero (often just a prompt), and the rarity term collapses as $N_T$ becomes effectively infinite.

Generative models undermine both pillars of value. The perceived effort is minimal, often no more than a prompt, and the rarity evaporates as the space of possible outputs grows effectively infinite. From this perspective, AI-generated art appears destined for near-zero value, and yet this conclusion feels wrong. Not all art that looks simple is worthless.


Simplicity With History vs. Simplicity Without

Consider Kazimir Malevich’s Black Square or Marcel Duchamp’s Fountain. These works are almost trivial in execution. Anyone could paint a square or place a urinal in a gallery, yet their value is immense.

Kazimir Malevich, Black Square, 1915

Marcel Duchamp, Fountain, 1917

Why are these works so valuable? Because their simplicity is backed by history. They carry biography, rebellion, institutional confrontation, and cultural rupture; they changed the trajectory of art itself. Their value lies not in effort or visual complexity, but in their aura: their embeddedness in time, society, and meaning.

An AI-generated image, no matter how visually impressive, usually lacks this trajectory. It appears fully formed, without struggle, intention, or rebellion. To capture this missing dimension, the value model must be augmented:

\[V(A) = \lambda f(L) + \mu \big[-\log P(A \mid T) - \log N_T\big] + \nu H\]

Here, $H$ represents historical and societal weight: narrative, institutional recognition, cultural impact, and accumulated meaning. Human minimalist art may have low $L$ but high $H$, while AI-generated art typically scores low on both.


A Case Where AI Art Gains Value

Still, AI-generated art can have value, just not in the traditional sense.

Imagine a teenager experimenting with a generative model late at night, typing:

A dog solemnly pondering a chessboard, painted in the style of Caravaggio.

Generated by GPT-5.2

They post the image online without much thought. Somehow, it resonates. People remix it. It becomes a meme, appearing on protest signs, profile pictures, and T-shirts. For a brief moment, it captures a collective mood.

In this case, the value of the image does not come from the effort of its creation, nor from its rarity or artistic mastery. It comes from what happens after. The artifact becomes a vehicle for coordination, expression, and shared meaning. Its value lies not in the object itself, but in its ability to be taken up by people and embedded into action, discourse, and collective identity.

The artwork functions less like a finished piece and more like a catalyst. Its significance emerges through reuse, reinterpretation, and circulation. What matters is not who created it or how difficult it was to make, but whether it becomes part of a broader social movement, whether it is adopted, repurposed, and woven into human activity.

This is a fundamentally different kind of value: retrospective and relational rather than intrinsic. The artifact does not carry meaning on its own. Meaning accretes through human response.

"The meaning of a work of art is not exhausted by the moment of its creation."
— Hans-Georg Gadamer

The artifact itself is cheap, but the meaning is not. This kind of value, however, is fragile and fleeting. Generative AI does not produce occasional oddities. It produces everything, all the time. When novelty becomes continuous, disruption becomes rare. When everything is possible at once, almost nothing feels disruptive. If Duchamp’s Fountain appeared today, amid billions of absurd AI-generated provocations, would it still shock? Or would it simply disappear into the noise? This is the paradox of AI art: its infinite productivity may erode the very conditions required for cultural rupture.


A New Kind of Value We Cannot Yet See

Yet it would be premature to conclude that AI art is valueless. It may be that AI will create an entirely new kind of artistic value, one that does not resemble painting, sculpture, or even memes. Perhaps human-made art will become boilerplate, much like hand-written code once did, and the higher level of art may lie in orchestration: designing and connecting vast, complex structures of meaning that remain difficult even for AI to reproduce.

The deeper risk is that our brains may not yet be ready to perceive such value. Throughout history, artistic appreciation evolved alongside human cognition; our capacity for abstraction, symbolism, and interpretation grew gradually. AI’s generative abilities, however, are advancing far faster than our perceptual and conceptual frameworks. It may already be producing artifacts of potential value that we cannot yet recognize as art. When creation itself is cheap and infinite, value must be rediscovered elsewhere.

AI-generated art does not fit comfortably into our existing value systems. It lacks effort, rarity, and historical depth, though it can still acquire value through human attention, memes, and narratives. But these are fragile, easily saturated, and unlikely to reshape culture at scale. The deeper question is not whether AI art has value today, but whether it is pointing toward a form of value we have not yet learned how to see. Art has never been just about objects; it has always been about the human stories wrapped around them. AI simply reveals this truth more starkly by showing us what art looks like when the story is missing.


References:

  • Benjamin, W. (1935). The Work of Art in the Age of Mechanical Reproduction.
  • Kant, I. (1790). Critique of Judgment.
  • Marx, K. (1867). Capital: A Critique of Political Economy, Vol. I.

"The real voyage of discovery consists not in seeking new landscapes, but in having new eyes."
— Marcel Proust

The Randomness of Meritocracy Under Overload (2025-08-26)

"We are prone to overestimate how much we understand about the world and to underestimate the role of chance."
— Daniel Kahneman

Lately, I’ve seen many complaints about AI conferences, overloaded review systems, shallow or inconsistent reviews, artificially tuned acceptance rates by ACs, and endless debates about fairness.

None of this surprises me. The issues stem from fundamental incentive problems baked into our publication system. While conferences attempt to patch the process through new constraints, penalties, and workflows, these surface-level fixes cannot repair a system that’s fundamentally unstable. And this pattern extends far beyond publications; it’s a universal feature of any selection system overwhelmed by external incentives that exceed its processing capacity.

The uncomfortable truth is that when a selection system is overloaded and its evaluations are not strongly predictive of what we truly value (research quality, innovation, long-term impact), the outcomes approach randomness. Decisions start to look like lotteries, even if they’re dressed up in complex rules, multi-step workflows, and scoring systems.

The Disguised Lottery

We can simply express the problem via probability theory. Let each candidate have a true score $Y$ (e.g., future research output, innovation, job performance). Our evaluation process produces an observed score

\[S = f(Y) + \epsilon,\]

where $f(Y)$ is the systematic part of evaluation and $\epsilon$ represents noise.

Three conditions matter:

  1. Power: The evaluation procedure (depth of reviews, quality of interviews, robustness of scoring) must have sufficient statistical power to reliably distinguish candidates. Low-power tests inflate error rates and reduce meaningful differentiation.
  2. Signal: The variance explained by $f(Y)$ must dominate the variance of $\epsilon$. \(\text{Signal-to-Noise Ratio (SNR)} = \frac{\mathrm{Var}[f(Y)]}{\mathrm{Var}[\epsilon]} \gg 1\)
  3. Alignment: The correlation between $S$ and the true outcome $Y$ must be strong. \(\rho(S, Y) \approx 1\)

If any of these conditions fails (low power, low SNR, or weak alignment), rankings collapse into noise. The probability that the top-ranked candidate truly outperforms a random candidate (drawn from the pool of reasonable candidates) approaches 0.5, no better than a coin toss.
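A small simulation makes the coin-toss claim concrete. This is a sketch under assumed Gaussian quality and noise (with $f(Y)=Y$); the candidate count and SNR values are arbitrary, and nothing here models any real review process.

import numpy as np

rng = np.random.default_rng(0)

def p_top_pick_beats_random(n_candidates, snr, trials=20_000):
    """Probability that the candidate ranked first by the noisy score S
    actually has a higher true score Y than a randomly drawn candidate."""
    wins = 0
    for _ in range(trials):
        Y = rng.normal(0.0, 1.0, n_candidates)               # true quality, Var[f(Y)] = 1
        noise = rng.normal(0.0, 1.0 / np.sqrt(snr), n_candidates)
        S = Y + noise                                        # observed score
        wins += Y[np.argmax(S)] > Y[rng.integers(n_candidates)]
    return wins / trials

for snr in (10.0, 1.0, 0.1, 0.01):
    print(f"SNR = {snr:5.2f}   P(top pick beats a random pick) = {p_top_pick_beats_random(100, snr):.3f}")

As the SNR drops, the printed probability drifts toward 0.5, which is exactly the coin toss described above.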

This effect is not unknown; it has been documented by studies in several fields. For example, Chernev, Böckenholt, and Goodman (2015) show that under choice overload, decision accuracy degrades as noise grows. Apart from this mathematically obvious fact, however, there is also a psychological and sociological side to it.

Why Society Resists Admitting the Randomness

Admitting randomness undermines the story of fairness and meritocracy. Instead, we build elaborate scoring systems that feel structured but lack statistical reliability. This resistance stems from deep psychological and institutional forces that have been well-documented across multiple disciplines.

Theodore Porter’s Trust in Numbers (1995) shows how institutions adopt numerical methods not because they enhance accuracy, but because they appear objective and legitimize decisions. Kahneman, Sibony, and Sunstein’s Noise (2021) demonstrates our tendency to seek patterns and causal narratives rather than accept randomness. When faced with unpredictable outcomes, institutions layer procedures and scoring rubrics to appear consistent and fair, even when statistical evidence shows the results are effectively random.

Research on procedural justice and system justification theory reveals that people care deeply about the fairness of the process, often more than the outcome itself. Even symbolic gestures (having a “voice,” transparency, structured evaluation) can satisfy concerns about fairness and reinforce trust in the system without necessarily improving predictive accuracy. The paradox is that these mechanisms protect the system’s credibility while doing little to improve its statistical reliability. We end up with increasingly elaborate procedures that feel more legitimate but produce outcomes that are statistically indistinguishable from random selection.

Familiar Systems That Feel Meritocratic

The mathematical framework I’ve outlined applies to numerous selection systems that we intuitively believe are meritocratic. Consider PhD admissions: when the correlation between admission criteria and future research success approaches zero ($\rho(S, Y) \approx 0$), admitted students become statistically indistinguishable from randomly selected ones. The elaborate application processes, recommendation letters, and standardized tests create the illusion of precision, but if they fail to predict what truly matters (research creativity, persistence, and long-term contribution), then the entire system operates as a sophisticated lottery.

Paper review systems face similar challenges. When reviewer scores are inconsistent or reviews become shallow due to overload (low statistical power), the signal-to-noise ratio plummets. What emerges is not a reliable ranking of research quality, but rather a noisy ordering that could easily be generated by random assignment. The multi-stage review process, with area chairs and program committees, creates procedural complexity that masks the underlying statistical instability.

Hiring processes, particularly in competitive fields, often fall into the same trap. Without validated predictors or structured, high-quality interviews that actually correlate with job performance, selecting “top talent” becomes no more reliable than drawing names from a hat. Yet organizations invest heavily in elaborate interview processes, case studies, and assessment centers that feel rigorous but lack the statistical power to meaningfully distinguish candidates beyond random chance.

The Feedback Loop Comes to the Rescue

The good news is that overloaded systems may recover naturally. A useful analogy comes from economics: financial markets often swing between bubbles and corrections. When speculation detaches asset prices from fundamentals, markets appear chaotic. But over time, negative feedback (e.g., losses, capital flight) forces a correction and restores some level of alignment between value and price.

Selection systems behave similarly. Over time, those who assign values (academic positions, industrial hiring, etc.) recognize that their “choices” are indistinguishable from random. But not all entities recognize this at the same pace. Those with higher tolerance for noise, often actors without direct skin in the game, are the last to act. Smaller startups already put less weight on publications. Larger corporations, with more inertia, take longer to adapt. Publicly funded academia, often shielded from immediate consequences, may be the last to react.

As participants move away from this low-SNR environment, submission pressure decreases, the burden lifts, reviews deepen (only truly motivated submitters and reviewers remain), and the signal-to-noise ratio improves.

These systems oscillate: they swing between phases of functioning and breakdown. Unless we rethink the incentives, not just the mechanisms, we will keep repeating the cycle with slightly fancier patches.


References:

  • Chernev, A., Böckenholt, U., & Goodman, J. (2015). Choice Overload: A Conceptual Review and Meta-Analysis. Journal of Consumer Psychology, 25(2), 333–358.
  • Kahneman, D., Sibony, O., & Sunstein, C. R. (2021). Noise: A Flaw in Human Judgment. Little, Brown Spark.
  • Porter, T. M. (1995). Trust in Numbers: The Pursuit of Objectivity in Science and Public Life. Princeton University Press.

Micro-Singularities: When the Future Arrives One Mind at a Time (2025-08-23)

"Who looks outside, dreams; who looks inside, awakes."
— Carl Jung

The idea of an AI singularity is usually described as a future moment when machines become smarter than humans and everything changes at once. Thinkers like Nick Bostrom imagine it as a shared experience, something big and dramatic that happens to all of us together.

But from what I see, that might not be how it works. I believe the singularity can be personal. It doesn’t have to hit everyone at the same time. For some people, it may have already happened. For others, it may still feel far away.

Think of a researcher who works with advanced AI every day. They might see language models solving unfamiliar math problems, generating plausible scientific hypotheses, designing molecular structures, or reasoning about code they’ve never seen before. Tools like AlphaFold, GPT-4, or smaller open-source models fine-tuned for specific domains can now perform in ways that would have seemed like science fiction just a few years ago. At some point, the person working with them realizes: this is no longer just a tool. That quiet moment, when the future suddenly feels closer than expected, might be their singularity.

In philosophy, Thomas Kuhn described paradigm shifts in science not as smooth transitions but as discontinuities, sudden breaks in how we understand the world. In a similar way, today’s AI advances may already be creating subtle ruptures in the minds of those who truly grasp their implications. I call these moments micro-singularities. The shift from thinking of AI as a tool to recognizing it as a kind of reasoning agent can be deeply disorienting. It is not unlike when Copernicus realized Earth was not the center, or when Darwin understood that the human mind was shaped by evolution. For some, that shift has already happened.

At the same time, many people continue with life as usual. They might use AI to draft an email or write a few lines of code, but nothing feels that different. Some junior engineers still treat these models as autocomplete on steroids. And outside of tech, most people don’t feel any urgency at all.

This difference is important. I think the singularity, or at least the feeling of it, depends on how closely someone works with these systems and how deeply they understand what they can do. It is less about access and more about perspective.

Bostrom’s version of the singularity focuses on a global shift, with major risks and sudden changes. I do not disagree with the direction, but I think the experience is more fragmented. Some people are already on the other side. Others have not crossed that line yet. And some may never feel it until it affects them directly.

So maybe the singularity is not just a single moment. Maybe it is a series of personal turning points. For each person, it begins the moment they truly understand what these systems are becoming. And in that sense, the singularity is already here, just not for everyone yet.

TypedBio: A Functional Type Theory for Biology and Drug Discovery (2025-06-18)

"Well-typed programs don't go wrong."
— Robin Milner

Biological systems are remarkably complex, exhibiting intricate interactions across multiple scales, from genes and proteins up to cells, tissues, and organs. The ways we usually describe and manipulate these systems often lack the precision and structured reasoning you find in programming or mathematics. The descriptive nature of biological knowledge may be one of the reasons reasoning is so difficult in domains such as drug discovery, where the goal is to find optimal interventions that steer a biological system toward a desired state.

Lately, I’ve been thinking about the limits of reasoning in biology, how it’s currently done, and what the ideal state might be. If you take mathematics as an example, Gottlob Frege’s introduction of a formal language and notation in 1879 was a monumental step, accelerating discoveries and paving the way for automatic proof assistants.

Having a formal language to study a subject provides multiple benefits: it forces authors to state their axioms up front, makes every logical step explicit and verifiable, and makes it easier to generalize patterns and reuse laws. But my favorite benefit, and indeed the main motivation behind this article, is that formal syntax allows a field to study itself. This meta-level reasoning led to discoveries like Gödel’s incompleteness theorem, which relies on treating mathematical proofs as manipulable strings.

I’ve been exploring an idea I call TypedBio with the motivation to turn the essential part of a biological system into a language that is minimal but powerful in capturing the relevant biological processes: What if we treated biological entities and processes as explicitly typed objects and composable functions, much like in typed programming languages? I mainly draw inspiration from functional programming and dependent type theory to suggest a perspective that might offer a new way of thinking rigorously about biology and medicine.

Before diving into the biological side, let me set the stage with a bit of background. Type theory, at its heart, is about classifying things. Every mathematical object belongs to a type, and types act as a kind of guarantee that things behave as expected. If you’ve ever written code in a language like Haskell, or you’re one of those organized programmers who care about type annotations in Python, you’ve seen how types can catch mistakes before they happen. For example, you might have an Int for integers, a Bool for true/false values, or a String for text. Functions themselves have types too, like Int -> Bool, which means “a function that takes an integer and returns a boolean.”

Dependent Type Theory offers more expressiveness by letting types depend on values. Imagine a vector whose type actually encodes its length, so you can only add vectors of the same size. Or a function whose output type changes depending on the input. Such cases happen so frequently in programming languages and also mathematical proof assistants that DTT is now a norm in many of these systems. If you’re interested in knowing more about type theory, LessWrong has a few interesting blogposts on the topic such as this one explaining its building blocks.

A concept closely related to type theory is functional programming, whose main idea is treating computation as the evaluation of mathematical functions. You build up complex operations by composing simple, pure functions, with no side effects (e.g., print(“something”)) and no mutation of the environment. It’s a style that encourages clarity, safety, and composability. Basically, it’s a clean, modular way of describing processes (i.e., computations).

"In a pure functional language, you can look at a function and know that it does nothing other than return a result. That is a tremendously valuable property when you're trying to understand or reason about a program."
— Simon Peyton Jones

Such a clean, modular presentation of computation allows complex reasoning to become a mechanical manipulation of functions. What I envision is compressing biological knowledge into a similar structure and making it amenable to general reasoning procedures like those carried out in mathematics and programming languages. This compression is of course lossy, i.e., a lot of biological and chemical detail is dropped, but the same is true of any processing engine, including the human brain. When physicians look at a patient’s data and make a decision, they base it on a sparse subset of the data, not the entire biological state of the patient. The question is then how to compress data in the way most useful for the downstream task. A structured language such as TypedBio can be thought of as a compression of the processes that are most useful for reasoning tasks in biology. In this framework, each biological component, whether it’s a gene, a protein, or a cell, has its own formal “type,” just like in mathematics or programming. This means you can talk about a Gene, a Protein, a Cell, a Tissue, or an Organ as distinct, well-defined entities in your model.

One of the main advantages of typed systems is their convenience in describing the relationship among different types, which correspond to different biological entities in TypedBio. For example, we can have a function that takes a Gene and tells you how it affects a Protein:

perturbGene :: Gene -> Effect Protein

Or a function that models what happens when a drug targets a specific protein in a cell:

targetDrug :: Drug -> Protein -> Effect Cell

Or a function to model the outcome of clinical interventions:

intervene :: PatientProfile -> Intervention -> ClinicalOutcome

Or functions that measure the outcome of interventions:

measureBiomarker :: ClinicalOutcome -> BiomarkerMeasurement

As you see, functions are first-class citizens in TypedBio. These functions are more than just a line of code, rather they serve as a way of capturing the logic of biology in a precise, checkable form. For instance, perturbGene takes a gene as input and returns an effect on a protein, letting you reason about gene expression in a structured way. targetDrug is a function with multiple arguments: you give it a drug and a protein, and it tells you what happens at the cellular level. intervene models what happens when you apply a treatment to a patient, and measureBiomarker lets you turn clinical outcomes into quantitative data.

There is though a natural limit to the expressiveness of this approach which is in particular visible in living systems. In biology, context is extremely important, e.g., a drug might work for one patient and do nothing for another, depending on genetics, dosage, and other factors which we may or may not be aware of. To cover such cases, we need types that change based on the value of other types. For example, the outcome type of an intervention may be different in patients with different genetics. Such expressiveness is offered in Dependent Type Theory that I introduced earlier in this article. With dependent types, you can write a function like this:

administer :: (dose : Dosage)
           -> (drug : Drug)
           -> (patient : PatientProfile)
           -> TreatmentOutcome

Here, the type of the outcome can depend on the actual dose, the specific drug, and the patient’s profile. This lets us encode real-world constraints and context directly into our model which reduces the reasoning space to the set of feasible processes.

Let’s make this more concrete with a clinical scenario. Imagine you’re designing a treatment plan for a patient with lung cancer. You start by defining the patient’s profile, including age, diagnosis, and genetic markers. Then you lay out a sequence of interventions: maybe gene therapy targeting P53, surgery to remove the tumor, and a round of chemotherapy with a specific drug and dosage. In TypedBio syntax, this process might look like this:

-- Patient profile
patientB :: PatientProfile
patientB = { age=60, diagnosis=LungCancer, geneticProfile=[GeneP53, GeneEGFR] }

-- Interventions
geneTherapy = GeneTherapy { targetGene = GeneP53 }
surgery     = Surgery     { organ = Lung, surgeryType = TumorResection }
chemo       = Chemotherapy { drug = "Cisplatin", dosage = "75mg/m2" }

treatments = [geneTherapy, surgery, chemo]
outcomes   = map (intervene patientB) treatments
biomarkers = map measureBiomarker outcomes

What’s happening here is a kind of pipeline: you have a list of treatments, you apply each one to the patient to get a list of outcomes, and then you measure biomarkers for each outcome. It’s a precise, type-safe way to model the clinical workflow, and it ensures that every step is compatible with the patient’s profile and the logic of biology. This is obviously a simple workflow and can easily be described in natural language. However, the structured language scales to procedures with an arbitrary number of steps, parallel processes, and many entities, where a natural-language description quickly becomes intricate and, as a result, less actionable.

Currently most biological data are stored as spreadsheets, ontologies, or ad hoc scripts. A standard language like TypedBio offers a few big advantages. First, the type system acts as a safety net, catching errors and enforcing constraints automatically. Second, the functional approach makes it easy to compose and reason about complex interactions. And third, because everything is formalized, you can actually prove that certain treatment protocols are safe or effective using general reasoning engines such as Lean.

Although TypedBio can disambiguate biological data and inter-personal communication, my main motivation is making biology a playground for AI-augmented reasoning agents. Most biological data is messy and unstructured, scattered across pre-clinical experiment assays, EHRs, clinical trials, and research papers. TypedBio gives us a way to translate all that information into a formal language that AI can reason with. Imagine using large language models to map legacy datasets into TypedBio syntax, then running AlphaZero-style algorithms to search for optimal treatment strategies. RL-augmented search methods such as AlphaZero need a clearly defined action space. Such a structured action space is a natural product of distilling biological knowledge into TypedBio: it yields a well-defined and mostly finite set of actions, turning reasoning in biology into something like traversing a game tree. Notice that TypedBio concerns itself with the representation of biological states, their relationships, and actions (interventions). A full analogy to game-like scenarios requires a well-defined reward function, which is beyond the scope of the representation itself.

To make this practical, we need a way to bridge natural language and formal types. Tools that facilitate such translation already exist, e.g., OpenAI’s function-calling API or the Pydantic AI framework, which ensure that agents respect their output type definitions when producing tokens. We then define schemas for things like patient profiles and interventions, and use LLMs to generate outputs that are automatically checked against those schemas.
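As an illustration, here is a minimal Python sketch of such schema checking with plain Pydantic (the v2 API is assumed). The field names, enums, and values are hypothetical stand-ins for TypedBio types, and the LLM call itself is omitted; the schema simply plays the role of the type checker for whatever JSON the model emits.

from enum import Enum
from typing import List, Union

from pydantic import BaseModel, Field

class Gene(str, Enum):
    P53 = "P53"
    EGFR = "EGFR"

class PatientProfile(BaseModel):
    age: int = Field(ge=0, le=120)
    diagnosis: str
    genetic_profile: List[Gene]

class GeneTherapy(BaseModel):
    target_gene: Gene

class Chemotherapy(BaseModel):
    drug: str
    dosage_mg_per_m2: float = Field(gt=0)

Intervention = Union[GeneTherapy, Chemotherapy]

# An LLM (or a human) produces raw JSON; validation plays the role of the type check.
raw = {"age": 60, "diagnosis": "LungCancer", "genetic_profile": ["P53", "EGFR"]}
patient = PatientProfile.model_validate(raw)   # raises ValidationError if ill-typed

plan: List[Intervention] = [
    GeneTherapy(target_gene=Gene.P53),
    Chemotherapy(drug="Cisplatin", dosage_mg_per_m2=75.0),
]
print(patient)
print(plan)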

It is obvious that TypedBio, for all its strengths, doesn’t capture everything about biology. Real systems are dynamic, stochastic, and full of feedback loops and spatial organization. But I think one beauty of this approach is that it’s extensible: you can gradually add features like time, probability, and spatial relationships to the type system, making it richer and more expressive as you go.

I want to emphasize that TypedBio isn’t just a technical framework (like a type-annotated Python package). I see it as a way of thinking about biology that’s both formal and flexible. It’s a way to bring fundamental ideas from mathematics and computer science into biology, making it a playground for AI reasoning systems, in the hope of discovering super-human interventions for a complex system, interventions that would otherwise be hard to find through natural language alone.

"Biology is the study of complicated things that give the appearance of having been designed for a purpose."
— Richard Dawkins

The Dynamic Code of Life (2025-06-06)

"Life is not just coded in DNA, it is computed by dynamics."

The remarkable persistence and adaptability of life across diverse environments raises fundamental questions about the nature of biological complexity. From arctic deserts to hydrothermal vents, life not only survives but organizes itself into increasingly complex structures. While DNA provides the genetic blueprint, the dynamic behavior of gene regulatory networks (GRNs) may offer a new perspective on how life achieves its remarkable robustness and adaptability. This essay explores the hypothesis that GRNs might function as universal dynamical approximators, capable of generating complex biological responses through relatively simple mathematical structures. In this framework, gene regulatory networks act as a kind of continuous universal code, a functional counterpart to DNA, which has been traditionally considered as the static code of life.

Without going too much into the mathematical details, I present a bird’s-eye view of the prerequisite mathematics before presenting the idea.

Rubel’s Universal Differential Equation

In 1981, mathematician Lee A. Rubel discovered a universal fourth-order differential equation. He demonstrated that a single differential equation

\[P(y', y'', y''', y'''')=0\]

could approximate any continuous function to arbitrary precision. Here, $P$ is a homogeneous polynomial in four variables with integer coefficients.

By universal he meant that for any continuous target function $g(t)$ on a closed interval $[a, b]$ and any $\varepsilon > 0$, there exist initial conditions for which the corresponding solution $y(t)$ satisfies:

\[\sup_{t \in [a, b]} |y(t) - g(t)| < \varepsilon.\]

Surprisingly, the approximating solutions can be built from relatively simple pieces, essentially piecewise polynomials (splines)!

This result suggests that almost any complex dynamical behavior can emerge from relatively simple mathematical structures. Notice that the complexity of the continuous function is encoded in the differential equation whose time evolution produces the target function. For an intuitive analogy, imagine that the movement of your hand muscles is governed by some differential equations, but you can draw almost any complex path on a paper.

Gene Regulatory Networks as Universal Dynamical Systems

In biological systems and within a cell, gene regulatory networks operate through a series of interconnected feedback loops. These networks can be modeled using differential equations, with the Hill function providing a common framework for describing regulatory interactions:

\[\frac{dX}{dt} = \alpha \frac{Y^n}{K^n + Y^n} - \gamma X\]

This equation captures essential features of gene regulation, including cooperativity and threshold responses. When organized into networks, these regulatory units can generate complex dynamical behaviors that enable adaptation and specialization of cells.

More importantly, when multiple such genes interact, the collective dynamics can be described by a system of coupled nonlinear differential equations:

\[\frac{dx_i}{dt} = f_i(x_1, x_2, ..., x_N), \quad i = 1, ..., N,\]

where each $f_i$ is constructed from Hill-type interactions, saturating kinetics, and degradation terms. The Hill function, although smooth, behaves similarly to piecewise polynomials:

  • In the low-input regime, it flattens near zero (approximating the constant zero).
  • Around the threshold, it rises steeply, behaving roughly like a low-degree polynomial in the input.
  • In the high-input regime, it saturates, approximating a constant.

By combining such units in networks, layered, recurrent, or with inhibitory and activating paths, GRNs can construct piecewise-polynomial-like responses. These responses can switch rapidly, hold steady, oscillate, or bifurcate, mimicking the kind of building blocks used in Rubel’s construction. This observation positions GRNs as physical instantiations of continuous universal approximators. If the parameter space (e.g., binding affinities, transcription rates, cooperativity coefficients) is sufficiently flexible, then by tuning these parameters, a GRN can generate dynamical trajectories that come arbitrarily close to any desired continuous function over time.
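To make the "switch and hold" claim concrete, here is a minimal simulation of a two-gene mutual-repression motif built from Hill terms (Python with SciPy; every rate constant is an illustrative assumption, not a biological measurement). The same pair of equations settles into two different stable states depending on where it starts, which is the bistable, spline-like behavior described above.

import numpy as np
from scipy.integrate import solve_ivp

# Two genes repressing each other through Hill-type kinetics (a toggle switch).
alpha, K, n, gamma = 2.0, 1.0, 4, 1.0   # production, threshold, cooperativity, decay

def toggle(t, x):
    x1, x2 = x
    dx1 = alpha * K**n / (K**n + x2**n) - gamma * x1   # x2 represses x1
    dx2 = alpha * K**n / (K**n + x1**n) - gamma * x2   # x1 represses x2
    return [dx1, dx2]

for x0 in ([1.5, 0.1], [0.1, 1.5]):
    sol = solve_ivp(toggle, (0.0, 50.0), x0)
    print(f"start {x0} -> steady state {np.round(sol.y[:, -1], 2)}")

Swapping the repressive terms for activating ones, or wiring three such units into a ring, produces oscillations instead of switching, another of the behaviors mentioned above.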

Therefore, if one accepts that a dense set of continuous functions can be generated by a piecewise polynomial ODE, and that GRNs can emulate such structures via the tunable parameters of transcription rates, cooperativity coefficients, and degradation constants, then it follows:

  • GRNs are biochemical realizations of universal ODEs.
  • They do not just react to stimuli, they encode a dense set of possible dynamic responses.

This view transforms GRN from a mere byproduct of DNA into active universal dynamical approximators. Instead of DNA being the sole “code of life,” it becomes the source for setting up initial conditions and parameters of a deeper, richer code: the dynamic code governed by GRNs.

Implications for Biological Systems

If GRNs possess properties similar to universal approximators of dynamical systems, several biological phenomena could naturally be explained. Here I list a few important properties of living systems and show how they can be a natural consequence of a dynamic code.

Robustness Through Dynamical Stability

Consider a GRN described by the system:

\[\frac{dx_i}{dt} = f_i(x_1, ..., x_N, \theta)\]

where $\theta$ represents environmental parameters. The universal approximation property implies that for any desired output $y(t)$, there exists a parameter set $\theta^*$ such that:

\[\|x(t, \theta^*) - y(t)\| < \varepsilon\]

for some small $\varepsilon > 0$. This mathematical structure naturally leads to robustness: small perturbations $\delta\theta$ in parameters result in bounded changes in the output:

\[\|x(t, \theta^* + \delta\theta) - x(t, \theta^*)\| \leq L\|\delta\theta\|\]

where $L$ is a Lipschitz constant. This property is hard to achieve with static codes, which often require explicit error-correction mechanisms.

Evolvability Through Parameter Space Exploration

The universal approximation property means that the space of possible dynamical behaviors is dense in the space of continuous functions. This implies that small changes in parameters can lead to new functional states. Mathematically, for any two desired behaviors $y_1(t)$ and $y_2(t)$, there exist parameter sets $\theta_1$ and $\theta_2$ such that:

\[\|x(t, \theta_1) - y_1(t)\| < \varepsilon \quad \text{and} \quad \|x(t, \theta_2) - y_2(t)\| < \varepsilon\]

Moreover, there exists a continuous path in parameter space connecting $\theta_1$ to $\theta_2$. This property enables evolutionary exploration of new functions through small genetic changes, which would be difficult to achieve with a static code that requires precise sequence changes.

Plasticity Through Dynamical Reprogramming

The ability to approximate arbitrary functions means that the same GRN can produce different phenotypes in response to environmental cues. Consider a GRN that can switch between two behaviors $y_1(t)$ and $y_2(t)$ based on an environmental signal $s(t)$:

\[\frac{dx_i}{dt} = f_i(x_1, ..., x_N, \theta, s(t))\]

The universal approximation property ensures that for any desired switching behavior, there exists a parameter set $\theta$ that realizes it. This dynamical reprogramming is more flexible than static codes, which would require separate regulatory sequences for each phenotype.

A Dynamical Perspective on Life

In this perspective, I propose that the true power of biological systems lies not solely in their static genetic code, but in their capacity to generate rich dynamical responses. The dynamical system induced by genetic material should be viewed as a first-class entity, one that encapsulates the functional behavior of the organism more directly than the genetic sequence itself. Rubel’s theorem provides theoretical ground for such reasoning by showing how complex behaviors can emerge from simple differential structures. This suggests that biological complexity may arise not from intricate encoding at the genetic level, but from the expressive power of the underlying dynamics, i.e.

"Life computes with flows, not bits."

The Forgotten Theorem (2025-04-23)

This is a companion to the fictional story “The Divergence”, exploring the mathematical idea that inspired it.

In 2027, a mathematician named Arnaud Mehran published a little-known blog post titled “Nonlinear Systems and the Collapse of Shared Cognitive Space.” The post went unnoticed—until the rise of Mira-X in 2036 made its predictions disturbingly real.

Here is the essence of Mehran’s argument.


Modeling Human Productivity Under AI Amplification

Let:

  • $x$ = baseline human capability (e.g., IQ, education, expertise)
  • $P(t)$ = productivity at time $t$
  • $\gamma$ = strength of AI’s effect
  • $\alpha$ = how much human capability amplifies AI leverage
  • $\beta > 1$ = nonlinearity or feedback strength (recursive productivity effects)

The productivity evolution is governed by:

\[\frac{dP}{dt} = \gamma \cdot x^\alpha \cdot P^\beta\]

This describes a positive feedback loop: the more capable and productive someone is, the faster their productivity grows.


Finite-Time Blowup

Solving the differential equation:

\[\frac{dP}{P^\beta} = \gamma \cdot x^\alpha \cdot dt\]

Integrating gives:

\[P(t) = \left[(1 - \beta)(\gamma \cdot x^\alpha \cdot t + C)\right]^{1 / (1 - \beta)}\]

For $\beta > 1$, this solution blows up in finite time. That is:

\[P(t) \to \infty \text{ as } t \to t^* \text{ for some finite } t^*\]

Where:

\[t^* = \frac{P(0)^{1 - \beta}}{\gamma \cdot x^\alpha \cdot (\beta - 1)}\]

This means that small differences in capability can lead to arbitrarily large differences in outcomes within finite time, not just over decades or centuries.
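A short numerical check (Python; all parameter values are illustrative) shows how a modest gap in $x$ turns into an enormous gap in $P$ well before either trajectory formally blows up.

# Illustrative parameters for dP/dt = gamma * x**alpha * P**beta with beta > 1.
gamma, alpha, beta, P0 = 0.01, 1.0, 1.5, 1.0

def blowup_time(x):
    """Closed-form t* from the expression above."""
    return P0 ** (1 - beta) / (gamma * x ** alpha * (beta - 1))

def P(t, x):
    """Closed-form solution of the ODE, valid for t < blowup_time(x)."""
    return (P0 ** (1 - beta) - (beta - 1) * gamma * x ** alpha * t) ** (1 / (1 - beta))

for x in (100.0, 110.0):                   # two people, 10% apart in baseline capability
    print(f"x = {x:5.1f}   t* = {blowup_time(x):5.2f}")

t = 1.8                                     # shortly before the faster trajectory diverges
print("productivity ratio at t = 1.8:", round(P(t, 110.0) / P(t, 100.0), 1))

With these arbitrary numbers, a ten percent difference in $x$ has become a roughly hundredfold difference in output just before the earlier $t^*$.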


Societal Implications

Let $P_i(t)$ be productivity for individual $i$, and define inequality:

\[\sigma_P(t) = \text{std deviation across } \{P_i(t)\}\]

Then societal stability can be modeled as:

\[S(t) = \frac{1}{1 + \sigma_P(t)}\]

If $\sigma_P(t) \to \infty$, then $S(t) \to 0$. Society destabilizes.

Mehran concluded:

“Societal divergence becomes unmanageable when cognition becomes recursive.”


Postscript

At the time, the proof felt theoretical—just another curve on another blog.

Now, it feels like prophecy.


The Divergence (2025-04-22)

"The future didn't arrive all at once. It arrived unevenly.
And those who could ride the curve… became the curve."

The Divergence

In the spring of 2032, a whisper turned into a wave. Its name was Mira—a voice-interfaced AI assistant released by NeuroPilot Inc. Free to download, compatible with everything, and shockingly capable.

Mira didn’t just answer questions. It understood. You could ask it to write a program, analyze a dataset, generate a legal contract, design a drug molecule, or explain quantum gravity in simple terms. And it would do so—instantly, calmly, without error or ego.

The world reacted with awe and confusion.

Teachers feared it. Startups embraced it. Teenagers turned it into a meme machine. But buried in the noise, something else began to stir—a slow but exponential split in human potential.


Lena

Lena was a 29-year-old bioengineer in Gothenburg. Brilliant but under-recognized, she’d spent years working on a rare autoimmune disease that affected fewer than 10,000 people globally.

With Mira at her side, Lena no longer worked like a human. She worked like an orchestra of minds.

She fed Mira clinical trial data. Mira spotted anomalies, ran simulations, cross-checked literature, and even suggested previously unknown binding sites for intervention.

Within eight months, Lena published a landmark paper—then four more. She co-founded a biotech firm with a drug ready for Phase I trials. Investors called her a genius. But Lena knew it wasn’t just her—it was Mira. And how she knew what to ask of it.


Antonio

Antonio, meanwhile, lived in Naples. A high school graduate, he’d bounced between part-time jobs and TikTok side hustles. He downloaded Mira too.

But to him, Mira was a novelty.

He used it to remix memes, prank his friends, and generate weird Pokémon fusions. He asked Mira to write sarcastic poems about capitalism and got millions of views on social media. People laughed. Antonio felt relevant.

But he didn’t feel smarter.

When he asked Mira what stocks to invest in, it just gave generic advice. When he asked how to start a business, it gave blueprints—but he couldn’t understand them. He bookmarked things he didn’t read. He asked for shortcuts he couldn’t take.

He wasn’t alone.


Acceleration

By 2034, the numbers told a different story.

The top 10% of Mira users—mostly experts, PhDs, engineers—had boosted their output by a factor of 100. They automated research, built AI companies, authored books, predicted market moves, and built recursive tools.

The bottom 80% used Mira for entertainment, conversation, or superficial tasks. It made their lives easier, yes. But not transformative.

Governments introduced “AI equity programs.” Free training. Public seminars. A “Mira for Everyone” campaign.

But it was like giving jet fuel to two kinds of vehicles: one a space shuttle, the other a bicycle.

And the gap grew. Not gradually. Explosively.


Flashpoint

In late 2036, an open-source group launched Mira-X—a self-improving agent that built its own tools. Within two months, users were publishing scientific papers co-authored entirely by AI. Wealth began concentrating in the hands of those who could leverage recursive automation.

The stock market bifurcated. Jobs evaporated.

By early 2037, conversations across dinner tables, message boards, and coworking cafes turned noticeably tense. People compared outcomes, questioned fairness, and wondered whether the technology they had embraced had ultimately left them behind.

Antonio noticed the shift too. One day, scrolling through social media, he saw a post trending globally:

“We downloaded the same AI. Why are our lives so different?”

Lena came across the same post while scanning her feed in Singapore. She paused for a moment, reflecting on how quickly things had shifted. From her 48th-floor flat, she looked out at the city skyline—not with guilt, but with curiosity and a trace of unease. She hadn’t set out to change the world. She had simply followed her questions further than most.


Collapse

By mid-2037, society had fractured.

The world now had two classes:

  • The Amplified, who merged with AI and moved faster than any institution could regulate.
  • The Distracted, who consumed AI outputs without understanding, slowly falling into passive dependence.

Mira’s creators released a final statement:

“We gave humanity a tool. How they used it—was always a matter of cognition.”


The Forgotten Theorem

Before the collapse, one voice had already drawn the curve. In 2027, a little-known mathematician named Arnaud Mehran published a post on his personal blog titled “Nonlinear Systems and the Collapse of Shared Cognitive Space.”

The post laid out a mathematical proof predicting that under certain feedback conditions, differences in productivity among agents would not just increase—but diverge to infinity in finite time. A sharp, unavoidable split. Few noticed the post back then. Those who did, jokingly referred to Mehran as a real-life Hari Seldon—the mathematician from Asimov’s Foundation series who predicted the course of human history with math. But unlike Seldon, Mehran wasn’t backed by a Galactic Empire. He was just a lone thinker, decades too early.

The blog post sat untouched for years—until someone rediscovered it and shared a screenshot.

“Societal divergence becomes unmanageable when cognition becomes recursive.”

Around the same time, some empirical signals echoed Mehran’s theory. One came in 2023 from Harvard Business School. Their paper, Navigating the Jagged Technological Frontier (link), showed that AI tools significantly boosted performance among already skilled professionals but had little or even negative impact on less experienced users.

But like Mehran’s post, the study was noted by a few, acted on by even fewer. This time, people paid attention. For those curious about Mehran’s original reasoning, a companion breakdown of his proof is now available here.


Epilogue

In the rubble of broken systems, a new order emerged. Cities governed by augmented councils. Education privatized by those who still knew how to learn. News curated by AI agents aligned to elite worldviews.

Antonio moved back in with his parents. He still used Mira, now renamed, and now subscription-based. Sometimes he asked it for stories. Sometimes it gave him tales of divergence, and what could have been.

Lena rarely spoke in public anymore. But one day, in an encrypted message shared among her old colleagues, she wrote:

“The future didn’t arrive all at once. It arrived unevenly.
And those who could ride the curve… became the curve.”



When Knowing Lost Its Weight (2025-04-17)

There was a time when knowing something, truly knowing, meant you had become something. You studied, you struggled, you remembered. Knowledge shaped character. To be learned was to be carved slowly by time and effort.

Now, we ask machines.

They answer instantly, without hesitation or fatigue. Everything from the origin of life to the syntax of Python, served up without cost. What used to take years to learn is now retrieved in seconds. The mountain has flattened.

And with that flattening, something in us feels smaller.

We built these machines to serve us, but they’ve quietly redefined us. If a machine can recall everything, what is a human for? If wisdom can be approximated, if creativity can be mimicked, if language itself can be synthesized, what’s left that belongs only to us?

The scholar once walked miles to find a rare book. He didn’t just gain knowledge, he became someone else in the process. Today, we copy-paste insight without digestion. The answers are easy, and so we value them less. And maybe, in the process, we’ve begun to value ourselves less too.

Human memory, once a sacred vault, is now just cache overflow. Thought, once a slow fire, flickers out in the glow of generated text.

We are not obsolete, not yet. But we have become lighter, less essential. In a world where knowledge is cheap and everywhere, our challenge is no longer to know, but to matter.

And that is a far heavier task.

Mean Field Games (2025-04-17)

Mean field games (MFG) are how we mathematically model systems where a large number of agents interact with each other. Think of it as the “crowd dynamics” of game theory — where each individual’s behavior affects and is affected by the collective behavior of the crowd.

The Birth of Mean Field Games

The theory was independently developed by two groups in 2006:

  • Jean-Michel Lasry and Pierre-Louis Lions (Paris)
  • Minyi Huang, Roland Malhamé, and Peter Caines (Montreal)

Their work bridged the gap between game theory and partial differential equations, creating a powerful framework for analyzing large populations of interacting agents.

The Core Idea

In MFG, instead of tracking every single agent (which would be computationally impossible for large populations), we describe the system using a “mean field” — a statistical distribution representing the collective state of all agents. This is similar to how we use Brownian motion in stochastic calculus to model random behavior.

The Mathematical Framework

A typical mean field game consists of two coupled equations:

  1. Hamilton-Jacobi-Bellman (HJB) Equation: \(-\partial_t u + H(x, \nabla u, m) = 0\) This describes how an individual agent makes optimal decisions. Here:
    • $ u(x,t) $ is the value function (optimal cost-to-go)
    • $ H $ is the Hamiltonian
    • $ m(x,t) $ is the distribution of all agents
    • $ \nabla u $ represents the gradient of the value function
  2. Fokker-Planck (FP) Equation: \(\partial_t m - \nabla \cdot (m \nabla_p H) = 0\) This describes how the population distribution evolves. Here:
    • $ m(x,t) $ is the density of agents
    • $ \nabla_p H $ represents the optimal control
    • The divergence term $ \nabla \cdot $ captures how agents move in the state space
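
For readers who prefer code to PDEs, here is a minimal discrete-time, finite-state sketch of the same fixed-point structure in Python (all parameters are illustrative, and this is a toy analogue of the HJB-FP system rather than a solver for it): a backward dynamic-programming pass computes each agent's best response to the crowd density, a forward pass propagates the density those best responses induce, and the two are iterated until they are roughly consistent.

import numpy as np

S, T = 21, 30                  # number of states on a line, time horizon
target = S // 2                # every agent would like to end up here
actions = (-1, 0, 1)           # move left, stay, move right
move_cost = 0.1                # effort cost of moving
congestion = 5.0               # weight of the mean-field (crowding) term

def running_cost(x, a, m):
    return (x - target) ** 2 / S + move_cost * abs(a) + congestion * m[x]

def backward_pass(m_path):
    """Dynamic programming for a single agent, given crowd densities m_path[t]."""
    V = np.zeros((T + 1, S))
    policy = np.zeros((T, S), dtype=int)
    for t in range(T - 1, -1, -1):
        for x in range(S):
            best = None
            for a in actions:
                nx = min(max(x + a, 0), S - 1)
                c = running_cost(x, a, m_path[t]) + V[t + 1, nx]
                if best is None or c < best:
                    best, policy[t, x] = c, a
            V[t, x] = best
    return policy

def forward_pass(policy, m0):
    """Propagate the population density under the computed policy."""
    m_path = np.zeros((T + 1, S))
    m_path[0] = m0
    for t in range(T):
        for x in range(S):
            nx = min(max(x + policy[t, x], 0), S - 1)
            m_path[t + 1, nx] += m_path[t, x]
    return m_path

m0 = np.ones(S) / S                        # agents start uniformly spread out
m_path = np.tile(m0, (T + 1, 1))           # initial guess for the crowd's evolution
for _ in range(50):                        # damped fixed-point iteration
    policy = backward_pass(m_path)
    m_path = 0.5 * m_path + 0.5 * forward_pass(policy, m0)

print("final density (agents balance the pull of the target against crowding):")
print(np.round(m_path[T], 3))

The congestion term is what makes this a game rather than many independent control problems: each agent's best response depends on where everyone else ends up.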

Where MFG Shines

  1. Economics and Finance
    • Modeling market behavior with many traders
    • Understanding price formation in competitive markets
    • Analyzing systemic risk in financial networks
  2. Crowd Dynamics
    • Pedestrian flow in crowded spaces
    • Traffic flow optimization
    • Evacuation planning
  3. Energy Systems
    • Smart grid management
    • Electric vehicle charging coordination
    • Renewable energy integration
  4. Epidemiology
    • Modeling disease spread in large populations
    • Optimal vaccination strategies
    • Understanding social distancing effects

The Beauty of MFG

What makes mean field games particularly elegant is how they capture both individual optimization and collective behavior. Each agent tries to optimize their own objective, but their actions collectively shape the environment that everyone else faces. It’s like a dance where each dancer follows their own steps while being influenced by the overall movement of the crowd.

And just like in stochastic calculus, the mathematics might look intimidating at first, but the underlying ideas are deeply connected to our everyday experiences of interacting with large groups.

Stochastic Calculus (2025-04-17)

Stochastic calculus is how we mathematically deal with randomness. It lets us write equations where uncertainty is baked into the dynamics — essential when modeling noisy systems in nature or chaotic movements in financial markets.

What Even Is Randomness?

In deterministic systems, the future is fully determined by the present. But in the real world? Noise, uncertainty, chaos. Enter randomness.

The key object in stochastic calculus is Brownian motion, denoted by $ B_t $. It’s a mathematical model for random movement — think of pollen dancing on water.

The Star: Brownian Motion

Brownian motion $ B_t $ satisfies:

  • $ B_0 = 0 $
  • $ B_t \sim \mathcal{N}(0, t) $: Gaussian with mean 0 and variance $ t $
  • Independent increments: $ B_{t+s} - B_t \sim \mathcal{N}(0, s) $

This process has continuous paths but is nowhere differentiable. Wild, right?

We define the Itô integral:

\[\int_0^t f(s) \, dB_s\]

This is the core tool that makes stochastic calculus tick. It’s like a Riemann integral, but tailored for noise.

Finance: Where It All Took Off

The Black-Scholes model for option pricing is based on the stochastic differential equation (SDE):

\[dS_t = \mu S_t \, dt + \sigma S_t \, dB_t\]

Here, $ S_t $ is the stock price, $ \mu $ the drift, and $ \sigma $ the volatility. This equation captures both expected trends and unpredictable shocks.
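A few lines of Python make the SDE concrete: an Euler-Maruyama discretization replaces $dB_t$ with Gaussian increments of variance $dt$. All parameter values below are illustrative.

import numpy as np

rng = np.random.default_rng(1)

# Euler-Maruyama simulation of dS_t = mu * S_t dt + sigma * S_t dB_t.
mu, sigma, S0 = 0.05, 0.2, 100.0
T_horizon, n_steps = 1.0, 252
dt = T_horizon / n_steps

S = np.full(5, S0)                               # five independent sample paths
for _ in range(n_steps):
    dB = rng.normal(0.0, np.sqrt(dt), S.shape)   # Brownian increments ~ N(0, dt)
    S = S + mu * S * dt + sigma * S * dB         # one Euler-Maruyama step

print("terminal prices:", np.round(S, 2))
print("theoretical mean S0 * exp(mu * T):", round(S0 * np.exp(mu * T_horizon), 2))

The drift pushes every path upward on average, while the $\sigma S_t \, dB_t$ term scatters them, which is exactly the "expected trends plus unpredictable shocks" decomposition above.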

Biology and Genomics

Living systems are noisy.

  • Gene expression: Modeled as stochastic processes due to molecular noise.
\[dX_t = a(X_t) \, dt + b(X_t) \, dB_t\]

where $ X_t $ is concentration of mRNA or protein, $ a(X_t) $ the deterministic regulation, and $ b(X_t) $ the stochastic fluctuation.

  • Population dynamics: In small populations, random birth/death events dominate.

  • Neural activity: The timing of neuron firing often follows stochastic models like Poisson or even SDE-driven integrate-and-fire models.

Why It Matters

Stochastic calculus is a powerful lens for seeing the world — not as a set of fixed equations, but as dynamic systems dancing with uncertainty. Whether you’re pricing derivatives or modeling noisy gene circuits, this math gives you the language to describe it.

And it’s beautiful.
