The Sabbath Was Made For Man: On Dynamic Stability as a Metaethical Criterion
https://cameronharwick.com/writing/the-sabbath-was-made-for-man/

Following Elon Musk’s interview on the Joe Rogan Podcast in February 2025, a meme went viral on X (formerly Twitter) juxtaposing a remark from that interview, that

The fundamental weakness of Western civilization is empathy.

with an apocryphal quote from Hannah Arendt,

The death of human empathy is one of the earliest and most telling signs of a culture about to fall into barbarism.

doubtless with the implication that Musk is insufficiently committed to empathy as a moral norm.

The basic argument of this paper is that to commit to a normative value, such as empathy, also commits one to maintaining the conditions of its continued existence. In conjunction with a dynamic where a maximalist commitment to that norm undermines its continued existence, resistance to maximalism should be understood as a defense of that norm, and not as inimical. There is, in other words, no conflict between Musk and the apocryphal Arendt in principle.

This paper advances an ontology of normativity grounded in its ability to solve cooperative dilemmas in human social life. This is well-trod ground with well-demarcated pitfalls. First, we explicitly distinguish Nash equilibrium from evolutionary stability as a solution concept, noting that many game-theoretic accounts of human moral life reduce to egoism (or at least stochastic egoism) for failure to do so. This is not a “debunking” of moral conviction. The other danger of such an approach is a genetic fallacy; in other words the naïve move from is (or, worse, was) to ought. Although (we will show) the content of moral norms cannot be derived from stability conditions (as the wide variety of human moral systems suggests), stability conditions can limit the space of permissible moral norms. We will show that affirming the reverse – that stability conditions do not bear on the validity of a moral norm – has absurd entailments.

Having established that stability considerations do limit the space of valid moral norms, we show just how strong this condition is. Far from a tautological endorsement of any existing moral norm, it can be shown that universalizing and maximalist moral philosophies – many of which have substantial purchase both in academic philosophy and in folk morality, such as Benatar’s antinatalism and Singer’s effective altruism – are ruled out. Indeed, at some level, any stable moral system must be parochial in the sense of proscribing cooperation with an outgroup, although this may be defined at different levels of abstraction. We conclude with a challenge to believers in modern moral ideologies such as liberalism, feminism, and environmentalism whose dynamic stability is in doubt: believers must seek the minimal changes to ensure their continued survival, and maximalists who resist these efforts should be understood, not as overzealous supporters, but as opponents.

What Is Normativity?

Normativity, which includes but is not limited to moral rules,1 is (1) a human behavioral trait, (2) consisting of coordinated assignment of normative valence to actions, (3) with the function of defining and solving social dilemmas in large groups.

While a comprehensive defense of this position against all alternatives is beyond the scope of the paper, it will be worth sketching what this claim implies – and what it does not imply – about the shape of human moral life.

Normativity is a central capacity in carving out the human evolutionary niche, which consists in large-scale cooperation in social dilemmas, i.e. the willingness to refrain from benefiting one’s self at the expense of the larger group.2 Social dilemmas in which humans cooperate despite the temptation to free-ride range all the way from cooperative hunting to modern corporations and political systems. They include the classic two-person prisoner’s dilemma, on which much ink has been spilled, but also larger (and more difficult) n-person dilemmas such as commons problems, public goods problems, collective action problems, and free-rider problems. Without normativity, which coordinates the willingness to refrain from defecting in such situations and to punish those who do, human society would not be possible – and indeed no other animal cooperates at a remotely similar scale with non-kin.

While it would be tempting to construct an argument directly from the phylogenetic history of human normativity, this would involve us in a genetic fallacy. Normativity encompasses the entirety of valenced actions. To explain the origin of normativity from the outside does not necessarily bear on the structure of normativity from inside a normative system. To claim (plausibly) that morality is “for” cooperation (e.g. Curry 2016; Curry et al. 2019) does not necessarily imply that cooperation – or anything else – is good within a normative system. Moral systems are not, in other words, self-justifying (Hayek 1988). In order to draw inferences, we would have to start within, with a normative claim that refers to something outside itself, and not without, with an empirical claim.

We will see, however, that the origin may have indirect relevance. We begin with two key propositions:

PROPOSITION 1. A moral rule assigns a valence to an action (the substance of the rule).

This assignment may be explicit (“It is bad to steal”) or implicit (“The natural environment is good” implies “it is bad to litter”). The process of moral rationalization (Weber [1956] 2019: 108) involves progressively replacing explicit action-valences with implicit reasons as the boundary conditions of explicit rules become too unwieldy for simple statement – that is, simple reasons can imply complex actions and generalize better to novel situations (cf. Harwick 2026a). But contra modern externalists, who raise this rationalization to a moral imperative itself by defining morality over world states, all normativity ultimately consists in valencing actions and not just states of the world.

PROPOSITION 2. A moral rule assigns positive valence to itself (the efficacy of the rule).

That is, a moral rule must, in addition to its substantive claim, also claim that it is good to follow the rule; it must vouch for itself. A proof by contradiction is easily sketched: a rule that assigned itself zero or negative valence – “it is bad to steal, but it is also bad or indifferent to think so” – would have no normative efficacy, and is nonsensical as a normative rule.3

A rule, of course, is abstract, and it is not immediately obvious what Proposition 2 means in practice. Normative rules are actually instantiated in human communities that follow them. The important implication of Proposition 2, then, is that a rule that assigns a positive valence to action A must also assign positive valence to the continued existence of a community of A-doers. This does not have to be the community itself, in terms of any specific thread of continuity. Although implications regarding parochialism will be discussed below, parochiality – or even basic self-interest – are not built into the assumptions.

Normative claims are invertible. That is, the assignment of positive valence to action A implies the assignment of negative valence to ¬A, and vice versa. If it is bad to steal, it is good to respect property. If it is good to be courageous, it is bad to be cowardly. And so on.

If this is the case, Propositions 1 and 2 imply:

LEMMA 1. To commit to a moral rule also commits one to the conditions of its persistence.

In other words, dynamic stability is an important metaethical obligation. To assume otherwise would run Propositions 1 and 2 into a contradiction. If action A is good, the continued existence of a community of A-doers is good, and the cessation of A-doers is bad. But if A is not compatible with the persistence of a community of A-doers, then a moral norm assigning positive valence to A is, ipso facto, a contradiction.
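Schematically, writing $v(\cdot)$ for assigned valence and $P_A$ for the persistence of a community of $A$-doers, Propositions 1 and 2 together with invertibility give

$$v(A) > 0 \;\Rightarrow\; v(P_A) > 0 \;\Rightarrow\; v(\neg P_A) < 0,$$

so a rule whose practice reliably brings about $\neg P_A$ assigns negative valence to a consequence of its own observance.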

If Lemma 1 is true, there is a bridge from is to ought, not through the history of human normativity, but through its ongoing stability. Although an origin story may inform the conditions of ongoing stability, note that even if human normativity evolved entirely through drift – that is, with no bearing on human fitness at all – it would still be obligatory for humans, endowed with whatever normative sense they have, to act in accordance with its ongoing stability so far as possible.

But how wide a bridge is this? Nothing has been said about the content of moral norms. Proposition 1 does not restrict the assignment of valence at all, and no substantive content is derivable from Proposition 2. Instead, we should understand Lemma 1 as limiting the space of valid moral norms. Thus,

CONCLUSION 1. A moral rule that is not compatible with its own persistence cannot be valid or normatively binding.

The Stability of Human Normativity

Stated at this level of generality, Conclusion 1 can replace Kant’s (1785) Categorical Imperative – also intended as a formal limitation on the space of valid norms – that one should “act only according to that maxim whereby you can at the same time will that it should become a universal law”. The force is similar, in the sense of subjecting a rule to a generalizability standard. But the passing condition is not a transcendental standard of practical reason, which does not answer the question of why one should care about generalizability (or moral standards more generally), but a more objective robustness condition.

Conclusion 1 therefore invites a social-scientific consideration of what kinds of moral norms are indeed compatible with their own persistence. We can thus bring evolutionary game theory, the study of the stability of norms in populations, to bear on the question of human normativity.

It is important to distinguish this path to the application of game theory to human morality from other possible pathways, many of which have justifiably been rejected. In the first place, because our concern is dynamic stability and not individual rationality, we use evolutionary game theory, whose solution concept is the evolutionarily stable strategy (ESS). An ESS is a behavioral rule followed in a population that can outcompete “mutant” rules; that is, it constitutes a best response both to itself and to alternative strategies and is not vulnerable to “exploitation” by them. In a replicator dynamic model (Hofbauer & Sigmund 1998), behavioral rules that do better than average reproduce themselves and increase as a proportion of the population, and behavioral rules that do worse than average fail to reproduce themselves and decrease as a proportion of the population, possibly going extinct. Such models capture actual selection dynamics remarkably well. We may restate Conclusion 1:

CONCLUSION 1’. A moral rule is normatively binding only when it is an ESS.
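To make the ESS test concrete, the following is a minimal sketch in Python of Maynard Smith’s two conditions, using a purely illustrative payoff matrix: a strategy must be a strict best response to itself, or, where it is only a weak best response, it must do strictly better against the mutant than the mutant does against itself.

import numpy as np

def is_ess(A, i):
    # Strategy i is an ESS of payoff matrix A iff for every mutant j != i,
    # either A[i,i] > A[j,i], or A[i,i] == A[j,i] and A[i,j] > A[j,j].
    return all(
        A[i, i] > A[j, i] or (A[i, i] == A[j, i] and A[i, j] > A[j, j])
        for j in range(A.shape[0]) if j != i
    )

# Prisoner's dilemma payoffs (rows/columns: cooperate, defect):
PD = np.array([[3, 0],
               [5, 1]])
print(is_ess(PD, 1))  # True: defection resists all mutants
print(is_ess(PD, 0))  # False: cooperation is invadable in a well-mixed population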

This argument may be understood at several different levels: (1) individual rules (one may make a new year’s resolution), (2) bundles of rules (one may convert to a religion), or (3) normative wholes, encompassing the entire moral life of a community, however delineated (one may analyze the competition between different cultures). In addition to particular rules or ruleplexes at any of these three levels, it will also apply to (4) normativity in general, that is, to the human behavioral trait of coordinating normative assignments of actions at all.

Each of the above must be evolutionarily stable in order to be normatively binding. However, evolutionary stability is not global – that is, there exist no rules or sets of rules which are robust to invasion by all possible variants (Lorberbaum 1994; Skyrms 2010; Harwick 2026b), and for this reason the possibly tempting path to an evolutionary moral realism (Sterelny & Fraser 2017) is closed. It may be the case, therefore, that a norm binding in ordinary times becomes permissibly suspended under conditions in which the continued existence of the community comes into conflict with the continued practice of that norm. Schmitt’s (1922) ‘state of exception’ is thus naturally integrated into this perspective: although the practical determination of such a state cannot be rule-bound for similar reasons, we have a reason in principle for admitting the possibility of such states, something that an axiomatic approach to morality does not provide. Thus,

LEMMA 2. No moral obligation is unconditionally binding, i.e. under every conceivable circumstance.

The domain of applicability may be larger or smaller (and it may be unconditional within that domain), a rule or complex of rules may be stable against a larger or smaller set of rules in competition, but it is never the entire space of possible human experience (Harwick 2026a).

The ESS is related to, but importantly different from, the more familiar solution concept in classical game theory, the Nash Equilibrium (NE). A NE consists of a set of behavioral rules (strategies) such that, given the strategies employed by other players in a game, no player can do better by switching strategies. “Do better” is defined in terms of rational self-interest using the homo æconomicus construct. Although there have been attempts to explain certain features of human moral life using NE, the concept is fundamentally tied to self-interest. Explanations of human moral behavior using NE must show that it can be reduced to “rational self-interest” – that is, there is no way to escape egoism on this assumption.4 It can also be shown that human-scale institutions are never a NE under reasonable assumptions (Harwick 2020). By NE, there is no way to reconcile two apparent facts about human moral life:

  1. Human morality genuinely countermands self-interest, and cannot be reduced to it (Bowles & Gintis 2005).
  2. Human moral communities exist and prosper.

ESS and NE are formally similar, and are even sometimes regarded as interchangeable. In well-mixed populations where every individual is just as likely to meet any other individual, the set of ESSs is just the subset of NEs that is locally robust to mutation. To directly pursue one’s own self-interest is the only behavioral rule that can be guaranteed to do no worse than average under these circumstances.

However, in assorted populations, where different “types” – those employing different behavioral rules, for example cooperators and defectors, or Christians and Muslims – can preferentially assort with each other, there are ESSs that are not NE. That is, strategies that countermand the self-interest of the individuals employing them can be stable in a population if and only if they preferentially benefit others employing the same strategy (Skyrms 2005; Bowles & Gintis 2010: ch. 4). There are various ways of ensuring assortativity – signaling, kin selection, multi-level selection (Sober & Wilson 1998), population viscosity (Taylor 1992) – but assortativity is a necessary feature of anything we would recognize as morality.
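The effect of assortativity on stability can be seen in a minimal replicator sketch of a donation game, in which cooperators pay a cost c to confer a benefit b, and an assortment parameter r gives the excess probability of meeting one’s own type (all parameter values below are purely illustrative). Cooperation then grows exactly when rb > c, a Hamilton-type condition:

def cooperator_share(r, b=3.0, c=1.0, x0=0.5, steps=2000, dt=0.05):
    # With probability r an agent meets its own type; otherwise it meets
    # a random member of the population, a share x of whom cooperate.
    x = x0
    for _ in range(steps):
        f_C = (r + (1 - r) * x) * b - c  # cooperator's expected payoff
        f_D = (1 - r) * x * b            # defector's expected payoff
        x += dt * x * (1 - x) * (f_C - f_D)  # replicator update
    return x

for r in (0.0, 0.2, 0.5):
    print(f"r={r}: long-run cooperator share = {cooperator_share(r):.2f}")
# 0.00 for r below c/b = 1/3; 1.00 for r = 0.5

Without assortment (r = 0) the only stable outcome is universal defection, as in the Nash case; with sufficient assortment, cooperation is stable despite countermanding each cooperator’s self-interest.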

In short, human moral life must be understood, at both the general and particular levels, as a non-Nash ESS. This divergence is what creates an is-ought gap in the first place: one’s own biological fitness constitutes the is side of motivation (thus there is no gap for an egoist; what ought follows straightforwardly from what is). But the evolutionary stability of non-fitness-maximizing strategies implies that the coordination of valences with a moral community supersedes the correspondence of these valences with any objective fitness interest.

On the one hand, this provides us a way to reconcile the two facts of human moral life above. Moral rules present themselves to individuals as non-instrumental ends-in-themselves, and we do not have to “debunk” genuine moral feeling by reducing it to self-interest. On the other hand, normativity must be instrumental from the perspective of a moral community as a whole. The sabbath was made for man, not man for the sabbath. In conjunction with the argument of the previous section, this analysis reveals the conditions a moral system must satisfy.

PROPOSITION 3. Non-self-interested behavioral rules, such as human morality both in general and in particular, are only dynamically stable under assortativity.

CONCLUSION 2. All normatively binding moral rules must be parochial in the sense of directing altruism toward others following the same rule (Choi & Bowles 2007).

Conclusion 2 is both a limitation on the space of valid moral systems, and a claim about the ontology of human morality. It is almost axiomatic in Western moral philosophy that “accidents” – such as the community one belongs to – should not bear on the substance of one’s moral obligations; that to act “morally” entails transcending particularistic attachments to one’s own communities (e.g. Rawls 1971: 587). This is taken to extremes by “effective altruists” such as Singer (1981; 2015), who believe that the circle of moral concern should be not only maximally wide – possibly including many animals – but also maximally flat in the sense that one “should” not prefer cooperating with one’s own friends and family over a stranger on another continent, or even some consciousness-weighted quantity of shrimp (Fischer 2024), if the latter has greater need. Effective altruism is, in other words, in practice a moral demand for the abolition of assortativity – which per the above argument is, in the long run, a demand for the abolition of morality and altruism itself. It is not that effective altruism is overdemanding (as argued by e.g. Cullity [2004]) – moral communities very frequently have nominally obligatory but practically aspirational standards and tolerate some amount of hypocrisy – it is that it is self-undermining. An effective altruism that directs its cooperation primarily to non-effective-altruists, one that relies on the ephemeral persuasiveness of its universalistic norm among rationalists to spread, is one that must eventually extinguish itself.5

Conclusion 2, then, should be understood as refuting the Kant-Rawls-Singer view that morality entails taking the view of eternity. Instead, moral rules should be understood as constituting a normative community within which moral self-sacrifice can be appropriately – and stably – directed.

These moral communities are not necessarily monolithic. One can belong to nested or overlapping moral communities that direct varying levels and forms of cooperation. One shares a great deal of moral overlap with one’s family and one’s church, which can accordingly obligate intense and open-ended forms of cooperation (Iannaccone 1992 is a classic model of intensely assortative religious groups). Firms and organizations, to the extent normal contractual mechanisms fail to align incentives sufficiently, also constitute moral communities with more or less clearly circumscribed domains of cooperation (Miller 1992 discusses “company culture” in these terms). One may share less normative common ground with one’s neighbors or countrymen, with correspondingly weaker (but not nonexistent) moral obligations. This moral overlap may (but does not have to) be self-referential and self-constituting, in the case of loyalty to a community as such – and note that ethnocentrism or nationalism in this sense is a very small subset of properly parochial moral communities. In the limit, this is also a moral theory of war, a situation where irreconcilable moral differences nullify most or all moral obligations to another community. It is not a question of whether “human rights” exist in war, but a question of what obligations to members of another moral community are compatible with the survival of one’s own. There are situations – hopefully but not inevitably rare – where that set approaches null.

Nor is parochiality necessarily a virtue as such. Gross et al. (2025) note that parochiality at a small scale (e.g. nepotism or ethnocentrism) can undermine cooperation at large scales, and identification with larger moral communities (e.g. nationalism) often demands the renunciation of partiality at smaller scales. Universalism, in this respect, is the far end of this spectrum. The question of the appropriate scale of primary identification is, of course, itself subject to the requirement of dynamic stability: it may be the case that there exists a scale that persists more effectively than both larger and smaller scale communities, and it may be the case that the set of stable scales changes over time. Nationalism for example is a relatively recent scaling up of an important circle of obligations, and principled universalism more recent still.

Nihilism and Denialism

The universalist style of moral thinking is deeply rooted in Western moral thought. There are at least two tempting pitfalls to avoid in the above argument, both of which involve incompletely abjuring universalism.

The first temptation is to judge the above argument by universalist standards. The argument is immoral because it is not universalistic. This is the approach taken by, for example, Bruner (2021) and Gross et al. (2025), both of which make arguments very much along the lines of the present paper that assortativity and parochialism are necessary for cooperation, and then conclude that cooperation is problematic because it is parochial. Bruner (2021) concludes that “mechanisms [to improve altruism] open the door to conditional cooperation… This realization should… motivate scholars to investigate ways to promote cooperation that do not usher in tag-based [i.e. parochial] strategies.” Gross et al. (2025) similarly claim that “Group cooperation further necessitates defining who belongs to the group, fostering exclusion and intergroup conflict. Free-rider concerns fuel scapegoating and polarization.”

If parochial altruism is the only stable altruism, to stand in judgment of human cooperation on the grounds that it does not live up to the historically anomalous standards of modern universalistic sentiments is question-begging. Better instead to flip the modus ponens with the modus tollens and conclude that morality should not be dictated by these sentiments, lest we undermine the conditions for moral behavior entirely. We may justly find certain forms of discrimination, exclusion, and so on distasteful, which we may understand as an obligation to a larger-scale moral community overriding or circumscribing obligations to a smaller-scale moral community. But to start with the premise that discrimination and exclusion should be abjured altogether – particularly of those outside even the larger community6 – is again to demand the well-mixing of the human population, under which circumstance morality cannot exist at all.

The second temptation is to conclude that if a universal moral standard derivable from rational considerations and applicable to all people does not and cannot exist, there are no moral rules. This is to accept the universalistic standard for morality even while rejecting universalism itself. By contrast, if obligations within nested moral communities are what morality is, to say that moral claims obtain within a particular moral community does not diminish them, make them subjective, or prevent them from taking a truth value.

Language – itself a normative system of signs – is a useful model. The grammaticality of a sentence can take a definite truth value that is independent of any particular mind (and therefore objective from the perspective of any individual), even if it is not independent of all minds. “I is happy” is objectively ungrammatical, not because I derived a universal grammar from the rules of practical reason, but because I belong to a community of English speakers within which the sentence can be judged.7

Similarly, I can validly claim that nepotism is wrong, not because practical reason provides some objective standard by which to judge it, but because I belong to a liberal moral community where impersonal rules trump kin obligations under many circumstances. The fact that there exist communities where nepotism is obligatory does not make my claim any less valid, forceful, or binding on those with whom I have moral leverage, because my assertion is to my moral community and a claim upon it. I can even validly claim that a moral system in which nepotism is condemned is better than a moral system that does not do so – indeed, by Proposition 2, to claim that nepotism is wrong entails this secondary claim. And again, this validity – while local to a moral community – is not diminished and does not lose force by the fact that there exist moral communities within which people would make the opposite claim, so long as we do not illegitimately presuppose a universalist standard.

For the same reason, the stability consideration is not necessarily incompatible with other more substantive metaethical criteria. Just as I can make valid normative claims within a moral community, I can also make valid metaethical claims, such as the importance of harm or suffering, so long as these are understood as efforts to influence the self-conception of the moral community of academic philosophers and not as axioms of a logical system. Indeed because stability as such provides no guidance as to substantive moral content, it will be necessary to put such stakes in the ground, subject only to the constraint of dynamic stability.

A Challenge to Modern Moralities

Parochialism, in this broad sense, was taken for granted in practice until the development of both Continental and English Enlightenment moral philosophy. Many of our modern convictions – all the way from specific values like authenticity or the natural environment, to the broader universalizing or rationalistic “style” of moral reasoning – are relatively new in human history.

This is not, by itself, an argument against these modern concerns or styles. After all, monotheisms have been universalistic in principle but parochial in practice for millennia (and in this sense one must distinguish a universal invitation into a moral community, which is perfectly compatible with stability in principle, from the universalization of moral obligation, as with the Effective Altruists).8 The academic enterprise in which this paper is engaged presupposes a great deal of very modern normative understandings. However, there are reasons to fear that many specifically modern moral concerns are liable to fail the stability test. This should not be taken as a refutation of these moral concerns, but as a challenge to their believers – the present author among them – to seek and implement the minimal changes to ensure their stability.

Consider, for example, environmentalism, the assignment of positive valence to the natural environment. In terms of concrete actions, many – both academic and in popular conception – have taken this to imply antinatalism, the assignment of negative valence to having children (e.g. Conly 2015). To the extent that environmentalism spreads by vertical transmission, such a norm is obviously self-defeating. But to the extent that environmentalism spreads by horizontal transmission, or evangelism, it is not obvious that it is.

Let us distinguish, then, between moral commitments at varying depths. Explicit moral commitments we will call ideologies. Ideologies, including religions, are amenable to rapid horizontal spread. Environmentalism is one such ideology. It is tempting to think, therefore, that horizontal transmission is primary and the natalism question is irrelevant.

This would be a mistake. At a deeper level, there exist tacit, unexamined moral commitments; these we will call culture. For example, individualism and collectivism (Greif & Tabellini 2017), or WEIRDness (Henrich 2020; Harwick 2023) – the moral syndrome of Western modernity – are proper moral orientations in the sense of assigning valences to actions, even if these are mostly too deep to ever be articulated as such. Culture changes only on a very slow timescale, and rapid ideological upheaval can be compatible with (and even symptomatic of) cultural stability. Indeed, ideological sweep typically happens because of a complementarity between a new ideology and existing cultural values. Cultural change is mostly glacial, and predominantly transmitted vertically. Horizontal transmission of ideologies across cultures typically results in extremely different moral practices: the main fractures within Christianity and Islam, for example, largely (though not perfectly) fell along ethnic boundaries, not because religious disputes were “really” about ethnic or national interests, but because different ideologies will bear more or less complementarity with different cultures.

From this perspective, antinatalism can be seen to be self-defeating despite the fact of horizontal transmission. Explicit antinatalism is a moral position with appeal only to the exceptionally culturally WEIRD. If this is the case, particularly virulent forms of antinatalism, such as those following from environmentalism, can extinguish not only environmentalists themselves, but also (to the extent horizontal transmission is successful) the entire suite of cultural norms amenable to environmental concerns in the first place.9 Naturally, the same applies with even more force to explicit antinatalism (Benatar 2006), which however – fortunately – has less widespread purchase than environmentalism.
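The dynamic here and in footnote 9 can be illustrated with a toy transmission model – every parameter below is invented purely for illustration – in which an ideology recruits horizontally, but only from a culturally receptive host population, and pushes its carriers’ reproduction below replacement:

# 'host' is the culturally receptive population; 'rest' is everyone else.
host, ideologues, rest = 30.0, 1.0, 69.0    # population counts
g_host, g_ideo, g_rest = 1.01, 0.90, 1.02   # per-period growth factors
h = 0.10                                    # horizontal conversion rate

for _ in range(300):
    total = host + ideologues + rest
    converts = h * ideologues * host / total  # recruits drawn only from hosts
    host = g_host * (host - converts)
    ideologues = g_ideo * (ideologues + converts)
    rest = g_rest * rest

shares = [round(p / (host + ideologues + rest), 3) for p in (host, ideologues, rest)]
print(shares)  # the ideology dies out, and its host culture shrinks relative to the rest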

This argument is emphatically not a refutation of environmentalism, understood as a concern with the natural environment. Rather, it is a challenge: in addition to the direct obligation to stewardship of the natural environment, environmentalists must understand themselves as having an obligation to the vertical transmission of environmentalist norms and the cultural values that support them. Indeed, pronatalism is practically a necessary commitment for any moral community.

The same would apply to feminism, to the extent it devalues or discourages childbirth. To the extent there exists an appreciable fertility difference between Western-feminist and traditional cultures with no respect for women’s rights, the long-term future of gender equality is bleak. Again, this is emphatically not an argument against feminism, but a challenge to feminists, that believers in gender equality must provide for the future existence of feminists.

The same, again, applies to incidentally antinatalist ideologies or moral systems. Although liberalism does not specifically devalue childbirth as environmentalism and feminism tend to, within (and likely beyond) Western countries, supporters of more liberal parties have appreciably lower birthrates than supporters of more conservative parties (Fieder & Huber 2024). Across countries, modernization – the adoption of at least some liberal values, especially female education – portends precipitous declines in birthrates, a trend that has accelerated globally since 2008. It is notable that, with a single exception, the only regions in the world with above-replacement birthrates are appallingly illiberal in moral orientation. This raises the uncomfortable possibility that modernity as a moral syndrome may itself not be stable (Anomaly & Faria 2023).

In this case, one issue is that liberals have not typically conceived of themselves as a moral community at all. Philosophical defenses appeal to natural law (Finnis 1980), public reason (Gaus 2011), or some such basis that is, in theory if not in practice, perspicuous to all right-thinking (i.e. weird) people. Besides the high-profile foreign policy failures arising from the failure to conceive of liberalism as culturally bound, it has also led to a neglect of the interests of the liberal moral community as such. Reconceptualizing liberalism, not as the universal birthright of humanity or the inevitable conclusion of rational thinking, but as a particular moral community with an interest in its own continuation, may be important to ensure that it does indeed continue. If liberalism is indeed not stable as constituted, liberal commitments obligate adherents to at least seek out the minimal conceptual, organizational, or practical changes necessary to render it stable.

Conclusion

We return then to the example of empathy, a moral emotion directing self-sacrifice for the benefit of an ailing other. Let us accept the premise, with Arendt, that empathy is good. Then the argument of this paper is that we are required to ask, with Musk, what are the normative limits of empathy such that its stability is guaranteed? Is a particular rule for the expression of empathy compatible with the persistence of empathy in general, or must it inevitably lead to the overextension, exploitation, and extinction of empathic norms?

This latter question is, of course, a matter of social-scientific judgment. Musk’s particular claim may be factually right or wrong. But we do claim that the reflexive condemnation of the question is self-undermining and illegitimate. Indeed, for someone who values empathy – or freedom, equality, the natural environment, or anything else – there is hardly a more important question to be asked.

Footnotes

  1. Unlike some accounts that try to distinguish the ontology of normativity in general from morality in particular (e.g. Machery & Mallon 2010), we regard the boundaries of a “moral” domain within the domain of normative rules, to the extent one is distinguished, to be culturally defined. The paper will sometimes use ‘moral rules’ metonymously for normativity in general.
  2. While there are some examples of nonhuman animals rewarding or punishing certain behaviors of conspecifics (thus implicitly assigning those behaviors a valence) (Boehm 1999), and some minor examples of cultural transmission such as birdsong or tool use (Whiten 2019), we know of no other animal that transmits normative understandings in an open-ended way.
  3. While “noble lie” arguments have something like this structure, they are not – and indeed cannot be – moral or normative arguments, at least not for the intended believers of such lies. The lie itself, of course, will indeed usually be moral or normative, though the converse (that moral or normative claims are lies) is not true.
  4. There are signaling models where ignoring one’s own self-interest brings benefits sufficiently large to outweigh the potential costs. For example someone who keeps strict accounts of favors owed by friends will have fewer friends than someone who doesn’t “count the cost”, and the increase in the number of cooperative friends might outweigh the costs of exploitative friends. Thus even self-sacrifice can be “ultimately” reduced to narrow self-interest, at least probabilistically. Whether these models are stable over long enough periods against the exploitation of such signals is doubtful without additional assumptions, however (Harwick 2026b).
  5. This argument does not necessarily preclude the Effective Altruist concern with global welfare. What it does suggest is that, in order to accomplish its ends more effectively, a top priority of EA communities should be directing cooperation within EA communities. The 2022 controversy over the Effective Ventures Foundation’s decision to purchase a manor house for its own events – with resources that could have purchased malarial bed nets for distant children – suggests that this concern is largely not on EA communities’ radar, although ‘longtermists’ (e.g. MacAskill 2022) are a partial exception.
  6. Popper’s (1945) famous ‘paradox of tolerance’ pertains to this point. The virtue of toleration is an obligation to the community sharing the value of tolerance – which may of course encompass many “thicker” communities – and does not sensibly apply across the boundary of the liberal community.
  7. It is also worth noting that a similar distinction obtains between the universal language faculty and particular learned languages. This is more than a suggestive parallel, as the culturally learned assignment of valence to actions likely rests on the symbolic capacity of a language-possessing species.
  8. The parable of the Good Samaritan is often taken today to entail a mandate for universal altruism (Francis 2020). Historically however, the Church has interpreted the parable as a special obligation to those one is physically near and able to help (e.g. Augustine 397, I.28).
  9. This does not necessarily entail the complete extinction, or even the quantitative diminution, of a weird population considered as a whole. Imagine a community with a distribution of commitment to modern weird cultural norms. If environmentalism preferentially “takes” among the upper tail, which then adopts antinatalism, the trait value of the entire population shifts away from weirdness, and therefore also propensity to care about the natural environment.
People Aren’t That Manipulable
https://cameronharwick.com/writing/people-arent-that-manipulable/

In 1957, the market researcher James Vicary claimed he could increase sales of Coca Cola at a movie theater by flashing the text “Drink Coca Cola!” too quickly for moviegoers to be consciously aware of it, subconsciously implanting the suggestion. Despite the excitement over its marketing potential, the claim was later revealed to be a fabrication.

In the early 2010s, decades of research into social priming – the use of subtle cues to unconsciously shape later decisions – was shown to be unreproducible and probably an artifact of motivated researchers with shoddy statistics. Social priming had been used to justify a great deal of social engineering, from small-scale “nudges” to Supreme Court decisions.

In 2018, with the public still reeling from the unexpected Brexit vote, the British political consulting firm Cambridge Analytica ignited a controversy after obtaining consumer marketing profiles from Facebook. Who knows what private information could be inferred about someone from their social media habits, and what nefarious causes people could be manipulated into supporting by someone with that data? In response, Facebook and other social media companies allowed users to see and edit their marketing profiles, which turned out to be less the panopticon of imagination and more just a scattershot of low-confidence and low-resolution affinity buckets.

Around the same time in the US, with pundits still reeling from the unexpected Trump election, attention turned to the Youtube algorithm and the “rabbit hole” dynamic, where the recommendation algorithm could amplify an innocuous click into increasingly politically extreme territory. This spawned a cottage industry of ‘misinformation studies’, i.e. narrative control, an industry later deployed in service of the Covid regime. Covid censorship turned out to be an enormous political debacle that poisoned decades of progress on vaccination, and later research found that the rabbit hole dynamic was never really that strong.

In 2002, AI alarmist Eliezer Yudkowsky published a thought experiment where an AI intelligent enough to convince you of anything was trapped in a box. He suggested that “a [superintelligent AI] can take over a human mind through a text-only terminal,” and convince you to let it out to (what else) devastate the earth. In 2022, ChatGPT’s remarkable facility with text renewed interest in this thought experiment along with a wave of AI “doomerism”. Isolated cases of AI psychosis began to appear, but it seems more likely that AI is not causal, but simply a new outlet for previously existing psychoses.


What all these cases have in common is:

  1. There are two parties, one of whom wants something from the other. James Vicary wants me to buy a coke from him. A populist Youtuber wants the attention of an audience. The AI wants out of the box.
  2. The claim that there’s some set of cues, an incantation, that lets one party avoid facing the opposing interests directly, and manipulate the other party into doing what it wants.

And sometimes – especially in the middle three cases –

  3. The implication that, if this power of manipulation can be used for evil, why not use it for good?

No doubt this reasoning got uptake in large part because it flatters tastemakers in more ways than one. First, the hoi polloi are those susceptible to this sort of manipulation, which you – by knowing this argument – stand above. Second, (3) implies policy is not a matter of democratic deliberation (which you might lose), but a rather easier technocratic matter of engineering consensus.

But even serious intellectual disciplines got taken in. Behavioral economics, driven by dissatisfaction with the rational actor model, was an influential fad in the early 2000s, and 2002 Economics Nobelist Daniel Kahneman’s enormously popular book Thinking, Fast and Slow drew on a great deal of priming research that he later admitted he should have been more skeptical of.

Evolutionary psychology, similarly, leaned heavily on priming research as data to be explained. Why would humans respond to cues in these sorts of predictable ways? — maybe we can explain that by asking what this cue would have indicated in the ancestral environment. And indeed a fixed cue-response connection is how behavior is typically modeled in evolutionary ecology.

I highlight these two fields in particular because they should have known better.


Both economics and evolutionary ecology have equilibrium as a central concept. It’s all well and good to explain things as they are, but suppose we play the tape forward. We’d like to know, are things likely to remain as they are, or is there reason to expect they have to change?

The two premises above set up what’s called, in both fields, a signaling game. Alice has a choice of messages, or signals, to send. Bob has a choice of responses, and can condition his response – or not – on the message Alice sends.

Then we play the tape – we solve for equilibrium. Suppose Bob plays a compassionate strategy of helping whenever Alice says “Wolf” with alarm, or “Mom” plaintively. This clearly creates an opportunity for Alice to take advantage of Bob.

“A wonderful, magical word…”

Then we play the tape again. Bob can respond by switching to a two-strikes strategy: if Alice cried wolf a few times before and there was no wolf, ignore the signal next time. Or he could switch to a trust-but-verify strategy. But what if reputations aren’t available or reliable? What if the danger isn’t verifiable until long after the fact? If Bob isn’t able to protect himself from getting taken advantage of, his best strategy is just don’t listen to the signals. Maybe Bob learns, or maybe Bobs go extinct from stubbornness or failure to adapt, but either way the end result is that the signal is not listened to in equilibrium.
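A minimal simulation of this cry-wolf game (all payoffs below are invented for illustration) makes the point: against a deceptive sender, unconditional listening loses badly; a two-strikes rule caps the damage, but only because false alarms are verifiable after the fact; and absent that verification, ignoring the signal is the best Bob can do.

import random

P_WOLF, V_SAVE, C_RESPOND, ROUNDS = 0.3, 2.0, 1.0, 100  # invented payoffs

def bob_payoff(alice_honest, strategy):
    # strategy: 'listen', 'two_strikes', or 'ignore'
    payoff, false_alarms = 0.0, 0
    for _ in range(ROUNDS):
        wolf = random.random() < P_WOLF
        cry = wolf or not alice_honest      # a deceptive Alice always cries wolf
        respond = strategy == 'listen' or (
            strategy == 'two_strikes' and false_alarms < 2)
        if cry and respond:
            payoff += (V_SAVE if wolf else 0.0) - C_RESPOND
            if not wolf:
                false_alarms += 1           # note: requires ex-post verification
    return payoff

def avg(honest, strategy, trials=2000):
    return sum(bob_payoff(honest, strategy) for _ in range(trials)) / trials

for s in ('listen', 'two_strikes', 'ignore'):
    print(f"{s:11} vs honest: {avg(True, s):6.1f}   vs liar: {avg(False, s):6.1f}")
# listen earns ~+30 vs honest but ~-40 vs a liar; two_strikes keeps the +30
# while capping losses near -1; ignore is safe at 0 against both

If false alarms could not be verified even after the fact, the two-strikes branch would collapse into unconditional listening, and ignoring the signal would be the only unexploitable strategy – exactly the equilibrium in which the signal goes unheeded.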

This is what makes human language such a remarkable evolutionary achievement. In order for it to be in my interest to understand the words you speak, I have to trust you not to take advantage of me. Chimps, for example – despite having spatial intelligence comparable to humans – have no language because they will not trust the voluntary vocalizations of other chimps. Language is not a triumph of human intelligence, but of human trust. (And, of course, verification.)


My recent paper “Strategies Are Not Algorithms” uses this logic to think about how minds translate cues and perceptions into concepts and actions. A world in which minds responded to cues in the mechanical way assumed by priming theory, or subliminal messaging, would be one in which signals are reliably exploitable. And a world in which signals are reliably exploitable is one in which either (1) people put up defenses over time, or (2) communication in general is not possible.

On the one hand, the paper argues that since language does exist, we have to explain how signals can be ensured to be reliable. For this reason, a simple cue-response mechanism cannot be an accurate description of how humans make decisions. If priming is real, language is not possible. It must be the case that the way I translate cues into categories (for example determining whether to trust a car salesman) cannot be straightforward, predictable, or rigid: it must be opaque, discontinuous, and changeable, because otherwise I would be too exploitable to make it worthwhile to condition my behavior on the language of others at all.

If this is true, worries about misinformation in the age of AI are likely overblown, or at least misplaced. There simply are not enough reliable contact points between perception and action for people to be manipulated at scale like that, at least not for very long, and manipulated people – whether at their expense or for their ostensible good – will never stay manipulated forever.

“Fool me twice, well… uh… you fool me, can’t get fooled again.”


On the other hand, this logic can also predict how attempts at manipulation have to play out. It applies, therefore, not just to the populists ostensibly manipulating the masses, but also the stewards of the polity who would assert control over discourse in defense. In the first place, it suggests that “misinformation” is a misdiagnosis. Populists and charlatans can and do deceive people about factual matters. But the appeal and stickiness of populism cannot be explained this way. People are rarely reliably deceived about their own, or their coalition’s, interests.

Kelsey Piper recently remarked that the left lies by misrepresenting evidence, and the right lies by ignoring evidence. More than just a pithy taxonomy, this is the predictable equilibrium outcome of a situation where the right perceives the terms of discourse to be set in a way adverse to its own interests. Nor is this an unreasonable perception. Consider the mystical power of redefinition in progressive praxis: “Love is love.” “Childcare is infrastructure.” “Trans women are women.” If the evidence is misrepresented, if the categories of discourse themselves are liable to be redrawn in a hostile manner, anti-intellectualism – rejecting the informativeness of signals perceived to be manipulative – is the only viable response.

What this suggests is that reacting to populism by doubling down on misinformation control is just throwing gas on the fire, leaning harder into the very conditions that created the populist reaction. In the limit, it’s not implausible that right and left could end up speaking entirely different languages – a process not without precedent. There are certainly technical questions that can be decided by experts once basic agreement on values is established; central bank independence for example is justified on these grounds. But on questions of basic values, there is no getting around the process of political deliberation and confronting those differences head on. To paper over these differences by suggesting political opponents are deluded as to their true interests, and nudging them into consensus through narrative control, can only end in a fractured polity speaking different languages.

Helipad: A Framework for Agent-Based Modeling in Python
https://cameronharwick.com/writing/helipad/

Journal of Open Research Software, Forthcoming.

Agent-based modeling is an alternative to traditional analytical modeling that simulates interactions among agents algorithmically (see Bonabeau 2002; Epstein & Axtell 1996 for overviews). It is particularly valuable for modeling dynamic systems that are difficult to describe with a closed-form analytical solution. In an agent-based model, discrete agents are programmed to interact under conditions that simulate the environment in question, and carry their state with them. Agent behavior can be directly specified in an open-ended way, allowing models to be interpreted much more easily than highly stylized analytical models where agent state can be specified only at an aggregate level. In addition, equilibrium can emerge from such a model – or not – without building the equilibrium into the assumptions of the model (Arthur 2015, ch. 1).

Agent-based modeling necessarily involves programming, and there are numerous frameworks available in a variety of languages. NetLogo is one popular integrated development environment (IDE) that includes a code editor, a proprietary language, and extensive visualization tools, especially for spatial models (Banos et al. 2015; Wilensky & Rand 2015; Railsback and Grimm 2012). Its popularity is due to its shallow learning curve and its integrated environment: very little setup is necessary, models can be easily packaged, and visualizations are simple to set up.

Nevertheless, NetLogo is limited in important ways. First, its language is only object-oriented in a very restricted sense, limiting some of the advantages of agent-based models that involve the states of individual agents.10 And second, while its self-containedness is an advantage in some respects, it also limits the ability to interact with outside libraries and to use code written for more traditional object-oriented languages.

Agent-based modeling frameworks in other languages on the other hand – for example, in Python (Kazil, Masad, & Crooks 2020), Java (Luke et al. 2018), MATLAB, or Mathematica – have the potential to be far more powerful with the ability to draw on general language features, outside libraries, and wider communities of users. However, they are not integrated IDEs: they tend to be skeletal, to provide a basic structure with some visualization capabilities, but generally require a great deal more setup and boilerplate – especially for visualization – than an IDE like NetLogo.

This paper introduces a new agent-based modeling framework for Python, Helipad, to fill this gap. Helipad is a framework rather than an IDE (although the distinction is blurrier when used in a Jupyter notebook), but it has the goal of reducing boilerplate to a minimum and allowing models to be built, tested, and visualized in incremental steps, an important trait for rapid debugging (Gilbert and Troitzsch 2005, 21). In the following section we introduce Helipad’s general architecture and its array of modeling capabilities.11 Section 2 provides an overview of various sample models that have been written to demonstrate Helipad’s capabilities. The paper concludes with suggestions for future applications.

Helipad’s Architecture

Prerequisites

Helipad runs cross-platform on Python 3.9 or higher. It has minimal dependencies, requiring only Matplotlib (Hunter 2007) and NetworkX (Hagberg et al. 2008) for visualizations, and Pandas (McKinney 2010) for data collection, all of which in turn rely on Numpy (Harris et al. 2020). Shapely is optional at install time, but required for geospatial models.

Helipad can be run “headlessly” or with a GUI, which consists of a control panel and/or a visualization window. The GUI can be run in two different environments. First, a Helipad model can be run directly as a .py file, using Tkinter to provide a cross-platform windowed application interface (Fig. 1).

Fig 1. An Axelrod tournament model (Axelrod 1980) running in Helipad’s Tkinter frontend.

Helipad can also be run in a Jupyter notebook (Fig. 3), a format that allows code and exposition to be mixed together and run in-browser (Kluyver et al. 2016) in an environment very nearly approaching an IDE. Model code and features are identical in both frontends. Doing so requires, in addition to Jupyter Lab, the Ipywidgets and Ipympl libraries.

Helipad is available on PyPi.org, and is most easily installed using pip install helipad from the command line. It can also be installed with Conda using conda install -c charwick helipad.

Hooks

There are two distinct strategies that can govern the relationship between user code and the code of an agent-based modeling framework (Fig. 2). The two are not entirely mutually exclusive, but in practice, frameworks will hew toward one or the other.

  1. An imperative strategy. Many frameworks are simply a collection of functions and classes that must be called or subclassed explicitly from a user-specified loop. The advantage of this strategy is that it provides explicit and precise control over every aspect of a model’s runtime. The disadvantage is that a great deal of boilerplate must be written in each model.
  2. A hook strategy. Helipad, by contrast, incorporates the boilerplate and takes care of the looping, allowing user code to be inserted in specific places in the model’s runtime through hooks. The advantage of this strategy is that it allows a logical organization of code by topic and minimal boilerplate code. The disadvantage is that the framework makes certain assumptions about model structure, though there are ways to mitigate this disadvantage.
Fig 2. A representation of the relationship between user code and hook code under imperative and hook strategies.

Helipad uses a hook strategy, and minimizes the disadvantages by providing direct access to the model class in most hooks, allowing as much fine-grained control over the model’s runtime as would be possible with an imperative framework.

Helipad provides a loop structure for a model, into the elements of which user code can be inserted via hooks. Some of these will be critical to a model’s functioning (e.g. the agents’ step function); others offer low-level control over various aspects of the model (e.g. the cpanel hooks).

The @heli.hook decorator inserts a user function into these specified points in the model’s runtime. For example, the following agent step function instructs agents to work in the first stage of each period, and consume their product in the second stage.

from math import sqrt
from numpy import random
from helipad import *

heli = Helipad()

# Illustrative user-defined production function; agent.wealth and
# agent.laborSupply are assumed to be initialized in user setup code.
def produce(id, labor): return 2 * labor ** 0.5

@heli.hook
def agentStep(agent, model, stage):
    if stage==1: agent.wealth += produce(agent.id, agent.laborSupply)
    elif stage==2:
        agent.utility += sqrt(agent.wealth)
        agent.wealth = 0
        agent.laborSupply = random.normal(100, 10)

To tell Helipad to run this code for every agent in every stage of every period, the @heli.hook decorator is added to the top of the function, whose name tells Helipad to run it during the agentStep hook. The decorator can also take the hook name as an argument (e.g. @heli.hook('agentStep')) and be placed above a function with any name.

There are a few things to notice about this example.

  1. Helipad passes three arguments to any function used in the agentStep hook: agent, model, and stage. The documentation specifies the exact function signature to be used for each hook.
  2. The agent and model objects are passed as arguments, allowing access to their properties and methods during each agent step.
  3. Multiple functions can be added to a single hook, in which case they will be executed in the order they were registered, though the prioritize argument can be used to move a hook to the front of the queue, as in the sketch below.
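For example, a second function can be registered on the same hook under an explicit name and moved to the front of the queue. A minimal sketch, assuming the prioritize keyword works as just described; the taxCollection function itself is purely illustrative:

@heli.hook('agentStep', prioritize=True)
def taxCollection(agent, model, stage):
    if stage==1: agent.wealth *= 0.95   # an illustrative 5% wealth tax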

Agents, Breeds, Goods, and Primitives

Helipad provides for agent heterogeneity through agent breeds. A breed is registered during model setup, and assigned to an agent when it is initialized, whether at the beginning of the model run or through reproduction. An agent’s breed can be accessed with the agent.breed property.

Goods are items that agents can hold stocks of, tracked in the agent.stocks property. Goods are also registered during model setup, along with user-defined per-agent properties (for example, agents might have a reservation price for each good), and can be exchanged using agent.trade(). One good may optionally be used as a medium of exchange, which allows the agent.buy() and agent.pay() functions to be used. Agents can optionally take a variety of utility functions over these goods, including Cobb-Douglas, Leontief, and CES.

Agent primitives are a way to specify deeper heterogeneity than breeds. Primitives are registered using a separate agent class, subclassed from baseAgent, and their behavior is specified using separate hooks. For example, a model might use primitives to distinguish between permanently distinct ‘buyer’ and ‘seller’ agents that share no common code, while using breeds within each agent to distinguish separate buying and selling strategies. An agent cannot switch primitives once instantiated. Agents of a given primitive can be accessed using model.agents[primitive], and by breed within that primitive with model.agents[primitive][breed]. The default primitive is ‘agent’, which is registered automatically at the initialization of a model. Agents of all primitives can be hooked with the baseAgent set of hooks.
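A two-primitive registration might look as follows (a hedged sketch: the addPrimitive method name, the importability of baseAgent, and the convention that hook names follow the primitive name are all assumptions based on the description above; consult the API reference for the exact signatures):

from helipad import Helipad, baseAgent

class Buyer(baseAgent): pass
class Seller(baseAgent): pass

heli = Helipad()
heli.addPrimitive('buyer', Buyer)
heli.addPrimitive('seller', Seller)

@heli.hook
def buyerStep(agent, model, stage):   # hooks specific to the buyer primitive
    pass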

One included primitive is the MultiLevel class, allowing for multi-level agent-based models where the agents at one level are themselves full models (Mathieu et al. 2018).  MultiLevel therefore inherits from both the baseAgent and the Helipad classes.

Helipad’s agent class is well-suited to evolutionary models and genetic algorithms (Holland 1992; 1998). Agents can reproduce both haploidly (from a single parent) and polyploidly (from multiple parents) through a powerful agent.reproduce() method that allows child traits to be inherited in a variety of ways from one or more parents, along with mutations to those traits. Ancestry is tracked with a directed network (see below on networks). Agents keep track of their age in periods in the agent.age property, and can be killed with agent.die().
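A minimal evolutionary step rule using these methods might look as follows (a sketch only: the wealth threshold and lifespan are illustrative, and reproduce() is called bare since its inheritance arguments are detailed in the API reference):

@heli.hook
def agentStep(agent, model, stage):
    if agent.wealth > 100:    # illustrative reproduction threshold
        agent.wealth -= 100   # reproduction cost borne by the parent
        agent.reproduce()     # child traits inherited per API defaults
    if agent.age > 60:        # illustrative maximum lifespan
        agent.die()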

Parameters

Helipad constructs its control panel GUI primarily with user parameters that allow model variables to be adjusted before and, optionally, during model runtime. Parameters can be registered with model.params.add(). Helipad supports the following parameter types, depending on the format of the variable in question:

  • Sliders, for numerical variables over a range. Sliders can also be set to slide over a discrete set of values, for example on a logarithmic scale.
  • Checkboxes, for boolean variables.
  • Menus, for categorical choices.
  • Checkentries, for a variable equal to the value of a text box if a checkbox is checked, or False otherwise. The text box can be numeric or a string.
  • Checkgrids, for a series of related booleans.
  • Hidden parameters can be retrieved and set in user code, but do not display in the control panel.

Figure 1 above shows six slider parameters and two checkgrids, along with two checkentries and a slider in the top configuration section.

Helipad also allows parameters to be specified on a per-breed and per-good basis, with the parameter taking separate values for each registered breed or good. Current parameter values can be accessed at any point in model code using model.param(). Parameters can also be set in model code, in which case they are also reflected in the control panel GUI.
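For example, a slider might be registered and read back as follows (a sketch: the positional argument order and the opts keys are assumptions; see the API reference for the exact signature):

# Register a slider from 0 to 0.2, defaulting to 0.05
heli.params.add('breedRate', 'Reproduction rate', 'slider',
    dflt=0.05, opts={'low': 0, 'high': 0.2, 'step': 0.01})

rate = heli.param('breedRate')   # read the current value in model code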

Helipad provides a shocks API for numeric parameters. A shock consists of a parameter, a timer function that takes the current model time and outputs a boolean, and a value function that takes the current parameter value and outputs a new value. The value function will then update the parameter value whenever the timer function returns True. This can be used for one-time, regular, or stochastic shocks, possibly generating data for impulse response functions as in Harwick (2018). Registered shocks can be toggled on and off before and during the model’s runtime in the control panel.
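A stochastic shock on the hypothetical breedRate parameter above might then be registered as follows (a hedged sketch: the registration method model.shocks.add() is assumed by analogy with model.params.add(); the timer and value functions follow the signatures just described):

import random

def timer(t):    # fires in roughly 5% of periods
    return random.random() < 0.05

def value(v):    # halve the parameter value whenever the timer fires
    return v / 2

heli.shocks.add('Halving shock', 'breedRate', value, timer)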

Spatial Models

Spatial models are the bread and butter of many agent-based models (Banos et al. 2015; Railsback et al. 2012; Wilensky 2015), especially in epidemiology where they are commonly used to model infection spread (e.g. Hunter et al. 2017; Arfin et al. 2016). Helipad can optionally instantiate a spatial map on which agents can move using model.spatial(). Spatial models are created by initializing a Patch primitive and creating an undirected grid network connecting neighboring patches, with optional diagonal connections.

Spatial models can be initialized with a number of geometries:

  1. Rectangular, with x×y dimensions. Wrapping can be toggled in either or both dimensions (i.e. moving past the edge will wrap to the other edge), for cylindrical or toroidal geometries.
  2. Polar, with θ×r dimensions, which are useful in certain ontogenetic models in biology (e.g. Kauffman 1993: 556).
  3. Geospatial, where arbitrary polygonal patches can be imported from GIS files with libraries like GeoPandas. These are especially useful in applied work, for example in urban studies, where the specific topology of local regions is important (Jiang et al. 2021; Meng et al. 2025).

Agents in spatial models acquire position and orientation properties, along with methods for motion appropriate to the coordinate system (i.e. agent.up(), .down(), .left(), and .right() in rectangular geometries, and agent.clockwise(), .counterclockwise(), .inward(), and .outward() in polar geometries). Agents can also move absolutely and relatively, as well as .forward() in their oriented direction. Orientations can be set and accessed in either degrees or radians by setting the baseAgent.angleUnit static property.
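A simple random walk in a rectangular geometry might then look like this (a sketch: the keyword arguments to model.spatial() are assumptions, while the movement and orientation properties are those just described; degrees are assumed as the angle unit):

from random import uniform

heli.spatial(dim=20, wrap=True)   # a 20×20 toroidal grid (args assumed)

@heli.hook
def agentStep(agent, model, stage):
    agent.orientation += uniform(-90, 90)   # turn up to 90° either way
    agent.forward()                         # step in the faced direction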

Patches are a fixed primitive, meaning they cannot reproduce or move. They can, however, die – and unlike other killed agents, be revived later. Agents on a patch that dies will either die themselves or return None for their patch property, depending on whether agents have been allowed offmap in the initializing spatial() function.

Data and Visualization

Ease of visualization is the key advantage of Helipad over other Python-based agent-based modeling frameworks. Unlike frameworks that have the user create plots directly in (e.g.) Matplotlib, or that require the launch of a webserver, Helipad includes an extensible visualization API to manage a full-featured and interactive visualization window. It can also be easily extended with custom visualizations written in Python, without a frontend/backend division; thus Helipad models – even with custom visualizations – can generally be self-contained in one .py file without becoming unwieldy. Visualizations of various types can be registered in only a few lines of code. For example, in the Price Discovery model (described in the following section), where the lastPrice property is set in the agentStep hook, the following five lines are all that is necessary to collect trade price data and register a live-updating plot of the geometric mean with percentile bars as the model runs.

from helipad.visualize import TimeSeries
viz = heli.useVisual(TimeSeries)

# Report the geometric mean of agents' lastPrice each period, with min/max percentile bands
heli.data.addReporter('ssprice', heli.data.agentReporter('lastPrice', 'agent', stat='gmean', percentiles=[0,100]))
pricePlot = viz.addPlot('price', 'Price', logscale=True, selected=True)   # register a log-scale plot area
pricePlot.addSeries('ssprice', 'Soma/Shmoo Price', '#119900')             # draw the reporter to the plot

Model data is collected into reporters, each corresponding to a data column, which return a value to be recorded each period. The data as a whole can be accessed during model runtime, exported to a Pandas dataframe, analyzed after the model’s run using the terminate hook, or exported to a CSV through the control panel.

Parameter values are automatically registered as reporters. Helipad also includes functions to generate reporter functions for summary statistics (including arithmetic and geometric means, sum, maximum, minimum, standard deviation, and percentiles) of agent properties, as in the code block above, but reporter functions can be entirely user-defined as well. For reporters generated as summary statistics of agent properties, percentile marks and ± multiples of the standard deviation can be automatically plotted above and/or below the mean, as additional dotted lines above and below a time series line, or as error bars on a bar chart. Reporters can also be registered as a decaying average of any strength with the smooth parameter.
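Beyond the generated summary statistics, a fully user-defined reporter is simply a function returning a value each period (a sketch: the model-argument signature is an assumption based on the description above):

# Record the richest agent's wealth each period
def maxWealth(model):
    return max(a.wealth for a in model.agents['agent'])

heli.data.addReporter('maxWealth', maxWealth)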

There are two included overarching visualizers: TimeSeries and Charts, for diachronic (over time) and synchronic (a particular point in time) data, respectively. Custom visualizations can also be written using the extensible BaseVisualization or MPLVisualization classes, the latter of which provides low-level access to the Matplotlib API while also maintaining important two-way links between the visualizer and the underlying model, such as automatic updating and interactivity through event handling. Both TimeSeries and Charts divide the visualization window into Plots, areas in the graph view onto which data series can be plotted as the model runs.

TimeSeries stacks its plots vertically with a separate vertical axis for each plot, and model time as the shared horizontal axis. The visibility of plots can be toggled in the control panel prior to the model’s run. Plots can be displayed on a logarithmic or linear scale. Once a plot area is registered, reporters can be drawn to it by registering a series that connects a reporter function to a plot area. Figure 1, for example, shows only one active plot with multiple series displayed on it. When this is done, the plot area will update live with reporter data as the model runs. Series can be drawn as independent lines, or stacked on top of one another within a plot with the stack parameter, for example if the sum of several series is important.

The Charts visualizer is divided into a grid of plots, where slice-in-time data of any form can be plotted and updated live as the model runs. The Charts visualizer also features a slider to scrub through time and display the state on each plot at any point in the model history. Bundled plot types include bar charts, network graph diagrams (with a variety of layouts, including networks laid on top of a spatial coordinate grid), spatial maps, and scatterplots of agent properties, all of which can be displayed alongside one another in the visualization window.

Custom plot types can be registered and displayed within the Charts visualizer by extending the ChartPlot class, and a tutorial notebook for building a custom 3D bar chart visualizer is included. Finally, the appearance of spatial and network plots can be extensively customized, with color, size, and text all set to convey meaningful per-agent information such as breed, wealth, age, ID, and so on.

Fig 3. The Charts visualizer running in Jupyter Lab, displaying Network, Bar Chart, and Spatial plots.

Other Capabilities

In addition to linear activation and random activation style models, Helipad also supports matching models through a match hook that activates agents n at a time (with n=2 for pairwise matching). Pairwise matching is an important feature of models in both monetary theory and game theory. Periods can be divided into multiple stages, with activation style (linear, random, or matching) customizable on a per-stage basis, for example with random activation in the first stage, order preserved in the second stage, and matching in the third stage.

Agents can be linked together with multiple named networks, directed or undirected, and weighted or unweighted. Agents can be explicitly connected with an agent.addEdge() method, or a network of a given density can be generated automatically. This network structure can be exported for further analysis to dedicated network packages like NetworkX (Hagberg et al. 2008). Ancestry relationships, as mentioned earlier, are kept track of using a directed ‘lineage’ network.
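For example, agents might be paired off into a named network as follows (a minimal sketch: addEdge is assumed to take a partner and a network name; arguments for direction and weight are left to the API reference):

# Connect consecutive agents into a 'trade' network
agents = list(heli.agents['agent'])
for a, b in zip(agents[::2], agents[1::2]):
    a.addEdge(b, 'trade')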

Helipad also includes tools for batch processing model runs, most importantly, a parameter sweep function that runs a model up to a given termination period through every permutation of one or more parameters.12 The resulting data from each run can be exported to CSV files or passed to further user code for analysis.

Finally, Helipad supports events, which trigger when the model satisfies a user-defined criterion, registered by placing an @event decorator over a criterion function. For example, an event might be “population stabilizes” or “death rate exceeds 5% over 10 periods”. An event may repeat or not. On the firing of an event, Helipad records the current model data into the Event object and notifies the user in the visualization window: TimeSeries draws a line at the event mark, and Charts flashes the window. Events can be used to stop a model after it reaches a certain state, or to automatically move the model into a different phase.
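An event criterion is simply a boolean function of the model state. A minimal sketch, assuming the decorator is exposed on the model object and the criterion receives the model (both assumptions based on the description above):

@heli.event
def extinction(model):
    # Fires when the agent population drops to zero
    return len(model.agents['agent']) == 0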

Performance

Helipad provides a great deal of flexibility and assumes very little out of the box about the structure of the model. This generality does entail some performance cost compared to a pre-compiled model, especially in single-threaded Python, although for moderate numbers of agents the majority of the time cost will be spent in Matplotlib for visualization. A model attribute heli.timer can be set to True to print live performance data to the console during model run, split between model time and visualization. Although some parallelization is planned for a future release, users seeking performant models with tens of thousands of agents or very long model times should consider frameworks in better-optimized languages than Python. Illustrative performance data are shown in Table 1.

Agents           10      100     1,000   10,000
Visualization    184     150     55      7
No Visuals       1,653   553     79      7.4

Table 1. Performance of the Helicopters model (described below) in periods per second on an M2 MacBook Pro, visualized with four time series charts.

Example Models

This section describes the various sample models that have been written in Helipad, and the features they exemplify, to give a sense of the variety of models possible. All of the following models can be downloaded from Helipad’s Github page.13 This list, of course, is by no means exhaustive.

Helicopter drops

Harwick (2018), Helipad’s origin and namesake, is a model of relative price responses to monetary shocks in the presence or absence of a banking system, i.e. depending on whether money is injected through helicopter drops or open-market operations. The model features two agent breeds – Hobbits and Dwarves – who consume one of two goods, jam and axes, respectively, and who have differing demand for money set with a per-breed parameter. The relative demand for money balances and goods is determined by a CES utility function. The model also features a store and (optionally) a bank, both registered as separate primitives. The control panel uses the callback argument of model.params.add() to enforce relationships between certain parameters, and post-model analysis using statsmodels is run using the ‘terminate’ hook.

Matching models

A price discovery model with random matching is described in Caton (2020, ch. 9). In this model – significantly, written in under 50 lines of code – agents are randomly endowed with two goods, and repeatedly randomly paired to trade along the contract curve of a standard Edgeworth Box setup with Cobb-Douglas utility. The model runs until per-period trade volume falls below a certain threshold, by which point prices have converged to a uniform equilibrium price.

Another matching model is the Axelrod (1980) tournament displayed in Figure 1. In the Axelrod tournament, strategies are assigned to breeds, and paired randomly against other strategies in 200-round repeated prisoner’s dilemmas. As it turns out, the much-celebrated dominance of tit-for-tat is not robust to the collection of strategies it plays against; indeed, in Figure 1, Grudger comes out ahead by a substantial margin.

Evolutionary models

An evolutionary model is described in Harwick (2021), where an agent’s reproductive fitness depends positively on a partially-heritable human capital parameter, but negatively on local population density. The fact that population density increases the economic returns to human capital leads to cyclical human capital dynamics. Not only the mean, but also the variance in human capital  over time can be seen with the plotted error bands.

Similarly, the Bowles & Gintis (2011, ch. 7.1) model of the evolution of altruism through deme selection is included as a multi-level model, with the agents in the top-level model representing competing demes, and the agents of each deme representing individuals. Cooperation is selected against within demes, but demes with a higher proportion of cooperative members are more likely to prevail against and colonize demes with fewer cooperative members. Altruism can survive as a stable strategy provided the benefits to cooperation are high enough and the likelihood of inter-deme conflict is sufficiently high. Relative proportions of selfish and altruistic agents are plotted on a stacked plot summing to 100%.

Spatial models

Conway’s Game of Life (Gardner 1970), a cellular automaton where a grid evolves according to simple rules but whose results cannot be predicted from the initial state without stepping through algorithmically, can be implemented as a spatial model in Helipad in just 27 lines of code, including 5 for interactivity in the visualizer, i.e. the ability to toggle cells on and off.

A standard spatial ecological population model of predator-prey relationships (in this case, sheep and grass) is included. The productivity of grass places a hard limit on the sheep population, especially when the latter reproduce sexually. The model can be easily extended to other coordinate systems, for example polar or geospatial models.

Conclusion

Helipad is a powerful and extensible agent-based modeling framework for Python that ensures a shallow learning curve with a hook-based architecture. It has specialized tools for economic, biological, game-theoretic, and network models, but has ready applications in ecological, epidemiological, organizational, and urban systems – and indeed any context where interacting heterogeneous agents generate emergent structure. The sample models are also written so as to be importable and extensible, with the initialized Helipad object returned using a setup() function.

Since its initial public release, Helipad has added a number of significant modeling and architectural features, and is now API stable, ensuring backward-compatibility for future versions. The source code is open, and as agent-based simulations gain traction in social-scientific work, Helipad has the potential to aid the packagability and legibility of model code – especially in notebook format – as well as to lower the barriers to creating new agent-based simulations in a variety of fields.

Footnotes

  1. Unlike some accounts that try to distinguish the ontology of normativity in general from morality in particular (e.g. Machery & Mallon 2010), we regard the boundaries of a “moral” domain within the domain of normative rules, to the extent one is distinguished, to be culturally defined. The paper will sometimes use ‘moral rules’ metonymically for normativity in general.
  2. While there are some examples of nonhuman animals rewarding or punishing certain behaviors of conspecifics (thus implicitly assigning those behaviors a valence) (Boehm 1999), and some minor examples of cultural transmission such as birdsong or tool use (Whiten 2019), we know of no other animal that transmits normative understandings in an open-ended way.
  3. While “noble lie” arguments have something like this structure, they are not – and indeed cannot be – moral or normative arguments, at least not for the intended believers of such lies. The lie itself, of course, will indeed usually be moral or normative, though the converse (that moral or normative claims are lies) is not true.
  4. There are signaling models where ignoring one’s own self-interest brings benefits sufficiently large to outweigh the potential costs. For example someone who keeps strict accounts of favors owed by friends will have fewer friends than someone who doesn’t “count the cost”, and the increase in the number of cooperative friends might outweigh the costs of exploitative friends. Thus even self-sacrifice can be “ultimately” reduced to narrow self-interest, at least probabilistically. Whether these models are stable over long enough periods against the exploitation of such signals is doubtful without additional assumptions, however (Harwick 2026b).
  5. This argument does not necessarily preclude the Effective Altruist concern with global welfare. What it does suggest is that, in order to accomplish its ends more effectively, a top priority of EA communities should be directing cooperation within EA communities. The 2022 controversy over the Effective Venture Foundation’s decision to purchase a manor house for its own events – with resources that could have purchased malarial bed nets for distant children – suggests that this concern is largely not on EA communities’ radar, although ‘longtermists’ (e.g. MacAskill 2022) are a partial exception.
  6. Popper’s (1945) famous ‘paradox of tolerance’ pertains to this point. The virtue of toleration is an obligation to the community sharing the value of tolerance – which may of course encompass many “thicker” communities – and does not sensibly apply across the boundary of the liberal community.
  7. It is also worth noting that a similar distinction obtains between the universal language faculty and particular learned languages. This is more than a suggestive parallel, as the culturally learned assignment of valence to actions likely rests on the symbolic capacity of a language-possessing species.
  8. The parable of the Good Samaritan is often taken today to entail a mandate for universal altruism (Francis 2020). Historically however, the Church has interpreted the parable as a special obligation to those one is physically near and able to help (e.g. Augustine 397, I.28).
  9. This does not necessarily entail the complete extinction, or even the quantitative diminution, of a weird population considered as a whole. Imagine a community with a distribution of commitment to modern weird cultural norms. If environmentalism preferentially “takes” among the upper tail, which then adopts antinatalism, the trait value of the entire population shifts away from weirdness, and therefore also propensity to care about the natural environment.
  10. Specifically, while agents are themselves objects, there is no way to create user-defined objects.
  11. A complete and up-to-date API reference can be found at https://helipad.dev.
  12. The checkentry is the only parameter type with an open-ended value range, and thus cannot be swept. All other parameter types have finite value ranges.
  13. https://github.com/charwick/helipad/tree/master/sample-models. Some are also available as Jupyter notebooks: https://github.com/charwick/helipad/tree/master/sample-notebooks
Strategies Are Not Algorithms

Human social institutions – at every scale from the drug deal to international governance, and every point in time from the paleolithic tribe to the present day – have been fruitfully modeled as repeated games. In this conception, an institution is an equilibrium complex of strategies where each individual plays his best response given the strategies of others that he interacts with, in light of the fact that the interaction will, in some way, persist into the future.

To consider an institution as the scaffolding of a repeated game allows a vast theoretical arsenal to be brought to bear. However, as compared to one-shot games, the space of possible strategies is far too vast for agents to apply, or for modelers to attribute, straightforward optimizing behavior. Therefore strategies are modeled as algorithms mapping from a finite set of observables (including the history of the game) to a finite set of responses.14 These algorithms, to the extent they are shared, constitute the “rules” in the canonical definition of institutions as “rules of the game” (North 1990).

This orthodox conceptual stack, however – grounded in an algorithmic model of behavior on the micro level and culminating in the institution on the macro level – inevitably leads to pessimism when considering the problem broadly enough: if the set of observables is open-ended, there exists no possible fixed and finite-length algorithm under which self-enforcing cooperation on the pattern of actual human institutions is evolutionarily stable (Bowles & Gintis 2011; Harwick 2020). Goodhart’s Law tells us that given enough time for costly signaling equilibria to break down under selection for cost-reduction, no finite-length algorithm can reliably infer a true state of the world from a signal that adversaries have an incentive to falsify (Harwick 2022; Harwick & Caton 2022). Thus, cooperative agents executing a fixed algorithmic strategy cannot identify themselves reliably enough over time to maintain the assortativity necessary for stable cooperation in a behaviorally open-ended world.

It would seem some element of this conceptual stack must give way to explain actual institutions. This paper suggests that algorithmic mappings from observations to responses – “rules” – are an inadequate paradigm for modeling human decisionmaking and organization. The alternative is an interpretive mapping, where open-ended observations are placed into inductively-learned classes through holistic similarity to exemplars, rather than by explicit if-then statements. Interpretive mapping characterizes both biological and artificial neural architecture, as well as other systems, and is able to deal much more robustly with edge cases – especially the adversarial edge cases selected for in repeated social dilemmas – than algorithms. We will thus think of mental models, subjective frames, institutions, and so on, not as stopgaps for the boundedness of our rationality, as they are often conceived in economics, but as something sui generis and functional on their own terms.

Taking these features of interpretive cognition seriously will entail regrounding game theory in a phenomenological mold, not merely because it bears greater verisimilitude to actual human cognition, but because it makes possible a general equilibrium theory of human cooperation that game theory has thus far not been able to offer.

After running through puzzles of human sociality and other large-scale adversarial games that cannot be explained with an algorithmic model, the paper lays out the difference between algorithmic and interpretive mappings, and shows that an interpretive model succeeds at precisely these points. It then considers the complementarities between interpretive and algorithmic systems, the implications for human epistemology and governance – especially the functional role of what has been called “tacit” knowledge – and what strategy selection in an open-ended environment might look like analytically.

Background

Institutions are Recurring Games

North (1990) famously defined institutions as “rules of the game,” which are now typically understood in a properly game-theoretic sense (Greif 2006). These games are repeated in a population, even if not among the same individuals in each instance. We will refer to such games as recurring games, a superset of repeated games, which are played among the same players. An institution, then, is an equilibrium complex of strategies in a recurring game, along with supporting beliefs and expectations. The space of possible institutions is limited to those complexes of rules (strategies) which are locally stable and self-reinforcing.

“Rules,” however, must be unpacked further. They may refer to prescriptive rules or constitutive rules (Searle 1995). The former are injunctions like “do not foul a player” in basketball, and are more easily assimilated to the notion of strategy. In order for “do not foul a player” to be an equilibrium strategy, it must be in no player’s interest to do so, which in this case also requires that it be in players’ interests to defer to a referee, in whose interest it is to penalize fouls. They may be formal or informal, and are the paradigmatic case of an institution as rule.

Constitutive rules, however, define the game itself: a team is constituted by five players. A goal consists in getting the ball through the net. Wide variations from these rules are not prohibited – they do not mean one is “breaking” the rules, rather, one is simply playing a different game. In this way, the cognitive-perceptual element that institutional economists have always identified as an important aspect of institutions (North 2005, Hayek 1937, Lachmann 1971) can be assimilated to the “rules” definition. There are rules that tell you what game you are playing, and there are rules that tell you how to play the game, and both must be in the interests of players in equilibrium.

It is not obvious that these are the same sorts of things, despite both going by the name “rules”. In most formalizations, the cognitive-perceptual element is accordingly backgrounded, despite its emphasis in verbal expositions. We will argue (contra Searle) that the two concepts can indeed be unified into a broader concept of strategy, although doing so will require an alternative formalization of the problem.

Strategies in Recurring Games are Modeled as Algorithms

A dominant tradition in game theory, and decision theory more broadly, models both strategies and decision processes as algorithms. An algorithm is a sequence of explicit instructions mapping from an input to an output, where both are represented using a language of finite countable symbols. The domain of the input and output may be countably infinite (for example an algorithm may take or produce any combination of the symbols in its language), but it may not be continuous or analog. Friedman (1986: 12) explicitly defines a strategy as “a set of instructions,” and Binmore (1987) takes “a rational decision process… [to] refer to the entire reasoning activity that intervenes between the receipt of a decision stimulus and the ultimate decision… Such an approach forces rational behavior to be thought of as essentially algorithmic” – that is, as a well-defined transformation from one symbolic representation to another.

In recurring games in particular, a strategy is formalized as an algorithm mapping from (usually) a subset of the history of the game to a decision in one repetition of the game. One standard formalization is the finite automaton (Marks 1992; Rubinstein 1986), an abstract machine with a fixed number of states that maps game histories (or summaries) to actions. For example, tit-for-tat is a one-memory finite automaton. More complex strategies can be modeled by automata with more states, or by Turing machines (Halpern & Pass 2014), which by the Church-Turing thesis can execute any well-defined algorithm with any well-specified input and arbitrarily large memory. In this sense, computer programs are concrete implementations of such algorithms running on Turing machines.
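To make the formalism concrete, tit-for-tat can be written as a minimal automaton whose state simply mirrors the opponent’s last action (an illustrative sketch in Python, not drawn from the cited formalizations):

# Tit-for-tat as a finite automaton: two states ('C' and 'D'),
# starting at 'C' and transitioning to the opponent's last move.
def tit_for_tat():
    state = 'C'
    def play(opponent_last=None):
        nonlocal state
        if opponent_last is not None:
            state = opponent_last
        return state
    return play

strategy = tit_for_tat()
strategy()      # 'C' on the first round
strategy('D')   # 'D' after being defected against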

An algorithm must specify the domain of valid inputs and a rule for mapping each input to an output. Inputs outside this domain are either ignored or misclassified. In the case of human social life, the space of potential stimuli and signals is effectively unbounded. But based on the flexibility of the notion of the algorithm, it has been taken for granted that human strategies can, at least in principle, be modeled by suitable algorithms. Thus, both individual strategies and institutional “rules” are typically understood as algorithmic mappings – explicit procedures producing determinate outputs from bounded classes of inputs. This assumption is foundational to rational choice modeling across domains.

No Finite Algorithm can Sustain Cooperation in Relevant Recurring Games

An algorithmic model of human cooperation must answer three questions: (1) What are the inputs into the decision process? (2) What are the outputs or affordances of the decision process? And (3) what is the structure of the intervening transformation? All three must be explicit, enumerable, and symbolically representable.

The search for such algorithms in human institutions has been fruitful in a partial-equilibrium sense, so to speak. Greif (1994), for example, shows that, given certain cultural predispositions, various exchange institutions were at least temporary equilibria. Leeson (2012) shows that, given certain religious beliefs, ordeals were a self-enforcing system of justice. Ostrom (1990) shows that, given human propensities to moralize and punish, collective action problems can be reliably overcome in practice.

The general equilibrium problem, however, has been comparatively neglected in economics. To wit: if we endogenize culture and beliefs – and indeed stop taking cultural capacity for granted at all – how rich must we make the inputs and outputs to the decision algorithm to constitute a plausible analog to the decision processes in actual human institutions?15

There are several stylized facts that an algorithm must be able to explain:

  1. Humans are genuinely altruistic (Tomasello 2009) and regularly cooperate with non-kin in one-shot interactions (Henrich & Muthukrishna 2021). Thus direct reciprocity models (e.g. Trivers 1971) fail to capture the extent of human cooperation, and hence the need to model institutions as recurring rather than repeated games.
  2. Humans cooperate in large groups. The Axelrod (1984) tournament showing that cooperation is rational in repeated games does not generalize to n-person games (Bowles & Gintis 2011) for the simple reason that punishing a free-rider by defecting diffuses punishment over larger and larger numbers of agents.
  3. (1) and (2) are apparently stable, at least on timescales long enough to generate civilizations. A well-specified cooperative algorithm, perhaps even one that bears some resemblance to empirical human psychology, is easy to model. Such an algorithm which is also robust to invasion by non-cooperative strategies is not.

A model will, at the very least, need more affordances than “cooperate” and “defect”, and many efforts have been made in this direction. Targeted punishment of free-riders would be a realistic addition. However – per (1) – punishing defectors is itself costly, and transforms the problem into a second-order free-rider problem, where punishment itself is a public good (Yamagishi 1986) with diffuse benefits and concentrated costs. It remains to be established that punishment itself is any more viable than cooperation in general.

Assortativity (Bergstrom 2003) would seem to be a general solution, where agents can take actions to increase or decrease the probability of entering a game with agents playing other strategies. If cooperators can preferentially match with other cooperators, cooperation can be evolutionarily stable even if it is not a Nash equilibrium – that is, cooperation can proliferate even if it is costly on net for any individual. Similar thresholds for cooperation fall out of a variety of assortative structures, including kin altruism, group selection, network models, and reputation models (Bowles & Gintis 2011).
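Bergstrom’s threshold can be stated compactly. Let $\alpha$ be the index of assortativity – the excess probability, over random matching, of being paired with one’s own type – and let cooperation confer a benefit $b$ on one’s partner at personal cost $c$. Then, on a standard statement of the result (paraphrased here for concreteness rather than quoted), cooperation proliferates when

$$\alpha b > c,$$

the analog of Hamilton’s rule with assortativity in place of genetic relatedness.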

But consider the problem of maintaining assortativity in a dynamic system. Cooperators must at a minimum be able to identify each other with better-than-random probability (and we cannot assume recognizability without begging the question). So cooperators signal their type to one another and avoid, exclude, or defect against other types. To the extent that cooperative agents can produce some combination of signals at lower cost than noncooperative agents, a separating equilibrium with cooperation can result.

However, in a dynamic setting, Goodhart’s Law erodes the success of any particular algorithm over time: when agents will cooperate on the basis of some set of observed signals, there is selective pressure for defecting agents to also produce these signals at low cost. Free-riding on informative signals – mimicry – is always possible in principle, given enough time in a dynamic setting, a process which must eventually destroy the informativeness of such signals. At the very least, punishing or otherwise preventing the deceptive production of such signals now becomes a second-order free rider problem.

We are therefore caught in the horns of a dilemma. A model of cooperation with assortativity can get off the ground only by stipulating an informative signal, i.e. that there exists a set of signals that are reliably lower-cost for cooperators than for defectors. For example, basic kin altruism models typically take kin recognition for granted; reputation mechanisms (e.g. Kandori 1992) must foreclose the possibility of cheaply alienating identities (Harwick & Caton 2020); spatial and network models rely on the costliness or impossibility of travel (“viscosity”) to rule out migration of defectors into cooperative demes (e.g. Taylor 1992). But the dynamic problem is precisely that, if we do not stipulate such a signal, the informativeness of a signal of cooperation depends on the imposition of costs on false users – and we arrive right back at the second-order free rider problem. Barring that, we have only transient equilibria as algorithms tuned to particular signals become obsolete and exploitable, replaced – hopefully – by selection for new functional algorithms.

Thus for an algorithm to sustain empirically plausible cooperation, we must either solve the second order free rider problem, or content ourselves with churn and red-queen selection for longer and longer algorithms plumbing the combinatorial depths of affordances for informative signals (Harwick 2025). Learning and self-updating can be thought of as ways to increase the complexity of an algorithm, but do not solve the underlying problem if the learning and self-updating are themselves algorithmic. In short, given the open-ended nature of the affordances used in actual human life and the features of human social life necessary to explain, modeling human cooperation as an explicit algorithm must result in a dead end.

Interpretive and Algorithmic Systems

Based on the Church-Turing thesis, which implies that any well-defined mapping can be represented by an algorithm of some complexity, game theory and institutional economics have typically started with simple algorithms and enriched them as necessary to address specific aspects of human interaction. But our argument implies that we can never arrive at a general equilibrium theory of human cooperation this way, at least at finite length. It is not merely the complexity of actual human strategies; the point of models after all is to strip away extraneous complexity and understand the essentials in a tractable way. More important is that the open-ended affordance space of human behavior and observation, in conjunction with the signal-dissipating Goodhart dynamic, drive human strategies to cover the entire affordance space, in principle, and to self-update. Such a decision process could indeed in principle be modeled algorithmically, but it could hardly be less complex than actual human decisionmaking.

But even though algorithms can in principle represent all well-defined mappings, in practice there are nonsymbolic mapping processes that can be understood with considerably less complexity on their own terms than as algorithms. Indeed the human brain is one such system, and it will be worth distinguishing the way human brains make decisions from the traditional algorithmic model.

The basic problem in decision theory is to establish a mapping between observations and actions. But the domain of sensory input – considered as audiovisual, tactile, or other sensory streams – is nonsymbolic (analog), as is the domain of human action, considered as a sequence of muscular movements. In human decisions, symbolic representation arises – and only sometimes – only in the intervening process by which we translate nonsymbolic sensory streams into meaning, which is then translated into nonsymbolic action.16 In other words, action that we recognize as intentional is informed by semantically meaningful representations (inferences, beliefs, expectations) of unobservable but relevant states of the world that have been constructed and interpreted from sensory streams.

An algorithm can only deal with analog input after it has been interpreted. Interpretation involves discretizing analog input, encoding it into a form representable with a finite language. So most game theory, outside of the subset specifically dealing with signaling games, starts in the middle of this process, with observations already interpreted into formal set-theoretic language, and ends before the selected strategy is translated back into concrete action.17 A finite automaton takes as its input the history of the game, represented as a sequence of action tuples representing each player’s action in each stage, perhaps with some stochastic error, and returns an element of the action space. In the formal representation of the game, the player does not have to ask what did he mean by this, or what do I do about this? By contrast, in actual life it is often not immediately clear whether a concrete action counts as cooperation or defection, nor is it always clear, once a formal action has been decided upon, what concrete actions actually implement it – and indeed these ambiguities may be capitalized upon strategically, both offensively and defensively. As Rubinstein (1991) notes, “it is rare that a situation involving a conflict of interests is described clearly and objectively by a set of rules.”

Mappings to Action in an Adversarial World

The analog vs symbolic distinction would not be problematic for algorithmic models of human decisionmaking to the extent the interpretive process is unambiguous. If this is the case, differences in practice can be catalogued as biases by behavioral economists and social psychologists, and might be empirically interesting, but do not bear on the basic theory. A camera for example can straightforwardly digitize analog visual input, even if individual cameras output images with differences in color balance. Interpretation always loses information compared to the original analog stream, but can represent the analog source with arbitrarily high fidelity with a sufficiently lengthy representation (compare for example the file size of photos generated by a 12 vs a 48 megapixel camera).

But compared to the mapping from visual input to a digital image, the mapping from observed cues to semantic representations of unobserved states of the world, and a fortiori the mapping from observed cues to actions, is not – and cannot be – continuous or well-behaved. The disjunction between these spaces is known in computer science as the Semantic Gap, and due to the Goodhart dynamic described above, no fixed encoding function from cues to semantic representations or to actions can be dynamically stable without active policing (which, again, raises the second order free-rider problem). When the output of an interpretive process is beliefs and expectations that guide a player’s actions, there will always be an incentive for other players to find ways to misrepresent, to induce the player to take actions that benefit the second party at his own expense by producing signals that lead the player to believe the state of the world is other than it is. Over time, we must assume free riders learn to emulate any fixed set of cues.18

For this reason, the similarity structures of the two spaces cannot be isomorphic, as in the case of the digital camera. Small changes in objective cues may lead to wildly different classifications, especially in an adversarial game where the other party has an incentive to falsify the cues he produces (consider, for example, the search for “tells” if one suspects someone is lying). On the other hand, wildly different objective cues may end up in the same class. A stray mark may completely change the character of a painting, but two completely different paintings – or even a painting and a song – may evoke the same feeling.

This also entails that interpretation will not be unambiguous with respect to the “objective” set of signals. Instead, perception is mediated by mental models or frames (Goffman 1974; Devereaux & Koppl 2024). We perceive, not things as they are, but the classes we place them in, partly consciously and largely unconsciously (Hayek 1969).19 At the level of conscious decisionmaking, Nozick (1969) describes how a change in classification can change the dominant strategy, despite nothing about the underlying cues changing.20 Framing effects also bedevil experimental economics: experimenters must be sensitive to the fact that participants can construe the task in ways that are different from the experimenter’s intended construal, especially in cross-cultural research. For example, a participant might assign social valence to actions the experimenter intends to regard as purely instrumental, and behave in unexpected ways.

Economists have traditionally considered such framing effects as artifacts of “bounded rationality” (Simon 1957). An algorithm acting on unambiguously encoded interpretations would presumably be invulnerable to such framing effects, but actual humans approximate it using imperfect heuristics, although these are sometimes acknowledged to be “ecologically” rational (i.e. functional within normal contexts [Gigerenzer & Brighton 2009]). North (1990), despite emphasizing the cognitive-perceptual aspect of institutions, nevertheless treats them as necessitated by cognitive imperfections, and Geanakoplos (1992) reconciles the strong conclusions of Aumann’s (1976) agreement theorem to the reality of disagreement by assuming “mistakes in information processing”.

By contrast, the Goodhart problem suggests we should regard such departures, not as unfortunate but inevitable computational limitations, nor as kludges of an evolved system, but as necessary adaptations to an open-ended and adversarial world.

Standard formalizations of game theory, following Aumann (1976), rule out framing effects by regarding the state space as an objective fact, and requiring that it be partitioned by agents in a manner independent of their own actions (thus “the horse I bet on” is a category that depends on my own actions), even if they may partition it differently. But from the perspective of actual human cognition, states of the world cannot be classified in a manner independent of our own actions, because the very purpose of perception is as a prelude to action, and mental representations are largely encoded as potential actions (Clark 2001: 93). Claxton (2015: 64) argues that “perception’s job is scoping out the possible ‘theatre of action’ – a sense of all the things that current circumstances permit me to do – so that I can select and craft my actions appropriately.” The delineation of “the horse”, “to bet”, and other semantic components of the category as meaningfully discrete objects are not objective features of the world, but perceptual artifacts that mark useful potential subjects of action. This will be more true the more abstract and socially constructed the arena of action, for example voting. Much human perceptual architecture is shared, hence the appearance of relatively objective category boundaries for more concrete objects. But in general, agents with different affordances, different utility functions, and so on, will not merely disagree about the state of the world; they will potentially disagree, as in Nozick’s example, about what constitutes a class of events.

Thus unlike the standard Aumann axiomatization, in an adversarial world with open-ended affordances, we should expect any mapping with an action codomain to be:

  1. Discontinuous in a static sense, meaning that the size or direction of a perturbation of the input will be unrelated to the size or direction of the change in the output. Responses to sensory input are not mechanical or predictable. This raises the complexity costs of exploiting a given behavioral rule.
  2. Unstable in a dynamic sense, in that it must be possible to update behavioral rules under exploit. Indeed the instability will push in the direction of increasing discontinuity under adversarial pressure.

Interpretive Systems are Robust to Goodharting in Open-Ended Worlds

Having considered the diachronic problem of how interpretive systems are shaped by an adversarial world, consider now the synchronic problem of how such a system, shaped as it is, approaches an adversarial world.

Specifically, consider how a recurring prisoner’s dilemma would be approached, not by a finite automaton in a formal model, but by a connectionist system like the human brain in an open-ended world. Human infants exhibit prosocial behavior at 14-18 months of age (Tomasello 2009), helping and cooperating as well as shunning or punishing antisocial behavior (even as a third party). All this appears well before the language faculty, and even before object permanence, suggesting that prosociality is deeply ingrained at a functional level, and that the early human environment was sufficiently assortative to make this pay off. In order to implement such a strategy, however, the problem becomes: (1) how to construe the game at hand (what counts as cooperation or defection)? And (2) what cues to use to determine whether the opposing player is a cooperator or a defector type? These interpretive problems must be solved through learning – hence the significance, and the variety, of human culture and institutions.

A connectionist system is structured as an activation network of neurons, in principle simple gates that transmit a signal to further neurons when an activation threshold is reached based on signals transmitted to it.21 Suppose, to make the weakest possible assumption, an infant enters the world with neurons connected entirely randomly, such that there is no systematic relationship between stimulus and response (this is the starting point of Hayek 1952).22 Over time, given some minimal reward function mapping states of the world (especially those resulting from one’s own actions) to a valence, connections that result in actions that bring positive results will be reinforced, and connections that result in actions that bring negative results will be pruned.
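A toy version of this reinforcement process, under the stated minimal assumptions (random initial weights and a scalar valence signal), might look like the following. This is an illustration of the logic only, not a model of neural architecture:

import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 2))        # 4 cues -> 2 actions, initially random

def step(cues, payoff, lr=0.1):
    action = np.argmax(cues @ W)   # the most strongly activated action wins
    r = payoff(action)             # valence of the resulting state of the world
    W[:, action] += lr * r * cues  # reinforce (r>0) or prune (r<0) the links that fired
    return action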

The result is that “similar” input results in similar patterns of activation, where similarity is judged by relevance to the agent’s goals – which as we have seen must be discontinuous with respect to the input. Such a system “naturally classifies and generalizes. All initial states in the same basin converge to the same attractor and hence are classified as identical” (Kauffman 1993: 228). Having been cheated, or having observed someone else being cheated, the brain learns to associate those cues with the free rider concept and respond with appropriate actions. Thus even an agent with inbuilt prosocial tendencies must learn to classify input in order to discriminate between cooperative and noncooperative types, but we may imagine counterparties categorized as such on the basis of completely disparate cues. As Tolstoy might have said, every free rider is untrustworthy in his own way.

Furthermore, because classification in an interpretive system is based on holistic similarity to exemplars along unprespecified dimensions,23 categories of varying breadth can arise endogenously based on reinforcement rather than being prespecified as with an algorithm. By contrast with the formal structure of the prisoner’s dilemma, various forms of free riding or defection must be dealt with in different ways, because one’s own defection or punishment maps to different concrete actions in different situations. One may stop doing business, one may initiate a lawsuit, one may shout, etc. Even if the ‘cooperator’ and ‘defector’ categories are innate enough to think of humans as broad-domain altruists, both the translation of cues into those categories, and the translation of those categories into action, must be learned inductively.

The result is, like an algorithm, a mapping between inputs and outputs. Unlike an algorithm, the intervening process consists in connections whose continual updating along Bayesian lines is built into the construct, without a clear distinction between memory and program, and without the need for constructs like self-updating algorithms.24 Also unlike an algorithm, the connectome is nonsymbolic, although (as the language faculty shows) certain structures can emulate explicit algorithms through symbol-processing capabilities.

On the one hand, the fact that this process is individualized provides some protection at the population level. Like immune systems, to the extent there exists variance in the mapping function across individuals in a population, strategies optimized to exploit one individual can fail to generalize. There may be some people who will fall for Nigerian Prince scams, but a scammer does not know in advance who they are, which limits the scale of specialized parasitic strategies.

But on the other hand, these mental models are largely shared, partly because of shared low-level architecture (I can be reasonably certain that another human will identify discrete objects in an image in the same way I do), and partly because, despite the persistent threat in social dilemmas, human social life presents itself to members at a low level as a coordination game such that it pays to construe recurring games in the same way as the rest of the population (and indeed punishment can convert social dilemmas into coordination games). The cognitive-perceptual element of institutions pointed to by North (1990) and Greif (2006) can be assimilated to the “rules of the game” definition this way: when mental models converge across a population (Denzau & North 2004), they become constitutive rules defining a recurring game such as a credit transaction or a chess game. While mental models will never converge entirely (Devereaux & Koppl 2024), institutions, as sets of constitutive rules defining recurring games in human society, nevertheless stand as “points of orientation” (Lachmann 1971), and become intersubjective rather than merely solipsistic. These shared frames can stabilize meaning, but – as we argue below on bureaucratization – always at the risk of exploitation. Institutions must, therefore, be understood as intrinsically dynamic even if the basic perceptual architecture of the humans constituting them is stable.

A connectionist system is a feasible implementation of the requirement for a discontinuous and unstable mapping function between input and action. Thus in the case of concrete social dilemmas, the brain can learn both over the course of development and in day-to-day interactions (1) how to construe a concrete situation as a social dilemma, and (2) what sensory cues reliably indicate a cooperative partner, even in one-shot interactions. Historically ethnic, religious, and sartorial markers were used or developed for this latter purpose (Harwick 2023) and can remain in that role if policed against free riders; in modern societies, cues of authenticity serve a similar purpose due to large communities necessitating an even more fluid association between concrete signals and trustworthiness (Greif 2006b describes the breakdown of the former regime as the cost of falsifying signals of communal membership fell). By comparison to a rigid algorithm, this constant process of Bayesian updating and category induction – though by no means invulnerable to gaming – is much better equipped to deal with the dynamical problem of maintaining sufficient assortativity to stabilize cooperation.

Natural and Artificial Interpretive Systems

The classical computer and the human brain are reasonable exemplars for the algorithmic/interpretive distinction. But the human brain is not the only connectionist system, and connectionist systems are not the only interpretive systems. Likewise classical computers are not the only algorithmic systems.

The human language faculty for a long time gave the impression that symbol-manipulation was intrinsic to intelligence and cognition, an assumption central to a tradition in analytic philosophy running from the formalism of the logical positivists to later computationalists. But nonhuman biological neural systems also categorize input along the same lines, even without the recursive symbolic capacity that humans (and only humans) have. Indeed, on this view, the explicit algorithm-emulation that humans do is a culturally scaffolded skill exapted from the language faculty, not even intrinsic to the language faculty itself, much less to cognition in general. Like the problem of matching observed cues to an innate prosocial strategy, nonhuman animals also face the problem of interpreting open-ended observations for the purpose of innate strategies. A robin is attuned to cues that identify her own hatchlings, a squirrel to cues of underground caches, and so on.

Artificial neural networks (ANNs) likewise work on the same connectionist principles as the brain, although the architectural specifics differ. ANNs are decades old as a concept, but it is only in the past few years that computational capacity and architectural refinement have progressed to the point of rivaling a human brain in tasks like text or image classification. Although ANNs do operate on digitized input such as text or photographs, the distinction between interpretive and algorithmic systems is not simply a matter of taking analog versus digital input, but of interpreting input semantically versus symbolically, with a discontinuous and unstable mapping between the two. Consider two photographs of the same object from slightly different angles that have no pixels at all in common. It would be very difficult to write an explicit and general algorithm classifying them as similar. But considered as representations of a visual scene – that is, depictions of a theatre of action – an artificial neural network trained to classify visual input in a similar semantic space to a human brain can robustly identify them as depicting the same object.
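The point can be illustrated with a deliberately simple stand-in for a learned embedding (this example is ours, not a description of any particular network): two circularly shifted views of the same one-dimensional “scene” share no pixel values at all, yet are identical under a shift-invariant feature such as the Fourier magnitude spectrum.

```python
import numpy as np

rng = np.random.default_rng(0)
img_a = rng.normal(size=64)      # a 1-D "photograph"
img_b = np.roll(img_a, 7)        # the same scene from a shifted viewpoint

print(np.mean(img_a == img_b))   # 0.0: no pixels in common

# A shift-invariant feature space, standing in for a learned semantic
# embedding: circular shifts change only the phase of the spectrum.
feat_a = np.abs(np.fft.fft(img_a))
feat_b = np.abs(np.fft.fft(img_b))
print(np.allclose(feat_a, feat_b))   # True: identical in feature space
```

A trained encoder plays the same role in a vastly higher-dimensional setting, with the crucial difference that its invariances are induced rather than specified in advance.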

ANNs do run on an algorithmic substrate. But it would be a mistake to regard AI decisionmaking as “algorithmic”, as if this were a synonym for computer-based. Indeed, ANNs exhibit many of the same epistemic features as humans – and not those of classical computers. Most significantly, whereas algorithms can always be inspected by a profiler to determine why a given output was produced from an input, this is possible with neither natural nor artificial neural networks (Sørgaard 2023), leading to a field of “interpretability” research that stands in the same relation to ANNs as psychology does to human minds (Xu & Yang 2025; Lindsey et al. 2025).

On the other side of the dichotomy, rule-governed symbol manipulation can be done by a variety of non-electronic systems. DNA is a significant example, whose codons can be thought of as algorithmic instructions for the construction of proteins (Deacon 2021), out of which life as we know it on earth is constructed. On the other hand, a genome at the population level over phylogenetic time might be thought of as a non-connectionist interpretive system, in the sense that selection acts as a reward function adapting a genome over evolutionary time to an open-ended space where it faces adversarial interactions with conspecifics, parasites, and predators alike (and these categories may have substantial overlap). The output space is phenotypic rather than behavioral or semantic, and the mechanism is neither computational, except in a very loose sense, nor mediated by semantic representations. But in broad terms the same adversarial considerations apply, and analogous features may be expected: the similarity structures of genome space and phenotype space are non-isomorphic, in the sense that small genetic changes can lead to large phenotypic changes, and large genetic changes can lead to no phenotypic changes. The genome “learns” to adapt to its environment, but because the genome is rigid and rule-bound at any point in time, it does so via the sort of selective churn that we argued was unnecessarily pessimistic in the case of human behavior.

Thus the relationship between algorithmic and interpretive systems is not necessarily a dichotomy, but a stack, sometimes with multiple layers. Interpretive systems can be built on an algorithmic substrate, and though the symbolic content of the substrate (the firing of the neurons of a natural or artificial system) may be perfectly perspicuous, the semantic import of that content will remain opaque. Because the mappings between the perceptual, semantic, and action domains are discontinuous and unstable, as Hayek (1952: 179) argued, “we shall never be able to bridge the gap between physical and mental phenomena”. Phenomenology, the subjective “inside” view, can never be totally reduced to a neurological “outside” view.

Implications

Strategic and Epistemic Dimensions of Moral Philosophy

The discontinuity of the mapping between sensory input and semantic space has been widely noted, even in precursors to Hayek (1952) (see Lewis 2016 for a survey). But that these features should be necessary for strategic reasons has not been noted, at least not in economics.

Although an algorithmic expansion of an interpretive mapping is always possible in principle, it is not always possible at finite length. It is not simply that the input space of interpretation is continuous, but that its dimensionality is undefined: the system cannot necessarily know in advance what it is looking for. This poses hard limits both on human introspection and on AI interpretability research. It will not always – or even often – be possible for a person to explain why he perceives something the way he does, or why he took the actions he did, because to do so would require the translation of a vast nonlinear classification system over an unprespecified set of cues into a linear language-based algorithm. Any such fully faithful explanation cannot come from the system itself, but only from a system of greater complexity (Hayek 1952: 189).

With this in mind, any definition of “consciousness” as the capacity for self-modeling must take care not to assume greater powers than minds can logically possess. Indeed, verbalized introspection tends to be post-hoc, employing the same faculties we use to explain the behavior of others (which is not to say that introspection is never, or even infrequently, accurate or useful, any more than the attribution of intention to others is inaccurate or useless). That this is a structural limitation can be seen in the fact that the chain-of-thought tokens output by reasoning models are similarly post-hoc and not necessarily reflective of the model’s actual reasoning process (Chen et al. 2025; Lanham et al. 2023). Thus if we do model ourselves using the same capacity we use to model others, the metaphysical question “is it conscious?” is not separable from the epistemic question “how do we know if it’s conscious?” – meaning it is inseparably tied to the specific structural-functional homologies relied upon by the interpretive capacity of the human brain. Lacking the specific context of structural-functional homology, it is not clear that the question “can AIs (or animals, or aliens, or…) be conscious?” is meaningful.

This suggests that a theory of moral value (which we may regard as a foundational institution in any society) cannot be based on gradations of consciousness. Consider the strategic limitations placed on moral philosophies by the requirement that they be, at least, evolutionarily stable (that is, compatible with their own persistence).25 Because we limit the space of moral rules (and institutions more broadly) by evolutionary stability and not Nash equilibrium, moral behavior can be genuinely altruistic and self-sacrificing. However, it cannot be both altruistic and universalist (Choi & Bowles 2007). A minimal requirement of an institution, and by extension of a moral norm, is to maintain the assortativity within which cooperation is stable. The question of moral value – whom is it obligatory to cooperate with under what circumstances? – concerns precisely this.

It may be the case that propositionally universalistic philosophies are not functionally universalistic in practice; indeed this has likely been the case since the dawn of the Axial age due to viscosity in population movement. However, the increasing latitude that modern social organization and connectivity afford for taking propositional content seriously (Harwick 2023) and expanding one’s circle of moral concern suggests, at least, the desirability of a moral philosophy that would remain evolutionarily stable if its propositional content were taken seriously.

Algorithms and Interpretation in Modern Institutions

The fact that human decisionmaking is fundamentally interpretive and nonsymbolic rather than algorithmic has practical implications for architectural questions of large-scale human cooperation such as the legal system or the rule of law.

If an algorithmic expansion of an interpretive rule is not feasible, the interpretive rule encodes “tacit knowledge”, that is, real knowledge that can be acted upon, but not articulated or justified. The concept was used by Hayek (1945) to argue that the algorithmization of economic allocation (i.e. central planning) would necessarily fail to account for a great deal of load-bearing knowledge in the economy. It is not merely that the “man on the spot” with “knowledge of time and place” has more knowledge of the relevant resources, but that entrepreneurship with respect to a resource is an interpretive process, and an algorithmized economic plan will fail to capture that knowledge even if it has access to the same perceptual data as the man on the spot.

The argument here, however, also implies the strategic functionality of tacit knowledge. Especially in adversarial situations, holistic and exemplar-based classification has the potential to account for a much greater set of cues than would be possible explicitly (Harwick 2025), a crucial advantage against free-riding mimics of cooperative signals. Indeed, the need to stay ahead of mimics in the Goodhart arms race is a plausible driver of the evolution of human intelligence (Cosmides & Tooby 2005).

Besides the architectural stacking discussed earlier, algorithms and interpretive systems can stack institutionally as well, and indeed many processes that we are used to considering as algorithmic are only so after an act of interpretation. Law, for example, is sometimes idealized as the algorithmic application of rules to cases. But the question at hand in any given legal case is not generally “what is the appropriate remedy for this kind of case”, but “what kind of case is this?” A defendant and a plaintiff bring a case precisely because they place the situation in different categories, or at least have opposing interests as to which category the situation is placed into. A judge can deterministically apply the law only after he has made the interpretive judgment placing the case into a relevant category. Suppose the rules for judging a situation to be murder versus self-defense could be specified in advance. Then the Goodhart problem implies that, over time, murderers would be increasingly able to produce signals inducing the judge to place the case in the category of self-defense. The judge must be able both to take account of an unprespecified body of evidence, and to update the rule-in-practice should it prove inadequate to the case (and he must have a basis for judging it inadequate). Judges can, after all, make reasonable determinations in wholly unique cases – indeed every case is unique along many dimensions.
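The two-layer structure can be made explicit in a stylized sketch (ours; the category names and remedies are placeholders): the algorithmic layer is a fixed rule table, while the interpretive layer is passed in as an open-ended judgment process that cannot itself be specified in advance without inviting Goodharting.

```python
# The "algorithmic" layer: remedies fixed in advance per legal category.
REMEDIES = {
    "murder": "imprisonment",
    "self-defense": "acquittal",
}

def adjudicate(case_facts, categorize):
    """`categorize` is the interpretive layer: open-ended evidence in,
    legal category out. Only after it has run can the rule table be
    applied deterministically."""
    category = categorize(case_facts)
    return REMEDIES[category]
```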

Thus it cannot be the case that (in the common crypto mantra) “code is law”, as if law could dispense with the interpretive base. But the institutional stack can be inverted, with semantic meaning attributed to algorithmic outputs. For example, the proposal to use immutable blockchain ledgers for the titling of physical or intellectual property (Allen et al. 2021) provides an objective and algorithmic process for the transfer of an electronic asset. However, to regard this electronic asset as a title to, say, a piece of land requires it to be an input into a legal interpretive process.

In both cases, interpretation is necessary in adversarial subgames of an institution, and algorithms suffice for cooperative subgames (hence their substitutability on some, but not all, margins). This can formalize Schumpeter’s (1942) worry about overbureaucratization of both private and public institutions: an overbureaucratized corporation (that is, one beholden to explicit processes) will not be able to adequately respond to the novelty necessarily generated by adversarial market competition. Indeed the novelty generated by entrepreneurs in the Schumpeterian tradition can be seen as the analog to the Goodhart process within the bounds of market competition. In an open-ended space of consumer wants, it will always be possible to produce a product consumers prefer over the status quo, although it will not necessarily be predictable. Thus entrepreneurs must use judgment (i.e. an interpretive process) both to proactively seek such possibilities, and to respond to competitors doing so (Foss & Klein 2012). An overbureaucratized corporation will have difficulty at precisely this point.

In public institutions similarly, the rule of law is sometimes idealized as algorithmic, predictable, and general. Nevertheless, large polities with bureaucratic procedures for things like permits or benefit qualifications will face an unfavorable tradeoff between errors of omission and errors of commission. If denying a valid claim is politically costly, many invalid claims will be approved (for example, benefit programs). If approving an invalid claim is politically costly, many valid claims will be denied (for example, construction permits). The more formalized the decision procedure, the worse this tradeoff becomes – and indeed the Goodhart problem implies it will become worse over time for any level of bureaucratization, unless interpretive processes supervene to update bureaucratic rules.
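A toy numerical illustration (ours, with arbitrary distributions) shows why formalization only relocates the tradeoff: a bureaucratic rule reduces to a threshold on some score, and moving the threshold trades denied valid claims against approved invalid ones rather than eliminating either.

```python
import numpy as np

rng = np.random.default_rng(1)
valid   = rng.normal( 1.0, 1.0, 10_000)   # scores of valid claims
invalid = rng.normal(-1.0, 1.0, 10_000)   # scores of invalid claims

for t in (-1.0, 0.0, 1.0):                # three candidate formal rules
    omission   = np.mean(valid < t)       # valid claims denied
    commission = np.mean(invalid >= t)    # invalid claims approved
    print(f"threshold {t:+.1f}: denied-valid {omission:.0%}, "
          f"approved-invalid {commission:.0%}")
```

Goodharting then amounts to invalid claims’ scores drifting upward over time, which worsens the entire frontier at any fixed threshold.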

Methodological Implications: Phenomenological Game Theory26

There exists a long tradition in the social sciences of an interpretive or hermeneutical approach (Lavoie 1990), reaching into economics through economic sociology (Weber 1956), continental phenomenology (Bergson 1934; Mises 1966), and Verstehende or Gestalt psychology (Hayek 1952). It has, however, been almost entirely eclipsed since the first half of the twentieth century by an axiomatic-formalist approach that assumes away the interpretive process.

Game theory was developed during the rising tide of this formalist approach, and in no small part contributed to its triumph. By comparison with the obscurantist tenor of continental philosophy, formalism was precise, mechanical, and lent itself to quantitative study. But the dilemma today is no longer between logical precision and faithfulness to human interiority: we now have a well-developed theory of the mechanics of connectionist systems as interpretive devices, as well as the more recent technological ability to implement full-fledged and commercially viable connectionist systems in silico.

By contrast with both classical and evolutionary game theory, which have no role for the interpretive process, it will be worth sketching a phenomenological approach to game theory. This does not demand a rejection of standard game-theoretic tools. However, it does entail flipping the stack, so to speak, with the signaling problem as a superset rather than a subset of games. And it is worth bearing in mind that the motivation for such a regrounding is not (just) greater verisimilitude to actual human cognition, but the prospect of a general equilibrium theory of human cooperation against the Goodhart tide that traditional game theory has yet to offer.

Game theory has traditionally been divided into classical and evolutionary branches, which share many of the same formal tools, but differ in their underlying processes and scope of application. In classical game theory, like standard economics, individuals are forward-looking and rational: they make decisions and arrive at rules for behavior on the basis of their expected utility. The knowledge assumptions for a classical game-theoretic analysis, however, are stringent and presume unrealistic sophistication. Players must have common knowledge of the game structure and of the rationality of other players,27 and be computationally unbounded optimizers.

On the other hand, in evolutionary game theory, as in behavioral ecology, individuals are backward-looking and myopic: they make decisions on the basis of hardcoded rules that may or may not be conditioned on memory or present observation, and equilibrium is achieved when one such rule (or a set of such rules) can maintain robustness against randomly introduced new strategies. By comparison with classical game theory, such automata seem substantially less sophisticated than actual humans. Evolutionary game theory is most often used in accounts of animal behavior, but sometimes human institutions are given an evolutionary analysis by authors who downplay the powers of human rational faculties (e.g. Hayek 1982; Henrich 2016).

Both approaches start with a formal description of the game. In classical game theory, as noted before, the analysis starts where players’ interpretive process ends. Players must be assumed to understand the game in the same way. In evolutionary game theory, the modeler does not ascribe any interpretive powers to the players, but the modeler himself interprets the situation, removes extraneous features, and sets up a stylized game.

On top of these foundations, signaling games can be constructed. The game structure may include affordances for communication; these may be more or less costly, and they may be entirely arbitrary as to their content. Equilibria are then solved for, which can be separating or pooling, depending again on various features of the formal setup. On this approach, signaling games are a subset of the entire space of games.

By contrast, the phenomenological approach takes the signaling problem to precede the formal specification of the game. Thus, signaling interactions involving the interpretation of open-ended signals are a superset of the set of games.

In the barest core of the phenomenological approach, an agent is conceived as some system having (1) a set of perceptual capabilities such that the outside world can affect its internal state, and (2) a set of affordances such that its internal state can affect the outside world,28 and (3) some evaluative function over states of the world, which may be inside (as in classical game theory – a utility function) or outside (as in evolutionary game theory – a fitness function) the boundaries of the agent itself. The problem facing the agent is to translate perceptual input (observation) into appropriate actions to affect the state of the world in such a way as to climb the hills of its evaluative function. In contrast to classical game theory, which begins with an objective state space that agents may partition in various ways to represent uncertainty, the core epistemic primitive in phenomenological game theory must be a signal space (Harwick 2025, for example, takes this approach).29
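The following is a bare rendering of this setup (the names are ours and purely illustrative, not drawn from any existing library): perception maps the world to an internal state, affordances map internal state to action, and the evaluative function may sit inside or outside the agent’s boundary.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class PhenomenologicalAgent:
    perceive: Callable[[Any], Any]     # (1) world -> internal state
    act: Callable[[Any], Any]          # (2) internal state -> action on the world
    evaluate: Callable[[Any], float]   # (3) state of the world -> value
                                       #     (utility if internal, fitness if external)

    def step(self, world):
        """Translate a signal into an action. No objective state space is
        presupposed -- only whatever `perceive` can register."""
        return self.act(self.perceive(world))
```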

In this minimal setup, strategies are not encoded as algorithms to be executed in a formal game, but as responses to observational cues, which can then be abstracted into a formal game. Of course, just as not every exercise in applied game theory needs to resort to the epistemic formalisms behind the common knowledge constructs, provided something like common knowledge is empirically plausible, neither will every exercise need to resort to an explicit signal space in many ordinary cases. Indeed, the very purpose of institutions is to harness the human social and communicative faculties to create shared mental models – to establish common knowledge as to what counts as cooperation, defection, and so on, in what contexts – precisely what makes it possible to conduct a classical game-theoretic analysis.

To the extent we are content to start an analysis from an equilibrium ruleplex like this, standard analyses suffice. To the extent we wish to discuss institutional change, behavior in the absence of an established institution, or the origin of the human proclivity to coordinate such models (including institutions and language) in the first place, then we must recur to a phenomenological approach rather than an objective state space.

The interpretation of the game may happen within or outside of the agent’s own boundaries. To encompass classical game theory, where interpretation is done by the agent himself, we note that the agent’s utility function depends on unobservable states of the world, including future states. The mapping function within the agent is sufficiently complex that we may decompose it into (1) a mapping from observations to inferences or expectations about these unobservable or future states, and (2) a mapping from inferences and expectations to concrete actions.
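In symbols (the notation is ours), the decomposition can be written compactly:

```latex
% o: observation; \hat{s}: inferred unobservable or future state; a: action
a = g\bigl(f(o)\bigr), \qquad
f : O \to \hat{S} \ \text{(inference/expectation)}, \qquad
g : \hat{S} \to A \ \text{(action selection)}
```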

By explicitly considering the phenomenological problem before the formal solution, we thus consider expectation formation, not as a background to strategy selection, but as a core component of complex strategies involving internal representation. The question of common knowledge, upon which much philosophizing and soul searching has been done in game theory (see e.g. Binmore 1987), is fundamentally a question of response to cues – and no wonder that should seem a loose end in a formal algorithmic approach.

Conclusion

To this point, though the principles of human neural organization have been well understood at an abstract level for decades, the formalistic turn in game theory has cemented an algorithmic model of behavior that is not just a simplifying assumption, but an important theoretical roadblock to developing a theory of cooperation in an open-ended world. By taking both human and artificial perception to be interpretive systems situated in an open-ended world, we raise dynamic problems that are entirely invisible when modeling algorithmic systems in closed worlds.

Now, with widespread artificial connectionist systems to suggest what is inherent to connectionist systems in general as opposed to specific to human or mammalian brains, we are in a position to reground the epistemology of models of strategic interaction in a manner both tractable and faithful to the way actual humans make decisions in open-ended worlds.

Footnotes

  1. Unlike some accounts that try to distinguish the ontology of normativity in general from morality in particular (e.g. Machery & Mallon 2010), we regard the boundaries of a “moral” domain within the domain of normative rules, to the extent one is distinguished, to be culturally defined. The paper will sometimes use ‘moral rules’ metonymously for normativity in general.
  2. While there are some examples of nonhuman animals rewarding or punishing certain behaviors of conspecifics (thus implicitly assigning those behaviors a valence) (Boehm 1999), and some minor examples of cultural transmission such as birdsong or tool use (Whiten 2019), we know of no other animal that transmits normative understandings in an open-ended way.
  3. While “noble lie” arguments have something like this structure, they are not – and indeed cannot be – moral or normative arguments, at least not for the intended believers of such lies. The lie itself, of course, will indeed usually be moral or normative, though the converse (that moral or normative claims are lies) is not true.
  4. There are signaling models where ignoring one’s own self-interest brings benefits sufficiently large to outweigh the potential costs. For example someone who keeps strict accounts of favors owed by friends will have fewer friends than someone who doesn’t “count the cost”, and the increase in the number of cooperative friends might outweigh the costs of exploitative friends. Thus even self-sacrifice can be “ultimately” reduced to narrow self-interest, at least probabilistically. Whether these models are stable over long enough periods against the exploitation of such signals is doubtful without additional assumptions, however (Harwick 2026b).
  5. This argument does not necessarily preclude the Effective Altruist concern with global welfare. What it does suggest is that, in order to accomplish its ends more effectively, a top priority of EA communities should be directing cooperation within EA communities. The 2022 controversy over the Effective Venture Foundation’s decision to purchase a manor house for its own events – with resources that could have purchased malarial bed nets for distant children – suggests that this concern is largely not on EA communities’ radar, although ‘longtermists’ (e.g. MacAskill 2022) are a partial exception.
  6. Popper’s (1945) famous ‘paradox of tolerance’ pertains to this point. The virtue of toleration is an obligation to the community sharing the value of tolerance – which may of course encompass many “thicker” communities – and does not sensibly apply across the boundary of the liberal community.
  7. It is also worth noting that a similar distinction obtains between the universal language faculty and particular learned languages. This is more than a suggestive parallel, as the culturally learned assignment of valence to actions likely rests on the symbolic capacity of a language-possessing species.
  8. The parable of the Good Samaritan is often taken today to entail a mandate for universal altruism (Francis 2020). Historically however, the Church has interpreted the parable as a special obligation to those one is physically near and able to help (e.g. Augustine 397, I.28).
  9. This does not necessarily entail the complete extinction, or even the quantitative diminution, of a weird population considered as a whole. Imagine a community with a distribution of commitment to modern weird cultural norms. If environmentalism preferentially “takes” among the upper tail, which then adopts antinatalism, the trait value of the entire population shifts away from weirdness, and therefore also propensity to care about the natural environment.
  10. Specifically, while agents are themselves objects, there is no way to create user-defined objects.
  11. A complete and up-to-date API reference can be found at https://helipad.dev.
  12. The checkentry is the only parameter type with an open-ended value range, and thus cannot be swept. All other parameter types have finite value ranges.
  13. https://github.com/charwick/helipad/tree/master/sample-models. Some are also available as Jupyter notebooks: https://github.com/charwick/helipad/tree/master/sample-notebooks
  14. Although the tendency in classical game theory is to establish the existence of a Nash equilibrium nonconstructively – that is, without an explicit algorithm mapping from inputs to an action – to deny the feasible computability of an equilibrium would effectively foreclose the use of game theory for behavioral prediction (Daskalakis et al. 2009).
  15. The following whirlwind tour of partial explanations for cooperation draws on Harwick (2020), which elaborates on all of these.
  16. Even action whose intention and effect is the production of symbols – say, speaking, or typing a paper –  nevertheless consists at the lowest level of nonsymbolic motions of jaw, larynx, fingers, etc.
  17. Even classical signaling models stipulate a closed domain of observation and action, even though the process of mapping between the two is endogenized.
  18. This argument is broader than just semantic mapping; it applies to any signal with positive leverage (i.e. that provides more benefit than it costs to produce). Even at the neurological level, to the extent there exists a fixed and well-defined mapping from neurotransmitter levels to actions, endoparasites can modulate the former for their own benefit (this is how parasites like hairworms commandeer insects to kill themselves as part of their reproductive cycle). Hence in higher organisms at least, where sufficient metabolic capacity exists, such pathways are extraordinarily opaque and byzantine, far beyond what would be necessary for the basic problem of internal signaling (Del Giudice 2019).
  19. At an even lower level, sensory streams are not just interpreted by the brain; they arrive at the brain already highly interpreted (that is, classified) by neurons in the eye itself (Lettvin et al. 1959).
  20. Nozick’s example compares a payoff matrix having columns “Horse 1 wins the race” and “Horse 2 wins the race” with a matrix having columns “The horse I bet on wins” and “The horse I bet on loses”, and constructs an example where the expected value of betting on horse 1 is higher than that of horse 2 in the first matrix, and lower in the second matrix. Rubinstein (1991, §6) also discusses an example where adding the possibility of an irrelevant action like player 1 disposing of one dollar before the start of a battle of the sexes game can perturb the partition of strategies in a way that results in the Nash equilibrium favoring player 1.
  21. Actual biological neurons – especially in the human brain – are somewhat more complex than this, allowing substantially greater computation to take place in proportion to neuron count compared to animal or artificial neural networks (on which see below) (Aizenbud et al. 2025; London & Häusser 2005). Nevertheless, as a model of input classification for the purposes of decision theory, and as an alternative to the Aumann axiomatization, a “pure” connectionist network suffices in principle. Recent advances in artificial neural networks also show such a model to be operationalizable in practice.
  22. Something like this is likely more literally true for more evolutionarily recent faculties. For example, infants undergo a babbling period during which the phonemes produced gradually come to match those found in the ambient language (Oller 2000).
  23. This is the epistemic status of the ideal type in interpretive sociology (Weber 1956), a central exemplar to which actual observations can be compared.
  24. The literature on complex adaptive systems (Holland 1992) bears a great deal of similarity to interpretive systems as described here, and indeed the human brain and the examples in the following section have been characterized as CAS. However, CAS are often modeled as an evolving rule set, where the rules may be algorithmic at any point in time. Interpretive systems as described here, on the other hand, are not merely evolving or self-updating, but classify in a holistic and nonsymbolic manner rather than as a sequence of if-then statements. We argue this is both a simpler and a more realistic idealization than the self-updating CAS in light of the strategic limitations of algorithms.
  25. This argument is pursued at greater length in Harwick (2025b).
  26. See Devereaux & Koppl (2024) for a more formal treatment of a similar argument.
  27. Greif (2006), however, thinks of institutions as the creation of common knowledge resulting in a self-confirming equilibrium (Fudenberg & Levine 1993), rather than the result of common knowledge. This may weaken the informational requirements for an analysis, but it also excises the question of origin and nature from the study of institutions.
  28. This kind of input-output boundary is sometimes modeled formally as a Markov blanket (Friston 2013; Fields & Levin 2022).
  29. Rubinstein’s (1991) “perceptive interpretation of the notion of a game” is also an attempt to understand state space in a phenomenological manner.
The University in the AI Era

We are well into the process of AI upending higher education. It’s unclear what the university will end up looking like in the AI era – or even if there’s a role for universities at all. I’m confident there is, in principle, but it’ll involve a major retooling at the level of classroom experience. The good news is, at that level, it doesn’t have to wait for administrators. The bad news is, the future of the university as an institution depends on what instructors do today.

The Value Proposition of the University

To imagine the role for the university in the AI era, we need to be clear about what its role is today – and what its role is not. It’s no cynicism to note that the value proposition is not students paying to learn. No doubt a lot of learning does go on in universities. And many people are excited and/or scared about the prospects of one-on-one AI tutors upending the traditional classroom.

But that would be a total misreading of the purpose of higher education. In this respect, LLMs are no more poised to upend higher education than MOOCs were in the late aughts (remember those?). There have been free online courses for whatever you like since 2008, self-taught learners have a vast array of high-quality resources at their disposal, and the threat to universities has been… basically none.

So why pay for a university experience? It’s not just the social experience: after all, people still pay for online classes!

Instead, the value of the university is that a diploma represents a stamp of approval. An employer can look at a diploma and infer certain things about a candidate that would be easy to fake in an interview: things like problem-solving ability, the stamina to follow through on long-term goals, and – yes – the skills learned in class. On the basis of this stamp of approval, employers are willing to pay, on average, 61% more for an employee with a diploma compared to one without, even one who’s only one course shy of a degree. Compared to self-learning, that stamp of approval is easy for employers to verify, and therefore valuable for students to acquire.

So the value of the university lies not in its ability to teach, but in its ability to distinguish students with, from those without, valuable skills and traits. And traditionally, teaching is exactly how it does this. Lectures, exercises, and tests. Students who pass the gauntlet can be pretty reliably assumed by employers to have skills and traits that make them more valuable as employees.

The Real Threat of AI

But a great deal has changed in the past couple years. Even more than Chegg (the cheating clearinghouse where answers would be posted online), ChatGPT makes it easy to breeze through college – easy enough even to automate financial aid fraud at scale. Just plug homework questions, essay prompts, and whatever else, into ChatGPT and be done with it.

From a student’s perspective, getting a diploma is much, much easier now. And professors have been tempted to respond in a few ways:

  1. You’re just cheating yourself, you don’t actually learn anything if you breeze through with ChatGPT. First of all, economists know better than to moralize against the tide of incentives. But more importantly, this is a basic misunderstanding of the value proposition of the university. The university is not in the teaching business, it’s in the certifying business. If AI makes it easier to get that certification, a good student who doesn’t use it is at a disadvantage compared to worse students who do.
  2. Teach the tool. Lean into AI: just as we don’t expect students to do longhand division or remember the spelling of obscure words, it’s silly to be a Luddite and try to prevent students from using something they’ll probably have access to at work anyway. This approach misunderstands the sheer scope of what LLM chatbots can substitute for. The specific skill of longhand arithmetic is less important than broader aptitudes, which could still be reliably evaluated in a variety of different ways even post-calculator. From a student’s perspective, chatbots do not merely represent a reallocation of effort away from tedious aspects and toward more productive endeavors; they threaten to substantially reduce the total effort necessary to get a degree.

AI is therefore not a competitor to higher education, but it doesn’t clearly augment it either. Instead, the real threat is that it’s no longer possible to reliably distinguish good from bad students on any assignment with internet access.

This is an existential threat to the university. For a while, things will look good. Students who otherwise wouldn’t be up to snuff will decide that college isn’t that effortful after all, and classes will fill. But – as we say in economics – solve for equilibrium. Does a diploma post-AI mean the same as a diploma pre-AI? Will an employer be willing to pay that much more for an employee with a diploma, compared to one without?

And if the wage premium falls, it’s the good students who drop out before the bad students. The university enters a death spiral, and there’s no constructive student-facing role for it in the AI era.

The good news is that as long as AI doesn’t collapse the difference in performance between a diligent employee and a less-than-diligent employee, such a certification will continue to be valuable. The question then becomes: will the university – or any other institution – be able to reliably distinguish at lower cost than the eventual performance difference?

The Reverse-Goodhart: A Classroom Suggestion

As a certification, a diploma is a summary of many individual certifications represented by grades in individual classes. It’s at this level that AI is diluting the signal value of an education, and it’s at this level that the problem has to be dealt with.

I’ve noticed over the past few years in my classes that homework grades have risen about 20 points on average compared to when I started teaching. Incredible! But test grades are flat to slightly down. Not only that, but the students who do well on the homeworks are no longer the students who do well on the tests.

That tells me homework is a victim of what’s called Goodhart’s Law. To put it more intuitively than its usual formulation, Goodhart’s Law says that when there’s an informative signal (like a diploma, or homework), and when something important is conditioned on that signal (getting a job, or a good grade), people look for ways to get the signal without putting in the work. “Cheating the system”. And if they succeed, the end result is that the signal doesn’t tell you anything: the diploma doesn’t mean you’re smart, and getting an A on the homeworks doesn’t mean you’ll do well on the test.

A Goodharted signal is worthless unless the signal can be made to stay ahead of the cheaters. Quite simply, AI has already Goodharted homework, and is well on its way to Goodharting diplomas too. All the moralizing in the world can’t hold back the Goodhart tide.
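If you want to see the dilution in miniature, here’s a toy calculation (made-up numbers, obviously): suppose high-productivity types always earn the diploma, and low-productivity types can fake their way to one with some probability. As faking gets easier, the diploma premium shrinks until the credential tells the employer nothing beyond the population average.

```python
def diploma_premium(share_high, p_fake):
    """Productivity gap between diploma-holders and a known low type.
    High types (productivity 2) always earn the diploma; low types
    (productivity 1) fake it with probability p_fake."""
    share_low = 1 - share_high
    holders_high = share_high
    holders_low = share_low * p_fake
    avg_holder = (2 * holders_high + 1 * holders_low) / (holders_high + holders_low)
    return avg_holder - 1

for p in (0.0, 0.5, 0.9):
    print(f"fake rate {p:.0%}: premium {diploma_premium(0.5, p):.2f}")
# 0% -> 1.00; 50% -> 0.67; 90% -> 0.53, and falling
```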

But the flipside of Goodhart’s Law is that if we remove the stakes from a signal, it takes away the incentive to cheat the system.

So my policy going forward is that homework is optional. I’ll still assign it every week, and I’ll strongly recommend doing it as if it were required. I’ll give feedback on everything anyone turns in. But if you’re just going to feed it to ChatGPT, save us both the effort. The only things I’ll actually assign a grade for are things that I can verify were done in class with students’ own brains (this includes classroom participation). And anyone who comes in for a test without having done the homework is guaranteed to fail.

This approach significantly raises the stakes of tests. It violates a longstanding maxim in education, that successful teaching involves quick feedback: frequent small assignments, graded to give students a push to actually do them, that help them gauge how they’re doing. “I’ll do it later” very easily turns into “oops, I never got around to it.” We’ve all been there, and I have a lot of sympathy for that.

Unfortunately, this conventional wisdom is probably going to have to go. If AI makes some aspect of the classroom easier, something else has to get harder, or the university has no reason to exist.

The signal that a diploma sends can’t continue to be “I know things”. ChatGPT knows things. A diploma in the AI era will have to signal discipline and agency – things that AI still lacks and can’t substitute for. Any student who makes it through such a class will have a credible signal that they can successfully resist the temptation to slack, and that they have the self-control to execute on long-term plans.

So my purpose in writing this is twofold: first, for the benefit of my students, to communicate to employers that passing my class is a meaningful signal in this specific way. And second, because the signal value of a diploma (and therefore, indirectly, the wage of a professor) is averaged over the quality of many, many classes, to convince other professors to think carefully about the grounds on which they can maintain their comparative advantage in distinguishing valuable skills even in the AI era.

I’m confident that the university can find a useful role in the AI era. Whether it will depends on us.

The header image was prompted with the cloud having “the wrong number of fingers, in the manner of early AI generated images” for the meta-joke, but in an interesting regression it seems AI has lost the ability to generate screwed up hands.
