Radimentary

"Everything can be made radically elementary." ~Steven Rudich

Training PhD Students to be Fat Newts (Part 2)

Last time, I introduced the concept of the “Fat Newt” (fatigue neutral) build, a way of skilling up characters in Battle Brothers that aims to be extremely economical with the fatigue resource, relying entirely on each brother’s base 15 fatigue regeneration per turn. This choice frees up stat and skill points to distribute evenly among offense, defense and utility. To illustrate, let’s compare two possible ways to build a bro.

The first brother is a Nimbleforged Cleaver Duelist, who wields the weighty and brutish one-handed Orc Cleaver. To successfully attack multiple times a turn, this brother needs sky-high base stats – attack, defense, HP, fatigue, and resolve – and a continuous investment of stat points to reach around 70 maximum fatigue. Furthermore, this build requires a specific suite of offensive and fatigue-recovery perks to function, such as Berserk, Killing Frenzy, Duelist, Cleaver Mastery, and Recover. Only the most seasoned of Hedge Knights and Sellswords can pull this build off consistently.

This brother also needs to stay alive in the thick of battle, so optimally he wears “Nimbleforged” armor, famed medium armors light enough not to eat up your fatigue pool and heavy enough to benefit from the Battleforged defensive perk. You might only drop a set of good Nimbleforged armor once or twice a campaign.

The second brother is a Fat Newt, who requires only 15 maximum fatigue to move and attack once a turn. Practically any brother – from Farmhands to Swordsmasters – who rolls high attack and defense can become a Fat Newt. By ignoring the fatigue stat, this brother can use those saved points to shore up weaknesses in HP and Resolve. And since he needs so little fatigue, he can wear the bulky standard-issue Coat of Plates and wield the mighty two-handed Greataxe.

The Fat Newt also has a lot of slack distributing perk points. Instead of mandatory offensive perks like Berserk and Killing Frenzy, he takes Quick Hands (allowing him to swap to a Longaxe in a pinch to decapitate at range) and Fortified Mind (further bolstering his psychological defenses).

I want to make three salient points about the contrast between “Hero” type brothers like the Nimbleforged Cleaver Duelist on the one hand, and Fat Newts on the other:

  • Heroes are one in a thousand, and Fat Newts are one in ten. Nimbleforged Cleaver Duelists require extraordinary luck and specialized gear to optimize. Fat Newts still have to roll well to function, but there is plenty of slack to shore up weaknesses, so they can basically be mass produced. If you want to build a company of twenty brothers, you cannot fill it with only Heroes unless you are willing to trawl through recruits for hundreds of in-game days.
  • Fat Newts are not just “budget” Heroes, and Heroes do not Pareto dominate Fat Newts. Heroes are stretched so thin that they have real weaknesses and require a lot of babysitting. Generally speaking, Fat Newts will have more survivability and more utility, and they can often act as a menacing offtank to hold key defensive bottlenecks in the battlefield. Their increased utility allows them to save teammates in sticky situations and play more varied roles as the situation demands.
  • The effectiveness of the fatigue stat scales in a complicated nonlinear way. A Nimbleforged Cleaver Duelist with 70 maximum fatigue can chew through a critical flank by himself, fighting on overdrive for four or five turns before tiring out. That time is often enough to decide the flow of the entire battle. The same brother with 35 maximum fatigue is much less than half as effective – he runs out of stamina on turn two, and then stares impotently as the enemy surrounds and overpowers his allies.
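To make the nonlinearity in the last point concrete, here is a toy sketch. It assumes the round numbers from Part 1 (about 15 fatigue per attack, 15 fatigue recovered per turn) and a hypothetical brother who tries to attack twice a turn. It ignores perks, gear, and movement, so it will not reproduce the exact turn counts quoted above; the function name and parameters are illustrative, not anything from the game.

```python
def attacks_per_turn(max_fatigue, turns=5, attack_cost=15, regen=15, max_attacks=2):
    """Toy model: count attacks made each turn under a fatigue cap.

    Assumed rules (a simplification, not the real game engine):
    fatigue starts at 0, drops by `regen` at the start of every turn
    after the first, and an attack is only allowed if it keeps
    accumulated fatigue at or below `max_fatigue`.
    """
    fatigue = 0
    history = []
    for turn in range(turns):
        if turn > 0:
            fatigue = max(0, fatigue - regen)
        attacks = 0
        while attacks < max_attacks and fatigue + attack_cost <= max_fatigue:
            fatigue += attack_cost
            attacks += 1
        history.append(attacks)
    return history


for cap in (70, 35, 15):
    print(cap, attacks_per_turn(cap))
# 70 [2, 2, 2, 1, 1]
# 35 [2, 1, 1, 1, 1]
# 15 [1, 1, 1, 1, 1]
```

In this toy model the 70-fatigue brother gets three extra attacks over the fatigue-neutral baseline, while the 35-fatigue brother gets only one: halving the stat cuts the surplus by far more than half.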

Primarily, my intention with this post is to convey a set of intuitions – derived from over three hundred hours of Battle Brothers – about what it might mean to be a working mathematician, Fat Newt style. 

Young people often learn by imitation and emulation, doubly so when they lose themselves in the maze of interlocking cults of personality that is academia. What ends up happening is that battalions of young mathematicians fixate on superhuman “Hero” types – Terry Tao, Peter Scholze, Andrew Wiles, Alexander Grothendieck and so on – mathematicians imbued with four or five standard deviations of intelligence, work ethic, and monomania, and try to “copy their build.” This turns out to be ineffective, maybe even surprisingly so.

I think there is an inarticulate mass delusion that might be called the “Half-a-Hero Trap.” Just be half as smart, and work half as many hours, as Terry Tao, and otherwise copy his behavior line-by-line, and one can hope to become a quarter as good a mathematician. A quarter-Tao is still an impressive success, after all.

The real scaling laws here are much, much less kind. Setting the murky waters of intelligence scaling aside, let’s talk about productivity. One literal interpretation of fatigue is the number of productive work-shaped hours one can output in a week. If Alice has the capacity to work 70 hours a week, and Bob only 35, Bob is unfortunately much less than half as effective as a researcher. To make the point simply, if Alice and Bob both have 15 hours of teaching a week, then 70 – 15 is more than twice 35 – 15.
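The Alice and Bob arithmetic can be spelled out in a few lines. This is only a sketch of the subtraction above; the function name is made up, and it assumes the crude model that research hours are whatever is left after a fixed teaching load.

```python
def research_hours(weekly_capacity, teaching_load):
    """Hours left for research after a fixed teaching load (never negative)."""
    return max(0, weekly_capacity - teaching_load)


alice = research_hours(70, 15)  # 55 hours
bob = research_hours(35, 15)    # 20 hours

# Alice has twice Bob's capacity but 2.75 times his research hours.
print(alice, bob, alice / bob)  # 55 20 2.75
```

The fixed cost is what does the damage: the larger the teaching load relative to capacity, the worse the ratio gets for Bob.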

Even worse, the best mathematicians are hired to positions with the highest salaries, burdened by the least teaching responsibilities, at universities with the easiest students to manage. Think about the difference in research output between a mathematician who natively works 70 hours a week and only teaches the advanced probability seminar once a year, and the same mathematician who can only work 35 hours a week and teaches three sections of freshman calculus every semester. The difference in available research hours is staggering. The former flourishes and the latter stagnates.

As I wrote in Gravity Turn, the work of getting into orbit is categorically different from the work of staying in orbit. I propose that in a world where almost every PhD student is falling into the Half-a-Hero Trap, there are vastly superior models of skilling up – analogous to the Fat Newt build – that do not look like “imitate the nearest Fields Medalist.” Let me give two examples.

First, time management. Many students who are only capable of working 35 hours a week imitate the outward behavior of mathematicians who work 70. They go to the same number of seminar talks and conferences, spend the same amount of time teaching and grading, and attend the same number of departmental social activities. The well-meaning professor counsels his student to attend five hours of class and five hours of seminars a week to broaden her horizons, oblivious to the sheer fraction of her meager productive hours this sucks up. I suspect this category of error is a font of bad decisions for graduate students.

Second, self-reliance. Just as the Cleaver Duelist may be able to jump into battle alone and mow down his enemies (though even for him this is a dangerous gamble), great mathematicians are often cast as lone geniuses, operating far outside of the capacity and understanding of their peers. Fat Newts, on the other hand, operate best in the middle of the fighting line, holding key positions and working in tandem to control and overwhelm important targets that they would be unable to handle alone. There is a whole separate post to be written about this, but briefly, I think that PhD training systematically overlooks building the skills needed to play a supporting role in a research team.

I must end on a sobering thought – even in the best-case scenario, the Fat Newt style is not a magic bullet that makes every recruit useful. In Battle Brothers, only one in ten brothers are generated with the stats to become an acceptable Fat Newt. My observation is that there are many graduate students, who are not generationally talented, who can only genuinely work 10 or 20 hours a week, if that. For them, I see no clear path forward.

Training PhD Students to be Fat Newts (Part 1)

Today, I want to introduce an experimental PhD student training philosophy. Let’s start with some reddit memes. 

Every gaming subreddit has its own distinct meme culture. On r/chess, there’s a demon who is summoned by an unsuspecting beginner asking “Why isn’t this checkmate?” 

These posts are gleefully deluged by responses saying “Google En Passant” in some form or other. Here’s my favorite variant:

Battle Brothers is an indie game of the turn-based strategy variety about keeping alive a company of muggles – farmhands and fishermen, disowned nobles and hedge knights, jugglers and pimps –  in a low-magic fantasy world filled with goblins, zombies, and dubious haircuts. 

Let me narrow down the kind of game that Battle Brothers is. It is sometimes said that all games are menus or parkour.

Battle Brothers is squarely a game of menus, a game of managing spreadsheets which happens to have graphics.

This is the skill tree for just one out of twenty brothers in a company.

The Battle Brothers subreddit has its own dominant meme, which is a little more mysterious than “Google En Passant.” Introducing … the Fat Newt.

Fat newt is a loving bastardization of the “fatigue neutral” build, which is a way of building brothers to minimize fatigue usage. Fatigue is the stamina/mana resource in this game, and attacking once costs about 15 fatigue. Normal brothers recover 15 fatigue a turn, enough to swing their weapon exactly once.

A trap that the vast majority of new players fall into is to spend too many stat points leveling up fatigue on every brother, in order to build protagonist-energy characters – fencers, berserkers, and swordlancers – who can afford to attack two or even three times a turn. Because they spread stats so thin, these fatigue-intensive builds are extremely demanding, requiring gifted brothers born with extraordinary talents. With all the points that should have gone to defenses and accuracy invested in fatigue, these would-be heroes meet their ignoble ends in the digestive tracts of Nachzehrers and Lindwurms as soon as they miss one too many attacks.

Only one-in-a-hundred brothers have the native talent to be a real hero, dancing across the battlefield like a murder of necrosavants. The community meta that has developed in reaction is the extremely counter-intuitive Fatigue Neutral build, a build that completely ignores the fatigue stat to pump everything else up. You rely entirely on the brother’s base fatigue regeneration to swing only once a turn. In exchange, you get to wear the heaviest armor, wield the biggest axe, and take all the defensive and utility perks that you want. Most importantly, with all this extra slack, while only one in a hundred brothers have the stats to be a hero, one in ten brothers can be a great fat newt.

My first companies were ragtag teams of wannabe heroes, who cut through easy fights like chaff but then got slaughtered in reciprocity when they faced the first real challenge. Then, I did some research and learned the gospel of the fat newt. 

Nowadays, my teams are usually built around the same solid foundation: an impenetrable fighting line of four to six Fatigue Neutral brothers who can stand their ground and decapitate once a turn, supplemented by a few elites and specialists. To my knowledge, Fat Newts are the most salient example of a build defined not by its strengths, but by its weaknesses. They highlight the possibility that under the right conditions, optimizing is primarily about choosing the right dump stat.

Next time, we operationalize the notion of training PhD students to be Fat Newts…

Continuity

I left China at the age of four. My memories of those first four years are scattered impressions: a three-leaf clover I chowed down on while Mom’s back was turned, the smell of a revolting herbal remedy, the time an older girl scratched me on the cheek in daycare.

We went back to visit every few years. The summer before college, I visited my birthplace, Chengdu, to see my grandparents. At some point, we had dinner with a larger group of family friends. Two of the children at that party, as it happened, had been my best friends in day care.

They remembered me. They remembered the daycare we went to, and the street it was on, and the ways our parents were connected. They told me about the group of boys who’d roughly all grown up together: one who excelled in school and was going to Peking University, another who went too deep into League of Legends, another who was currently obsessed with The Three Body Problem. They were so warm, inviting me back home like an old friend who’d always belonged. I was immediately one of the boys – they asked me to translate words they’d heard in American movies, and snickered at the definition of “asshole.” 

To them, I was a thread that had flown off the tapestry of their lives. They picked me right up, dusted me off, and sewed me back in. 

I didn’t remember a damned thing about them – I didn’t even know I had friends in daycare.

After arriving in the US, my family toured the M-states: I moved from Missouri to Maryland at six, and then to Massachusetts at ten or eleven.

Missouri is also barely a splash of impressions: climbing chestnut trees, encountering a proto-psychopath on the schoolbus, sitting for hours “helping” my father fish the Ozarks.

In Maryland I have more substantial memories: holding hands with my best friend William before we learned it was not cool, and then finding out why it was not cool. I remember riding my first bike and then having it stolen by the older kids upstairs, and watching my mom sneak out at night to steal it back. I remember the face of the nasty teacher who gave me a C just to put me in my place. 

My memories come alive right around the time William introduced me to Diablo II. Although neither of our parents allowed us to play more than a couple hours a week, we spent many hours theorycrafting and poring over the official strategy guides. For many years after, I’d boot up Diablo just to recapture that time.

With the benefit of hindsight and the theory of spaced repetition, I understand now why I remember so little of those early years, and why only Diablo remains as fresh as yesterday. After I left China, my daycare chums in Chengdu passed the same streets, met the same elders, played with the same classmates month after month, year after year. Their memories of early childhood were reinforced again and again. They could easily triangulate even my minor, brief role in this world. The brain remembers those patterns that are repeated across time.

I had no such luck. Every few years, the world was switched out for an entirely new stage, with an entirely new cast. There was a surjective function from friends I held dear to days for saying goodbye. For others, life was a single, cohesive drama; for me, it was a series of improv scenes. It is no wonder that my memories are so scattered.

Math majors and PhDs often ask me how to decide between academic and industry jobs. Broadly speaking, these conversations have a common dramatic structure: the student lobs a bomb at me in the form of a mad lib:

Compared to academic jobs, industry jobs are 10x easier to find, pay 10x better, demand half the workload and half the red tape, BUT __.

My job in this drama is to defuse the bomb by filling in the blank with a single intangible value or principle – academic freedom, say – so beautiful that it overwhelms all practical considerations and justifies all the tragedy of academic existence. Some students hurl the bomb at me aggressively – in their heart of hearts they are already checked out of the academy and are looking to verify that the ivory tower is full of shit. Others hand me the bomb timidly, because they are romantics and martyrs at heart – with their eyes, they plead with me to half-ass the answer with anything remotely persuasive. They need something sacred to whisper on their lips as they throw themselves onto the cross of the academic job market.

I’ve always disliked this conversation, until now. I finally know how to fill in the blank, at least in a way that would have persuaded my past self:

Compared to academic jobs, industry jobs are 10x easier to find, pay 10x better, demand half the workload and half the red tape, but continuity.

I’ve been starved for continuity most of my life. My family moved when I was four, and then six, and then ten. Then, I went to college, did a PhD, did a postdoc, and finally landed a tenure-track professorship, moving seven times in 31 years. Seven times the stage was reset and the cast replaced.

How many more would it be if I go to industry? Everyone is moving, all the time. Startups collapse, or are acquired. Entire organizations are shuffled and reshuffled when new directives are delivered from on high. In many places, the best way to get promoted is to jump ship and be hired at a new level. One day, you’re shooting the breeze with the coworker at the next desk over. The next day, the desk is empty.

What do I mean by continuity? The great cathedral of Notre Dame began construction in 1163 and was completed in 1345. Continuity is what I imagine being involved in that project was like: your father, and his father, and so on four generations back, all toiling towards a common cause, a single continuous sacred labor, that ties together every aspect of your life.

I completed my PhD in 2021.

My PhD advisor, Jacob Fox, completed his PhD in 2010. Around half of my research projects come from problems Jacob started thinking about more than a decade ago. I see him practically every year at conferences, workshops, or research visits. He is someone I can trust for advice about anything from career development, to research taste, to advising students. 

Jacob’s PhD advisor, Benny Sudakov, completed his PhD in 1999. Benny is a legendary PhD advisor who has trained and continues to train many outstanding mathematicians. This past summer, I raced Benny in the Random Run, a long-standing tradition of the biennial Random Structures and Algorithms conference. On a standard track, the number of laps in the run is determined by the roll of two dice; the second die is only rolled when the front-runner finishes the first set of laps. In the advisor-student pair category, Benny and his student Aleksa edged out my student Ruben and me for the win. In two years, I hope to be in better shape.

Benny’s PhD advisor, Noga Alon, completed his PhD in 1983. I received Erdős number 2 by spending 2021-2024 as a postdoc working with Noga, who is still sharper than any of us. Together with Joel Spencer, Noga wrote the textbook The Probabilistic Method which I and many others use to train PhD students. Joel has a fun tradition of publishing photos of young children reading The Probabilistic Method on his website. There is a picture of Jacob’s daughter there, as well as one of Noga reading the book to my six-month-old.

This is just one thread of a densely woven tapestry, a community of combinatorialists that traces itself back continuously to the problem-solving circles of Paul Erdős and his university buddies in Budapest. Our story is, I think, not dissimilar to that of the builders of Notre Dame. 

Erdős rolled the dice for the first Random Run in 1983. I pray the dice continue to roll for many years hence.

Dipole Nature

[I am delighted to be visiting Inkhaven for the next four days and will attempt to post every day I’m here.]

There is a Chinese tradition of using water as a metaphor for human nature.

人往高处走,水往低处流
As water flows downwards, people climb upwards. ~ proverb

Here the nature of water is that it inexorably follows the laws of gravity downwards, just as humans follow their incentives. It’s worth noting that social climbing has a negative connotation in modern English; it has a positive valence in this proverb.

水滴石穿
Dripping water penetrates the stone. ~ proverb

Here the nature of water is to be stubborn, patient and relentless, burrowing through even solid rock over centuries.

Be like water making its way through cracks. Do not be assertive, but adjust to the object, and you shall find a way around or through it. If nothing within you stays rigid, outward things will disclose themselves. Empty your mind, be formless. Shapeless, like water. If you put water into a cup, it becomes the cup. You put water into a bottle and it becomes the bottle. You put it in a teapot, it becomes the teapot. Now, water can flow or it can crash. Be water, my friend. ~ Bruce Lee

Here, the nature of water is to be flexible, lightfooted, and agile. Attachment to a fixed solid form makes one vulnerable and brittle. Flowing around obstacles like liquid is the way to resilience.

Given that we are more than 50% water by mass, these metaphors are quite natural. I am here to propose yet another way humans are like water, with the benefit of a little high school chemistry.

A Bit of Remedial Chemistry

I remember AP Chem as a whirlwind of “experiments,” some officially approved, others less so. We hid in the back of the room playing phone games, flicking our fingers through the blue flame of the Bunsen Burner, and dipping our tongues into unknown solutions to test their acidity. My favorite insight from that class derived not from any fancy spectrum or iridescent reaction, but from a simple experiment about holding a charged object near a stream of water (imagine dripping it out of a pipette).

Recall that in H2O, the two H’s tend not to stand exactly opposite each other but in a slightly obtuse triangle with the O, and O is famously electronegative. So, the O end is charged slightly negative, and the H end positive. This is called an electric dipole, a molecule with two differently charged parts that has no net charge altogether.

What happens when a positively charged object is held next to a moving stream of water? Answer: water bends towards the positive charge.

What if instead the object is negatively charged? Answer: water bends towards the negative charge. Electric dipoles are attracted to charged objects, regardless of the charge.

How is it possible for a substance to be attracted to both positive and negative charges? This blew my mind at the time – it violated all the naïve intuitions I’d developed in my twenty minutes contemplating Coulomb’s Law.

Here’s what happens when water stands next to a positive charge.

The positive charge attracts each negative pole, so all the water molecules turn to face it with their negatively charged O side. But now, after they turned, the O’s are a smidge closer to the positive charge than the H’s – so by Coulomb’s law the attraction to each O overwhelms the corresponding repulsion to the H side. Thus, water, though itself uncharged, is (slightly) attracted to positive charges!

What happens if the sign is flipped on the big charge? Exactly the same local dynamic, except the O’s face away now. If the big particle is made negatively charged, all the O’s turn around, and again there is a net attraction towards the charge. This explains why water is attracted to charges of both signs. In some environments, water acts like a positive charge, in others, negative.
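For the skeptical, the argument can be checked numerically with a toy point-charge model. This sketch treats the dipole as two equal and opposite point charges a fixed distance apart, already rotated so the opposite-sign pole faces the big charge. The specific charges and distances are made-up illustrative values, not measured properties of water, and the function name is hypothetical.

```python
K = 8.9875517923e9  # Coulomb constant in N*m^2/C^2


def net_force_on_aligned_dipole(big_charge, q, r, d):
    """Net radial force on a two-point-charge dipole near a big charge.

    The dipole's center sits at distance r from the big charge, its two
    poles (+q and -q) are separated by d, and it is assumed to have
    already rotated so the pole with the sign opposite to big_charge
    is the nearer one. Positive return value means repulsion,
    negative means attraction.
    """
    sign = 1.0 if big_charge > 0 else -1.0
    near_q, far_q = -sign * q, sign * q
    near_r, far_r = r - d / 2, r + d / 2
    # Coulomb's law for each pole; the nearer pole feels the stronger force.
    f_near = K * big_charge * near_q / near_r**2
    f_far = K * big_charge * far_q / far_r**2
    return f_near + f_far


# Illustrative numbers only: a nanocoulomb charge one nanometer away.
f_pos = net_force_on_aligned_dipole(+1e-9, q=1e-19, r=1e-9, d=1e-10)
f_neg = net_force_on_aligned_dipole(-1e-9, q=1e-19, r=1e-9, d=1e-10)
print(f_pos < 0, f_neg < 0)  # True True: attraction in both cases
```

Because the nearer pole always has the opposite sign to the big charge and sits at the smaller distance, its attractive term always wins, whichever sign the big charge has.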

Exercise: what happens when liquid water is replaced with solid ice?

Humans are like water: we have dipole nature. Most of us wear different faces in different company, and that leads to different interactions. Does this mean that we are necessarily two-faced and lacking integrity? Maybe so, but it is worth being very careful about the connotations here. At least in chemistry it is possible for a single internally consistent particle to behave like a positive charge in one environment, and a negative charge in another.

Case Studies

One. When it comes out that a politician, executive, or professor has been abusing their underlings, colleagues and supervisors (i.e. equal and higher-status folks) often come out of the woodwork in their defense: “He was such a lovely guy!” “I couldn’t imagine him being an abuser!” “You must be exaggerating, or lying, or have done something to deserve it!”

We now have common knowledge of the category of error made by these well-meaning defenders: they mistake dipoles for monopoles. Just because he is nice and collegial to you, doesn’t mean he is to everyone.

But does that make the abuser an aberration, a violation of the laws of normal human nature, a scheming, Machiavellian psychopath infinitely beyond the comprehension of Hobbiton? Perhaps not – perhaps dipole nature is just a default behavior for human particles.

Two. Mia gets upset when her childhood friends change. When they start earning adult incomes, their lifestyles inflate. Her friends learn to wield power, and lord it over others. In the presence of the powerful, the famous, and the sexy, they become unprecedentedly obsequious and subservient.

Mia complains to me about all these things. About losing friends, losing faith in humanity, and even losing confidence in her own principles.

I tell her these things should not be surprising, this is how humans are.

She accuses me of being too cynical, of believing that all humans are evil wrapped in a Reese’s cup paper of respectability.

But that’s not how I see it. Humans have dipole nature, and the existence of the negative pole does not invalidate the authenticity of the positive. Her same childhood friends, who are now airheaded clout-chasers and corporate fief-lords, if transplanted into the right environment, might yet turn their poles right round and face the world with the same integrity and wholesomeness Mia remembers fondly from grade school.

Three. I forget the source, but one theory of writing good books is to first write a great sentence, and then fill in the rest of the book around the sentence. One of my favorite pop-psych books, “Sorry I’m Late, I Didn’t Want to Come” by Jessica Pan, is a touching story about overcoming introversion. There is an especially fantastic sentence inside, perhaps great enough to carry the whole book on its own:

Nobody waves, but everybody waves back.

In other words, positivity and pro-social behavior are easily elicited by an initial gesture of goodwill. If you live in a world of dipoles, it is useful to learn to elicit their positive poles. In Pan’s book, this means passersby will freely wave and smile back if you do it first. In math research, this means strangers rarely invite me to initiate new collaborations, but if I make the invitation, they almost always accept.

In a World Full of Dipoles, Be a Monopole

While dipole nature is understandable, comprehensible, and not a bit reprehensible, there is another, rarer way of being. There are those happy few who have monopole nature, who wear a single, immutable face regardless of the company. In a world full of dipoles, the whole social environment reorients itself around monopoles.

There are negative monopoles, of course.

That aunt that criticizes every meal and calls every child fat, in private and in public. Everyone tacitly dislikes but tolerates her for the sake of family. At Thanksgiving, the whole family walks on eggshells around her to minimize the electric potential and keep the peace.

That aggressive, spoiled brat in your son’s kindergarten class who somehow still gets invited to every birthday party. When he comes to the parents’ table to demand a third slice of cake, he is greeted by a sea of smiles. But those smiles do not reach the eyes.

There are also positive monopoles, folks who are relentlessly pro-social and never bend the knee. These are the people rightfully described by words like “courage” and “integrity.” We all have our personal heroes who embody these traits – I will not pollute your image with descriptions of mine. If the first step is to come to terms with our own dipole nature, the next step is to aspire to a positive monopole nature.

But it must be said that even to verify whether one has monopole nature is not easy – it is not enough to check that in this one electric field, the molecule moves like a positive charge. This is why the heroes of the best novels are tested by many ordeals, of a variety of polarities.

Humans are like water; we have dipole nature. I hope this is as useful a metaphor for you as it has been for me.

Optimizing Looks Weird

[Calling it for this November, happy to have gotten a few short posts out.]

For a couple years in my childhood, my mom picked up an obsession that can only be described as extreme couponing. At the time, CVS and Rite Aid offered an enormous variety of discounts and rebates with strange, time-limited conditions and a glaring loophole: most of them stacked upon each other. If you rolled into the store on the date of the correct sale with a hundred dollars worth of coupons and rewards dollars, you could buy out their inventory of certain products without spending a cent, and end up with more ExtraBucks than you’d started with. And so it came to pass that at regular intervals, I’d be called out to the parking lot to help my mom haul in a twenty-year supply of Oral-B toothbrushes or a trunkful of sour cream and onion potato chips. At dinnertime, we’d inevitably be regaled with the story of yet another indignant cashier who called a manager after my mom pulled out her folder of coupons, only to be forced by said manager to apologize to the customer.

Genuinely optimizing looks weird and transgressive.

In StarCraft, there is a trick that every beginner Zerg player learns called the “extractor trick.” The game imposes a cap on the number of soldiers you can build, which is a very severe constraint. The extractor trick lets you surpass this cap by (roughly speaking) manually killing your own soldiers, building new ones, and then resurrecting the dead.

Feynman was famous for – among other things – his method of learning by teaching. I think what he noticed is that much of our collective brainpower is locked inside social cognition, and his method is a way of coopting this inaccessible processing power to learn math and physics. Nowadays, Feynman’s method is common practice: in graduate reading seminars everyone signs up to give a lecture on a topic they know nothing about; in many math classes, the professor learns as they go and stays one week ahead of the class.

Noticing the loopholes in the rules requires curiosity and confidence in one’s own faculties, but that’s only half the battle. Many people notice loopholes and exploit them a tiny bit, like walking out of the convenience store with a free bag of chips. The rest of the battle is how you turn this exploit into a method, a career, or a business: it requires the courage to go all in to exploit these loopholes as far as they’ll go.

Aggro is the Foundation

Today I want to review a concept present in many domains, but most clearly articulated by TCG players. In a game like Magic: The Gathering or Hearthstone, gameplay divides neatly into two phases: deckbuilding and execution. Deckbuilding involves all the choices and calculations that go into preparing your custom deck of cards before you even sit down (or log in) to draw your first hand. Execution is playing your deck, and your particular draws, as well as possible in the moment.

Execution is hard, but essentially learnable. Deckbuilding is the truly difficult and creative part: it requires not just extensive game knowledge and creativity, but a deep understanding of the metagame – what decks other players are likely to bring and how to counter them. One core philosophy that I learned from great deckbuilders is the understanding that aggro decks – fast, simple decks that try to kill the opponent as quickly as possible – are the foundation upon which the entire metagame is built. Of all the kinds of decks, building an aggro deck is the least difficult; usually your choices are limited to cheap, efficient early-game cards that end the game as quickly as possible, and there isn’t a huge amount of room to optimize for metagame.

In contrast, other decks (typically classified as “midrange” or “control” decks) need to be built contextually with aggro in mind – the dominant aggro deck in the metagame sets a pace you have to match. If the fastest aggro deck can end the game in three turns, then you need to have cards you can afford to play in the first three turns. If they play a lot of minions on the board, you need a lot of removal to kill those minions. If they play many damage spells, you need counterspells or healing. In a sense, aggro is the foundation upon which the other layers of the metagame are built in layer upon layer of abstraction.

Many other games have the feature that there are a few pure aggro strategies – which are not necessarily even good strategies – upon which any deep understanding of the game must be built. In StarCraft and other RTS games, the rush strategies that are possible dictate the pace of the game: if the earliest enemy rush can come at 3 minutes and 30 seconds into the game, then your strategy must make sure to start building defenses at 3 minutes (or at least leave the possibility open conditional on scouting information). In real-life geopolitics, war, especially nuclear war, is the foundation: even if no military conflict actually happens, military strength must factor into negotiations at every higher level of consideration. In contrast, the execution of nuclear war is straightforward and unfettered by metagame considerations.

I write all this to suggest that there is also a basic aggro strategy in mathematics research upon which all other metagame considerations must be built, and that strategy is the simplest one of working on a hard technical problem by yourself. You will be told to spend lots of time going to learning seminars and talks, you’ll be told to network and hobnob, and you’ll be told (by me, in the very last post) to attend to meta-considerations and learn how to play a supporting role in research. All of this, however, must rest upon a solid foundation of knowing how to play aggro, how to carry out productive research. Without this foundation, you’ll have no idea what to focus on in a seminar, not a clue what to learn from and ask of other researchers, and only a fuzzy model of what support you can provide to another mathematician trying to carry out their research.

Just as not every Magic player plays aggro decks, not every mathematician needs to do solo research to be effective. But every Magic player needs to know what aggro decks are in play, and every mathematician needs to know how to do solo research. That’s the foundation upon which the entire metagame is built.

Research in Tandem (Part 3)

Today I want to end this discussion about research collaboration with my most useful tip for grad students: build an explicit model of how collaborators work, especially your PhD advisor.

One of your primary goals in graduate school is to set aside 20% of your brain for simulating your advisor, who is typically the best mathematician you are in close contact with. Learn and imitate their reflexes, their tastes, their decision trees. Spend substantial chunks of time during research meetings being curious about minds and modelling how other mathematicians operate. How did they come up with this? What do they know that I don’t? Why did they try this approach first?

Even if this is the only thing you manage to do in grad school, you end up as a low-resolution clone of your advisor – which is not ideal but nevertheless a better-than-average outcome.

Here are seven points of inquiry to jumpstart your quest to model another mathematician.

Research Direction

  1. What problems do they work on?
  2. How do they choose these problems?
  3. How do they weigh the importance of a problem versus aesthetic interest in it, versus the likelihood of actually solving the problem?

Collaboration

  1. Who do they work with most frequently?
  2. What qualities do they praise about their closest collaborators? How is labor usually divided in their collaborations?
  3. By what criteria do they evaluate other mathematicians?

Your relationship

  1. What exactly do they want from you?
  2. Conversely, what exactly do you have to offer them?
  3. Most mathematicians are somewhat motivated by genuine care for young people, but there are pragmatic considerations beyond that. Can you help realize their mathematical vision? Do you carry out humble work that makes their life easier? Are you stimulating and enjoyable to be around?

Patterns of thought

  1. What patterns do you notice in their thinking over time?
  2. What are their common first refrains when working on a problem?
  3. Which pictures, techniques and lemmas do they rely on time and time again to orient themselves?

Weaknesses

  1. What are their glaring weaknesses?
  2. From where you’re standing, are these weaknesses gaps that you can fill, or dump stats that you should deprioritize as well?
  3. Do they ever advise you “do as I say, not as I do”? How seriously should you take such advice?

Origins

  1. How did they get started in math?
  2. Getting into orbit requires different strategies from staying in space; what did they do at the start of their own career?
  3. What mathematicians did they themselves admire and learn the most from?

Work-life balance

  1. What is their working life like?
  2. How much time do they spend on teaching, traveling, and administrative nonsense?
  3. Would you actually want to work a day in their shoes? If not, what would you adjust to make it ideal for you?

Research in Tandem (Part 2)

Today we continue our discussion about research meetings with a few concrete strategies. When I was a new graduate student, I often had long meetings with my PhD advisor and other professors that went completely over my head. Such meetings are extremely demanding: they require a broad base of shared knowledge, they involve carrying out complex calculations and spatial manipulations entirely through verbal communication, and they proceed at a meandering conversational pace that often jumps back and forth between many different approaches and perspectives.

The meta-heuristic underlying all of the following tips is: learn to play a supporting role. Every mathematician wants to be the genius who single-handedly carries the team to the finish line with insight after insight. In contrast, nobody tries to play support. Therein lies an enormous well of untapped potential for you to contribute directly to mathematical inquiry without having the faintest clue what’s going on.

Take notes

Never be afraid to interrupt the flow of conversation to walk up to the blackboard (or pull out paper or a laptop) to draw pictures and note down what’s being said. Just copying down what others are saying might not seem like much of a contribution, but you’ll soon learn the many benefits this practice has.

You help catch mistakes and ambiguities that were skated over in conversation. Taking notes improves your long-term memory and learning. Having a visible log of the history of the meeting frees up precious working memory for you and your collaborators to forge rapidly ahead. Writing things down forces you to develop evocative notation, useful pictures, and modular lemma statements that compress amorphous heuristics into concrete, versatile building blocks. As collaborations extend over weeks, months, and years, everyone will be thanking you later for keeping notes, however half-assed they may be.

If you have nothing to contribute, the first thing you can do is take notes.

Toss Bricks

The Thirty-Six Stratagems are a compilation of aphorisms for war and politics deeply ingrained in Chinese culture, of comparable influence to The Art of War. One of my favorites is 拋磚引玉, which roughly translates as “toss out a brick to lure out the jade.”

If you’re stuck – and you often will be – instead of silently waiting for others to present good ideas, present your own bad ideas. Throw this brick out as a way of baiting insights – which are the jade in this analogy – out from your peers. It’s common knowledge that the fastest way to get a question answered on the internet is to post a wrong answer. The same heuristic applies to research: if a conversation stalls, throw out a brick. Your collaborators will rush to point out all the reasons your approach is wrong and naïve, and how to improve it. Before you know it, beautiful pieces of jade will have appeared in its place.

There is an art to tossing the right bricks. I don’t suggest yelling out “Let’s try category theory” at every turn when there’s no connection whatsoever to the current problem. Best practice for throwing bricks is akin to semi-bluffing in poker: a brick is a hand that is currently useless, but there’s still a chance it might work out on the river. You probably have bad ideas and fuzzy intuitions you’re embarrassed to share that seem very slightly relevant to the problem. Just lower your filters and babble them out.

I can’t count the number of times I’ve opened my mouth to spew out a nonsense thought that didn’t even make syntactical – let alone logical – sense, only for one of my brilliant collaborators to charitably error-correct said sentence into a useful insight. “Ah yes, of course, that’s exactly what I meant,” is usually how I continue this conversation, “But just to be pedantic, could you explain that in more detail?”

If you’re not courageous enough to present a brick as if it’s a genuine insight, preface it with a disclaimer: “So here’s an idea that definitely doesn’t work, but I’d like to figure out why.”

To be continued…

Research in Tandem (Part 1)

I once heard the following story about Szemerédi: His daughter was in elementary school at the time, and her teacher asked everyone to share a bit about their parents’ occupations. The kids went around the room and each said a little story about what their parents did for a living: “My mommy is a doctor, she pulls teeth.” That sort of thing. When it came time at last for Szemerédi’s daughter to share, what she said shocked and worried the teacher: “My daddy just lies in bed and stares at the ceiling all day.”

There’s a stereotype that mathematicians spend all their productive time holed away from the world like this, staring at blank surfaces while intricate equations play out in their minds’ eyes. To the contrary, I find that a substantial fraction of my mathematical progress these days occurs during research meetings, which ideally go like this.

Two to four people sit in a room or Zoom call together, for a meeting slated to last an hour or two. We set our sights on a problem of common interest, and then start bouncing half-formed ideas off each other.

At times, we hit upon a tool or keyword that seems useful, and there’s a flurry of activity as everyone digs through their memories and Google Scholar for relevant literature. If this goes well, we find a relevant paper – invariably a paper about expander graphs – and the meeting devolves into a puzzle hunt where we collectively attempt to decipher the beautiful mathematics painstakingly hidden away in said paper. Eventually, we discover that fifty years ago a physicist solved a special case of our problem, formulated in an entirely different language, and the paper trail ends there.

Other times, one of us lets out a sigh and admits defeat, “This equation is way too difficult to solve, can we at least solve the toy problem where all of the functions are just constants?” This humble simplification draws a gasp of disbelief from the others, “That should be trivial, just apply lemma so-and-so and decomposition thus-and-thus.” We proceed to bully the most junior member of the team – typically a graduate student – into calculating decomposition thus-and-thus live on the blackboard. The idiosyncrasies of the problem turn out to be more intricate than we’d expected at first blush, and the nested summations soon get out of hand. It takes a good half hour before we give up on executing the calculation rigorously, all the time nodding to each other more convinced than ever, “Yes, the decomposition definitely should work. Although it looks a bit messy around here it really has to come down to iterating the Cauchy-Schwarz inequality.”

Finally, as we all agree to go to lunch and promise to check the calculation independently – trivial though it must certainly be – the guy in the corner who’s been silent all meeting finally pipes up, “I think I solved the original problem!” He goes up to the board, erases all the nested sums, and proves the theorem in two lines. Seeing the awe in our faces, he tries to comfort us sheepishly, “Well, I only got the key idea from watching the calculations you all were doing. When you wrote the letter ‘s’ in that curvy way, that’s what gave me the idea of using the Gauss integral.”

Next time, I’ll write about strategies for keeping research meetings productive…

The Fundamental Growth Curve (Part 3)

Last time we introduced the basic model of growth, which looks like this:

The thesis is that noticeable growth is typically punctuated by a long intermediate period of low return on investment, which we call “the Wall.” In the remaining posts on this topic, I plan to cover (a) the common failure modes that arise due to the existence of the Wall, and (b) prescriptions for how to minimize or completely skip over the dreaded Wall.

Failure Modes

Skill and complexity creep

In the year that League of Legends launched, you could become a top player in three months of unstructured training by focusing on one hero and drilling mechanics. The level of play is always this low at the beginning of things. Nowadays, it takes years of dedicated practice and encyclopedic game knowledge to reach the same relative status in the game.

Creep is an ever-present threat to the health of every community of skill: consider the research field where the low-hanging fruit has been picked barren, the video game where the barrier to entry is a hundred-hero-by-hundred-hero table of matchup knowledge, or the industry where the pool of interview questions grows ever more esoteric and adversarial. Left unchecked, the Wall grows higher and higher, until new blood stops bridging the gap altogether and the entire community dies out.

Picking the wrong-sized pond

When I was a child, my parents gave me the typical Asian advice that I should befriend older, smarter kids from whom I had much to learn. Thankfully, I mostly ignored this advice. I’ve seen friends try to follow this strategy, usually to their detriment.

The farther down you start in the status hierarchy, the further away those sweet positional gains become. And imagine, god forbid, that you really imbibe this backwards advice and continuously try to jump up hierarchies to where you don’t belong. You’ll spend your whole life being the odd one out, the real impostor, the weakest link, the lowest-status grunt who is passed up for every opportunity and promotion.

Conversely, it is also possible to be too big of a fish in too small a pond. They say that if you’re always the smartest person in the room, you’re in the wrong room. There’s a different kind of stagnation that happens when you reach the peak of your local hierarchy and don’t search for greener pastures.

Bringing down the Wall

Artificial divisions

It is common knowledge that only children can become chess grandmasters. Neuroplasticity and ability to learn probably play a role, but another important factor is that there exist long and delicate pipelines of positional gains for bootstrapping children through the Wall. For example, in tournaments and classes, young children are carefully subdivided by two-year age bracket and by locality; this artificial partitioning of the population allows for that many more first-place trophies to win and local mini-ladders for kids to climb. A seven-year-old prodigy can start winning games in the county at the under-8 level with only a bit of talent and study, then at the state level, then in the next age bracket, and so on. The positional gains are thus paid to her in installments that keep her coming back. Adults who want to learn chess have no such luck.

Systems for training difficult skills can be optimized by placing people in granular divisions with comparable peers. Conversely, as an individual, one should judiciously hop between ponds to find places where fruitful positional gains are within reach. As a rule of thumb, the sweet spot seems to be rooms where you’re around the 75th percentile.

Hyperspecialization

One way difficult disciplines can prosper is by subdividing in a different way, specializing into mutualistic subdisciplines. A software engineering team might be a hostile, zero-sum competitive environment if everyone is trying to be the best at everything. But suppose the team members each leverage their unique strengths, and you end up with one expert in frontend, one in backend, one who knows how to speak the voodoo language of customers and product managers, and that one machine learning guy. Suddenly everyone has the benefits of high status in their respective domain and access to mentors who can help patch up their weaknesses.

One problem math academia faces today, and part of why the wall called graduate school is so difficult to get over, is the insufficient specialization of labor. Sure, we specialize in subject matter, but whether you’re an algebraic topologist or a knot theorist, you still have to excel at research proper, paper-writing, mentorship, public speaking, etc. etc. Effectively, all these sub-dimensions of competence are projected onto a single massive meta-ladder that is impossibly tall (man am I mixing metaphors today) for the novice to climb.
