<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="/feed.xml" rel="self" type="application/atom+xml" /><link href="/" rel="alternate" type="text/html" /><updated>2026-04-03T03:54:14+00:00</updated><id>/feed.xml</id><title type="html">Digithead’s Lab Notebook</title><subtitle>Blathering about software engineering, programming languages, machine learning, data science, bioinformatics, health-tech, and other techno-babble-icious flapdoodle.</subtitle><author><name>J. Christopher Bare</name></author><entry><title type="html">Music 2025</title><link href="/2026-01-12/music-2025.html" rel="alternate" type="text/html" title="Music 2025" /><published>2026-01-12T00:00:00+00:00</published><updated>2026-01-12T00:00:00+00:00</updated><id>/2026-01-12/music-2025</id><content type="html" xml:base="/2026-01-12/music-2025.html"><![CDATA[<figure style="float: right; height: 55%; width: 55%; margin-left: 1em;">
  <img src="/images/music/scholarly-gentleman-in-space.png" alt="Me listening to tunes in my study" />
  <figcaption style="font-size: small; font-style: italic;">Me enjoying some tunes at the lab</figcaption>
</figure>

<p>The years are short but the days are long. Another year has come and gone at the waning end of a golden age. But still, there’s music.</p>

<p>Ready to queue up the <a href="https://music.apple.com/us/playlist/2025-playlist/pl.u-JPAZEo9FWl6GeE">2025 playlist</a> and read some super deep thoughts on music from the year just gone?</p>

<h2 id="wellington-jazz">Wellington Jazz</h2>

<p>Wellington has had a rough couple of years with the New Zealand government in an austerity phase. Maybe it’s like the cartoon character running in the air off the edge of a cliff, but the Welly music scene is going strong. Someone at the <a href="https://www.wgtn.ac.nz/nzsm">New Zealand School of Music</a> is doing something right, judging by the quality of the musicians associated, in one way or another, with that institution.</p>

<p>This year, when I wasn’t busy at the lab, I got out to witness talented Wellington musicians creating something. Most Sundays, you can start out at the <a href="https://www.undercurrent.nz/">Undercurrent Bookshop</a> in the afternoon then head over to the <a href="https://rogueandvagabond.co.nz/">Rogue and Vagabond</a> for another gig.</p>

<figure style="float: right; margin-left: 2em; margin-top: 1em; margin-bottom: 1em; height: 50%; width: 50%; ">
  <a href="https://www.last.fm/user/cbare/library/albums?from=2025-01-01&amp;rangetype=year" target="_blank" rel="noopener noreferrer">
    <img src="/images/music/music-2025-counts.png" alt="Tracks listened in 2025" />
  </a>
  <figcaption style="font-size: small; font-style: italic;">This chart could probably act as a pretty good proxy for my mood during the months of 2025. As a public service announcement, November sucks in this part of the world. </figcaption>
</figure>

<figure style="float: left; width: 40%; margin-top: 2em; margin-left: 2em; margin-bottom: 2em; ">
  <img src="/images/music/undercurrent.jpg" alt="Undercurrent Bookshop" />
</figure>

<h3 id="wellington-jazz-festival">Wellington Jazz Festival</h3>

<figure style="float: left; width: 40%; margin-top: 1em; margin-right: 2em; margin-bottom: 1em; ">
  <img src="/images/music/wellington-jazz-festival-2025.jpg" alt="Wellington Jazz Festival Banners" />
</figure>

<p>The <a href="https://www.jazzfestival.nz/">Wellington Jazz Festival</a> had a strong year in 2025. The headliners were impressive, for a small town near <em>el culo del mundo</em>, but the local talent really shone. I went to shows at the Rogue, upstairs at Bedlam &amp; Squalor, the Library, Undercurrent, Meow Nui and even the rooftop terrace of a law firm. But there was so much on that you couldn’t possibly see everything.</p>

<p>Next year’s <a href="https://www.jazzfestival.nz/">Jazz Fest will be October 14-18</a>. In the meantime, keep up with what’s on in Welly with their handy <a href="https://www.instagram.com/welljazzfest/">Gig Guides</a>.</p>

<p>So much great music is lost into the air, never to be captured again, or so said Eric Dolphy. We should count ourselves lucky when some of the good stuff gets recorded. A cabal of Wellington Jazzers put out albums in 2025. It’s nice to see a group of musicians that play on each other’s albums, go to each other’s shows and clearly push each other further into inspired sonic territory.</p>

<h3 id="callum-allardice---elementa">Callum Allardice - Elementa</h3>

<figure style="float: right; margin-left: 2em; margin-top: 1em; margin-bottom: 2em; height: 33%; width: 33%; ">
  <img src="/images/music/elementa.jpg" alt="Elementa" />
</figure>

<p>Released: January 24, 2025</p>

<ul>
  <li><a href="https://callumallardice.bandcamp.com/album/elementa">Callum Allardice</a> - composer, guitar</li>
  <li>Luke Sweeting - piano</li>
  <li>Tom Botting - bass</li>
  <li>Hikurangi Schaverien-Kaa - drums</li>
</ul>

<p>Composer and guitar player <a href="https://callumallardice.bandcamp.com/album/elementa">Callum Allardice</a> released a quartet album in 2025 full of good stuff. That follows the 2024 release of big-band album <a href="https://callumallardice.bandcamp.com/album/cinematic-light-orchestra">Cinematic Light Orchestra</a> completed during a <a href="https://www.wgtn.ac.nz/fhss/about/news/2022-composer-in-residence-shines-bright-with-debut-album-release">composition residency at Victoria University of Wellington</a>.</p>

<p>Callum played in several combinations during Jazz Fest that featured some strong but as-yet unreleased material, at least as far as my Googling skills can dig up. Song titles I managed to catch included <em>The Curse</em>, <em>The Right Hand of the Blessed</em> and <em>The Left Hand of the Damned</em>. I hope to hear recordings of these tunes someday soon. Sounds to me like they should be a suite - <em>The Cursed, the Blessed and the Damned</em> - but what do I know?</p>

<p>Allardice and Sweeting also played as part of <a href="https://antipodesband.com/">Antipodes</a>, along with Jake Baxendale (sax), Tom Botting (bass) and Tim Geldens (drums). <em>Good Winter</em> was released back in 2018 with slightly different personnel. This group played Jazz Fest, then did a New Zealand mini-tour that ended up back in Welly sounding tight after a few weeks playing together. We can hope some of that material is sitting on a hard drive somewhere waiting to be released.</p>

<h3 id="louisa-williamson---groundwork">Louisa Williamson - Groundwork</h3>

<figure style="float: right; margin-left: 2em; margin-top: 1em; margin-bottom: 2em; height: 33%; width: 33%; ">
  <img src="/images/music/louisa-williamson-groundwork.jpg" alt="Louisa Williamson - Groundwork" />
</figure>

<p>Released: April 17, 2025</p>

<p><a href="https://www.louisawilliamson.com/">Louisa Williamson</a> is a sax and flute player, band leader, and composer of jazzy, soulful and funkified music. This year’s <a href="https://louisawilliamson.bandcamp.com/album/groundwork">Groundwork</a> follows the excellent <a href="https://louisawilliamson.bandcamp.com/album/what-dreams-may-come">What Dreams May Come</a>, an ambient suite for jazz orchestra taking inspiration from Brian Eno and Maria Schneider.</p>

<ul>
  <li>Louisa Williamson - saxophone/flute/vocals</li>
  <li>Kaito Walley - trombone (In Tune, Lake Glass, Lou Lou)</li>
  <li>Callum Allardice - guitar</li>
  <li>Daniel Hayles - piano</li>
  <li>Johnny Lawrence - bass</li>
  <li>Cory Champion - drums, percussion</li>
  <li>Maarire Brunning-Kouka - vocals (In Tune)</li>
</ul>

<h3 id="clear-path-ensemble---black-sand">Clear Path Ensemble - Black Sand</h3>

<figure style="float: right; margin-left: 2em; margin-top: 1em; margin-bottom: 2em; height: 33%; width: 33%; ">
  <img src="/images/music/clear-path-ensemble-black-sand-vinyl.jpg" alt="Clear Path Ensemble - Black Sand" />
</figure>

<p>Released: May 15, 2025</p>

<p>The official blurb on this one is unbeatable: “Inspired by the deep listening ambient and jazz record bars of Japan, Black Sand continues with Clear Path Ensemble’s jazz-funk fusion sound while folding new elements of minimalism, ambient, techno and library music into a restrained, yet highly exploratory sound world.”</p>

<ul>
  <li>Cory Champion - drums, percussion, vibraphone, guitar, electric bass, rhodes, synthesizers</li>
  <li>Johnny Lawrence - double bass, electric bass</li>
  <li>Daniel Hayles - piano, rhodes, clavinet</li>
  <li>James Illingworth - synthesizer</li>
  <li>Louisa Williamson - flute</li>
  <li>Mike Isaacs - bass clarinet</li>
</ul>

<p>When this crew announced they were playing at the Begonia House, I was skeptical about having a show in a greenhouse in the Botanic Garden. Turns out, it’s a super cool venue and it was a great show. Even the plants were grooving.</p>

<p><img src="/images/music/clear-path-ensemble-begonia-house.jpeg" alt="Clear Path Ensemble at Begonia House" /></p>

<h3 id="daniel-hayles---on-the-grid">Daniel Hayles - On the Grid</h3>

<figure style="float: right; margin-left: 2em; margin-top: 1em; margin-bottom: 2em; height: 33%; width: 33%; ">
  <img src="/images/music/daniel-hayles-on-the-grid.jpg" alt="Daniel Hayles - On the Grid" />
</figure>

<p>Released: October 20, 2025</p>

<p>Whether it’s solo piano or a 17-piece big band, I always leave a <a href="https://www.danielhayles.com/about">Daniel Hayles</a> gig with homework. He has a knack for turning up great material and making it sound amazing in sometimes radically different formats. Digging up the original when you get home is part of the adventure.</p>

<p>Daniel led a fantastic performance of the <em>Other Futures Big Band</em> doing a thing they describe as “turntablism sensibility on a symphonic scale” at Meow Nui. The super-diverse set list featured works by Steely Dan, Notorious B.I.G., Madlib, Mark de Clive-Lowe, South African pianist/composer Nduduzo Makhathini and NZ pianist/composer <a href="https://www.jonathancrayford.com/">Jonathan Crayford</a>.</p>

<p><img src="/images/music/wellington-jazzfest-other-futures.png" alt="Other Futures Big Band" /></p>

<ul>
  <li>Sylvester Green - Trumpet</li>
  <li>Tyaan Singh - Alto Saxophone</li>
  <li>Louisa Williamson - Tenor Saxophone and Flute</li>
  <li>Chris Buckland - Tenor Saxophone</li>
  <li>Matthew Allison - Trombone</li>
  <li>Daniel Hayles - Piano</li>
  <li>Seth Boy - Bass</li>
  <li>Abe Baillie - Drums</li>
  <li>Mana Waiariki - Violin</li>
  <li>Eden Annesley - Violin</li>
  <li>Abby Wheeler - Viola</li>
  <li>Lavinnia Rae - Cello</li>
</ul>

<h2 id="the-outside-world">The Outside World</h2>

<p>Out in the wider world, things are a mess. A world order is collapsing. But still, good music is being made.</p>

<h3 id="gogo-penguin---necessary-fictions">GoGo Penguin - Necessary Fictions</h3>

<figure style="float: right; margin-left: 2em; margin-bottom: 2em; width: 33%;">
  <img src="/images/music/gogo-penguin-necessary-fictions.jpg" alt="Gogo Penguin" />
</figure>

<p>Released: June 20, 2025</p>

<p><em>Necessary Fictions</em> is the seventh studio release from <a href="https://gogopenguin.co.uk/">GoGo Penguin</a>, not counting an album of remixes, a couple of live albums and a slew of singles and EPs. GoGo Penguin’s sound exists on the border between jazz and electronic music, with their latest shifting further towards electronica.</p>

<p>The 2023 release from this crew, <em>Everything is Going to Be OK</em>, carries a message to counter the pervasive feeling of doom.</p>

<h3 id="aaron-parks---by-all-means">Aaron Parks - By All Means</h3>

<figure style="float: right; margin-left: 2em; margin-bottom: 2em; width: 33%;">
  <img src="/images/music/aaron-parks-by-all-means.jpg" alt="Aaron Parks - By All Means" />
</figure>

<p>Released: November 7, 2025</p>

<p>Seattle native pianist Aaron Parks’ <em>By All Means</em> on the Blue Note label is just good straight-ahead jazz. Nothing wrong with that. Parks is a prolific sideman, for instance, playing on guitarist Tom Ollendorff’s tasteful <em>Where in the World</em>.</p>

<ul>
  <li>Aaron Parks - piano</li>
  <li>Ben Solomon - tenor saxophone</li>
  <li>Ben Street - bass</li>
  <li>Billy Hart - drums</li>
</ul>

<h3 id="raven-gnosis">Raven Gnosis</h3>

<figure style="float: right; margin-left: 2em; margin-bottom: 2em; width: 33%;">
  <img src="/images/music/raven-gnosis.jpg" alt="Raven - Gnosis" />
</figure>

<p>Released: February 14, 2025</p>

<p>Not to be confused with the 1980s metal band, San Francisco–based electronic musician Raven brews up a batch of synth pads washed over jazzy chord progressions, equal parts introspective and meditative.</p>

<h3 id="other-good-stuff">Other good stuff</h3>

<ul>
  <li>Artemis - Arboresque</li>
  <li>Shai Maestro - Solo: Miniatures &amp; Tales</li>
  <li>Julia Hülsmann - Under the Surface</li>
  <li><a href="https://www.tomollendorff.com/">Tom Ollendorff</a> - Where in the World</li>
  <li>lvdf EP</li>
  <li>Tourismo - Torque (2023)</li>
  <li>Lucy Clifford - Between Spaces Of Knowing (2024)</li>
  <li>Antipodes - Good Winter (2018)</li>
</ul>

<h4 id="nzso">NZSO</h4>

<p>The <a href="https://www.nzso.co.nz/">New Zealand Symphony</a> is a world class orchestra with a <a href="https://www.stuff.co.nz/entertainment/music/90078705/nzso-at-70-ten-things-you-need-to-know-about-our-symphony-orchestra">colorful 79 years of history</a>. I saw two performances in 2025.</p>

<p>The first I picked based on the presence of <em>Firebird</em> on the program, which was performed with fire and sweetness. I was really impressed by special guest pianist Javier Perianes, and by a piece by Manuel de Falla called <em>Nights in the Gardens of Spain</em>.</p>

<p><strong>Conductor:</strong> Joana Carneiro<br />
<strong>Piano:</strong> Javier Perianes</p>

<ul>
  <li>J Ritchie Papanui Road Concert Overture</li>
  <li>de Falla Nights in the Gardens of Spain</li>
  <li>Ravel Piano Concerto in G Major</li>
  <li>Stravinsky Suite from The Firebird (1945 version)</li>
</ul>

<p>In the second show, André de Ridder conducted three pieces that share the theme of enchantment - Modest Mussorgsky’s <em>Night on Bald Mountain</em>, Paul Dukas’ <em>The Sorcerer’s Apprentice</em> and Igor Stravinsky’s <em>Petrushka</em>.</p>

<h2 id="3-shades-of-blue">3 Shades of Blue</h2>

<p><em>Kind of Blue</em> is one of the great works of American culture, a high water mark. For those interested in the history and the personalities behind the music, <a href="https://jameskaplan.net/3-shades-of-blue/">James Kaplan’s <em>3 Shades of Blue</em></a> is a great read centered around the making of that album. If you’re on Apple Music, check out the <a href="https://music.apple.com/us/playlist/3-shades-of-blue/pl.u-aZb0kg5uVRGo8y">3 Shades of Blue playlist</a>.</p>

<h2 id="end-of-a-golden-age">End of a Golden Age</h2>

<p>So much in the world is moving backwards, degraded by retrograde authoritarianism and tribalism. It’s healthy to intentionally appreciate what humans at our best can achieve, deeply flawed creatures that we are. Wrecking things is easy. Making good things is hard. But creating will always be cooler.</p>

<p><a href="/2025-01-26/music-2024.html">Music 2024</a></p>]]></content><author><name>J. Christopher Bare</name></author><category term="Music" /><category term="music" /><summary type="html"><![CDATA[Me enjoying some tunes at the lab]]></summary></entry><entry><title type="html">Books 2025</title><link href="/2026-01-11/books-2025.html" rel="alternate" type="text/html" title="Books 2025" /><published>2026-01-11T00:00:00+00:00</published><updated>2026-01-11T00:00:00+00:00</updated><id>/2026-01-11/books-2025</id><content type="html" xml:base="/2026-01-11/books-2025.html"><![CDATA[<p><img src="../images/books/reader.jpeg" alt="Old guy reading" /></p>

<ul>
  <li>Babel - R.F. Kuang</li>
  <li><a href="/2025-01-31/orbital.html">Orbital</a> - Samantha Harvey</li>
  <li>Erasure - Percival Everett</li>
  <li>3 Shades of Blue - James Kaplan</li>
  <li>Solaris - Stanislaw Lem</li>
  <li>I’m Starting to Worry About This Black Box of Doom - Jason Pargin</li>
  <li>Tomorrow &amp; Tomorrow &amp; Tomorrow - Gabrielle Zevin</li>
  <li>Player Piano - Kurt Vonnegut</li>
  <li>Hum - Helen Phillips</li>
  <li>Polostan - Neal Stephenson</li>
  <li>Playground - Richard Powers</li>
  <li>In Ascension - Martin MacInnes</li>
</ul>

<p>The standouts, for me, were <em>Orbital</em>, <em>Hum</em>, and <em>Playground</em>.</p>

<p>See also, <a href="/2025-01-04/books-2024.html">Books 2024</a> and <a href="https://www.goodreads.com/user/show/22238686-christopher-bare">GoodReads</a></p>

<h2 id="本を読む">本を読む</h2>
<p>Hon wo yomu - “read a book”</p>

<p>The paper <a href="https://arxiv.org/abs/2007.07506">Learning Visual Context by Comparison</a> describes the “attend-and-compare” module, a light-weight attention mechanism that can be added on to ResNet-style computer vision models. It was published by folks from <a href="https://www.lunit.io/en/">Lunit</a> on arXiv in July 2020 and was a <a href="https://link.springer.com/chapter/10.1007/978-3-030-58558-7_34">conference paper at ECCV 2020</a>.</p>

<p><img src="../images/ai/paper-learning-visual-context-by-comparison.png" alt="Learning Visual Context by Comparison paper, 2020" /></p>

<p>Since Lunit is now the parent company of my present employer, I thought it would be nice to find out something about their cool ideas. First, let’s meet the authors.</p>

<h2 id="authors">Authors</h2>

<ul>
  <li><a href="https://mckim.dev/">Minchul Kim</a> (김민철) - (Lunit: 2018-2021), now at Google.</li>
  <li><a href="https://sites.google.com/view/jongchanpark">Jongchan Park</a> (박종찬) - (Staff Research Scientist at Lunit: 2018–current), organized <a href="https://sites.google.com/lunit.io/iccv25-mmfmc">Multi-Modal Foundation Models for Cancer Detection and Prevention</a>.</li>
  <li><a href="https://seilna.github.io/">Seil Na</a> (나세일) - (Research Scientist, Lunit 2019-current)</li>
  <li><a href="https://scholar.google.com/citations?user=ACimkrMAAAAJ&amp;hl=en">Chang Min Park</a> (박창민) - Radiology, Seoul National University Hospital</li>
  <li><a href="https://dgyoo.github.io/">Donggeun Yoo</a> (유동근) - (Co-founder &amp; Chief of Research at Lunit)</li>
</ul>

<p>Several founders and early Lunitians were lab-mates in a <a href="https://ee.kaist.ac.kr/en/professor/kweon-in-so-2/">group run by In So Kweon at Korea Advanced Institute of Science and Technology</a> (KAIST) in Daejeon, South Korea. KAIST is a public research university founded in 1971, known as the “MIT of South Korea”. This paper builds on <a href="https://arxiv.org/abs/1807.06521">CBAM: Convolutional Block Attention Module</a> (2018) from that lab.</p>

<figure style="height: 55%; width: 55%; margin-left: 8em; margin-top: 2em;">
    <img src="/images/ai/in-so-kweon.png" alt="In So Kweon" class="img-responsive" />
    <figcaption style="font-size: small; font-style: italic;">Professor In So Kweon (권인소) at KAIST Korea Advanced Institute of Science and Technology</figcaption>
</figure>

<h2 id="context">Context</h2>

<p>This paper came during a wave of exploration of CNN architecture variants, especially those incorporating attention mechanisms.</p>

<h3 id="timeline-of-computer-vision">Timeline of Computer vision</h3>

<figure style="margin-left: 1em; margin-top: 2em;">
    <img src="../images/ai/timeline-of-computer-vision.png" alt="timeline of computer vision" class="img-responsive" />
    <figcaption style="font-size: small; font-style: italic;">AI slop timeline of computer vision, 1989-2020</figcaption>
</figure>

<ul>
  <li>Backpropagation Applied to Handwritten Zip Code Recognition, Y. LeCun et al, 1989</li>
  <li>AlexNet, ImageNet Classification with Deep Convolutional Neural Networks, Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, 2012</li>
  <li>Neural Machine Translation by Jointly Learning to Align and Translate, September 2014</li>
  <li>ResNet, December 2015</li>
  <li>YOLO (You Only Look Once), June 2015</li>
  <li>Attention Is All You Need, June 2017</li>
  <li>Vision transformers, October 2020</li>
</ul>

<h3 id="convolutional-neural-networks">Convolutional Neural Networks</h3>

<p>As background, recall that convolutional networks learn feature maps whose complexity increases with depth in the network.</p>

<p><img src="../images/ai/cnn-layers-and-features.png" alt="convolutional neural networks" /></p>

<h2 id="attend-and-compare-method">Attend-and-compare method</h2>

<h3 id="inspiration">Inspiration</h3>

<p>“We paid attention to how radiology residents are trained, which led to the following question: why don’t we model the way radiologists read X-rays? When radiologists read chest X-rays, they compare zones, paying close attention to any asymmetry between left and right lungs, or any changes between semantically related regions, that are likely to be due to diseases.”</p>

<p><img src="../images/ai/acm-chest-x-ray.png" alt="Chest x-ray" /></p>

<p>“In this paper, we present a novel module, called Attend-and-Compare Module (ACM), that extracts features of an object of interest and a corresponding context to explicitly compare them by subtraction, mimicking the way radiologists read X-rays.”</p>

<p><img src="../images/ai/attend-and-compare-module.png" alt="attent and compare module" /></p>

<p>ACM derives a comparison signal from two attention-pooled prototypes, \(K-Q\). The squeeze-and-excitation gate \(P\) then reweights the channels of \(X + (K-Q)\).</p>

<h3 id="computing-the-acm-block">Computing the ACM block</h3>

<p>The ACM block computes \(K\) by projecting the feature map \(X\) with \(W_K\), a 1×1 convolution producing a single-channel score map. A softmax over the spatial locations \((i,j)\) converts the scores into attention weights, and the weighted sum of the features yields a \(C \times 1 \times 1\) vector. \(Q\) is computed the same way.</p>

\[X \in \mathbb{R}^{C \times H \times W}\]

<p>This is equation (2) of the paper:</p>

\[K = \sum_{i,j \in H,W}
    \frac{
        \exp(W_K X_{i,j})
    }{
        \sum_{h,w} \exp(W_K X_{h,w})
    }
    X_{i,j}\]

<p>I think \(K\) and \(Q\) are a little easier to read like this:</p>

\[K = \sum_{i,j}
    \operatorname{softmax}_{(i,j)\in H\times W}\!\big(W_K X\big)_{i,j}\,
    X_{i,j}\]

\[Q = \sum_{i,j}
    \operatorname{softmax}_{(i,j)\in H\times W}\!\big(W_Q X\big)_{i,j}\,
    X_{i,j}\]

\[K, Q \in \mathbb{R}^{C \times 1 \times 1}\]

<p>Subtracting \(K-Q\) highlights contrasting features. It might learn to compare left vs right lung, although specific comparisons are learned rather than baked into the algorithm; anatomical pairing (left-right symmetry) is not enforced unless the data and training encourage it. To get the output of the ACM module, we add the difference back to the feature map \(X\) and apply the gate \(P\).</p>

\[F_{acm}(X) = P(X + (K-Q))\]

<p>…where \(P\) is a squeeze-and-excitation block whose job it is to highlight features that are helpful for solving the task at hand.</p>

\[P = \sigma \circ \text{conv}^{1\times1}_{2} \circ \text{ReLU} \circ \text{conv}^{1\times1}_{1}(\mu)\]
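<p>Putting the pieces together, here is a minimal NumPy sketch of the ACM forward pass as I read it from the equations above. It is a single-group, single-example illustration, not the paper’s code; the weight shapes, the gate’s hidden width, and the stand-alone <code>softmax</code> helper are my assumptions.</p>

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(X, w):
    """Softmax over spatial locations of the 1-channel score map w.X,
    then a weighted sum of features.  X: (C, H, W), w: (C,) -> (C,)."""
    C, H, W = X.shape
    scores = np.tensordot(w, X, axes=1)              # (H, W) score map
    weights = softmax(scores.ravel()).reshape(H, W)  # spatial attention
    return np.tensordot(X.reshape(C, -1), weights.ravel(), axes=1)

def acm(X, w_k, w_q, W1, W2):
    """Attend-and-Compare: F(X) = P(X + (K - Q))."""
    K = attention_pool(X, w_k)                   # pooled prototype, (C,)
    Q = attention_pool(X, w_q)
    Y = X + (K - Q)[:, None, None]               # inject the comparison signal
    mu = Y.mean(axis=(1, 2))                     # squeeze: global average pool
    hidden = np.maximum(W1 @ mu, 0.0)            # 1x1 conv + ReLU
    gate = 1.0 / (1.0 + np.exp(-(W2 @ hidden)))  # 1x1 conv + sigmoid
    return gate[:, None, None] * Y               # excitation: reweight channels
```

<p>In the paper’s module, channels are additionally split into groups with separate attention maps learned per group.</p>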

<h3 id="cnns-with-attention">CNNs with Attention</h3>

<p>The late twenty-teens were a time of exploration of architectures combining attention and convolutions. A nice review covering these kinds of methods is <a href="https://arxiv.org/abs/2204.07756">Visual attention methods in deep learning: An in-depth survey</a> (Hassanin et al., 2022), from which we get this figure:</p>

<p><img src="../images/ai/cnn-variants-2024.png" alt="attention modules for CNNs" /></p>

<p>In the figure, (a) is the squeeze-and-excitation (SE) block. Convolutional Block Attention Module (CBAM), (b), is a channel-and-spatial attention module for learning to attend to informative regions. CBAM is prior work from the same lab at KAIST. In contrast to ACM’s spatial attention pooling, CBAM learns an explicit spatial gating mask \(M_s\in\mathbb{R}^{1\times H\times W}\) that is applied to the feature map.</p>

<h3 id="comparison-with-transformer-style-self-attention">Comparison with transformer-style self-attention</h3>

<p>In transformers, K and Q are projections of individual tokens (or image patches, in vision transformers), used to compute pairwise attention. In ACM, K and Q are globally pooled summaries of two contrasting regions, subtracted to explicitly model comparison. That signal is then injected back into the feature map, highlighting differences.</p>

\[Q = XW_Q,\quad K = XW_K,\quad V = XW_V\]

\[\text{Attention}(Q, K, V)
= \text{softmax}\!\left(\frac{QK^\top}{\sqrt{d}}\right)V\]

<p><img src="../images/ai/transformer-attention-block.png" alt="transformer" /></p>

<p>Transformer-style attention is quadratic in the number of input tokens, computing an \(N \times N\) score matrix that provides pairwise mixing of information. By pooling over the spatial dimensions, ACM is lighter, scaling roughly linearly. \(K\) and \(Q\) are two global attention poolings whose difference is beneficial to recognition tasks. By splitting channels into groups and learning separate attention maps for each group, ACM gains something in the same spirit as multi-headed attention.</p>
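<p>For contrast, a bare-bones NumPy sketch of the scaled dot-product attention formula above; the \(N \times N\) score matrix is exactly where the quadratic cost comes from. The weight shapes are illustrative assumptions.</p>

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention over N tokens.  X: (N, d)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[1])         # (N, N): quadratic in N
    scores -= scores.max(axis=1, keepdims=True)    # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ V                             # (N, d_v)
```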

<h2 id="is-it-still-worth-knowing">Is it still worth knowing?</h2>

<p>The attend-and-compare paper is from 2020 and sits in the “CNN plus attention” niche of the era before vision transformers. It’s a plug-in module for convolutional backbones that (a) learns where to look, and (b) adds channel reweighting. ACM makes explicit the inductive bias that comparison is important by encoding a difference signal and injecting it back into the feature map. ACM fits well with medical imaging tasks where compare-and-contrast is naturally informative.</p>

<p>The focus of research has largely moved on to vision transformers and more recently to multimodal foundation models.  Modern deep learning practice improves performance by training or pretraining more general models on larger datasets. Architecture tweaks matter less than scale. Pragmatically, though, modern CNNs remain competitive with transformers especially where training data is scarce or expensive.</p>

<h2 id="more">More</h2>

<ul>
  <li><a href="https://arxiv.org/abs/2007.07506">Learning Visual Context by Comparison</a></li>
  <li><a href="https://www.youtube.com/playlist?list=PLoROMvodv4rOmsNzYBMe0gJY2XS8AQg16">Stanford CS231N Deep Learning for Computer Vision - 2025</a></li>
</ul>]]></content><author><name>J. Christopher Bare</name></author><category term="AI, Machine Learning, Deep Learning" /><summary type="html"><![CDATA[TL;DR]]></summary></entry><entry><title type="html">Advanced LLM Agents MOOC</title><link href="/2025-05-31/llm-agents-mooc-spring-2025.html" rel="alternate" type="text/html" title="Advanced LLM Agents MOOC" /><published>2025-05-31T00:00:00+00:00</published><updated>2025-05-31T00:00:00+00:00</updated><id>/2025-05-31/llm-agents-mooc-spring-2025</id><content type="html" xml:base="/2025-05-31/llm-agents-mooc-spring-2025.html"><![CDATA[<p><img src="../images/UC-Berkeley-campanile.png" alt="Berkeley Campanile" /> <em>Image by <a href="https://www.flickr.com/photos/john_n_berk_pics_xyz/51096342158/">jngai58</a></em></p>

<p>Notes on <a href="https://llmagents-learning.org/sp25">Advanced Large Language Model Agents, Spring 2025</a>, an online class that picks up where <a href="https://llmagents-learning.org/f24">Large Language Model Agents MOOC, Fall 2024</a> left off. In the Berkeley catalog, it’s <a href="https://rdi.berkeley.edu/adv-llm-agents/sp25">CS294/194-280</a>.</p>

<h2 id="advanced-llm-agents-mooc">Advanced LLM Agents MOOC</h2>

<p>The <em>Advanced LLM Agents</em> course surveys twelve topics in current research, focusing on reasoning, tool use, post-training and inference-time techniques, with applications spanning coding, mathematics, web interaction, and scientific discovery. Going beyond text completion, these models operate in workflows - planning, exploring, and evaluating iteratively.</p>

<p>Training has evolved towards staged recipes. Curated reasoning traces on tasks with verifiable outputs — code and math — serve as grounded reward signals for reinforcement learning, augmenting human feedback.</p>

<p>At inference time, models are augmented with retrieval, memory systems, and tool integration. Reasoning strategies such as chain-of-thought prompting and tree-based search allow models to decompose problems, explore solution spaces and self-correct.</p>

<p>Finally, the course underscores the convergence of LLMs with formal reasoning and scientific discovery. Systems like Lean, a programming language and theorem prover, provide rigor while LLMs supply intuition and abstraction - a powerful combination.</p>

<p>Yet, as model capabilities increase, so does the potential for exploitation. The final lecture promotes security principles for safely deploying increasingly powerful agentic systems.</p>

<h3 id="instructors">Instructors</h3>

<ul>
  <li><a href="https://dawnsong.io/">Dawn Song</a>, Professor, UC Berkeley</li>
  <li><a href="https://jungyhuk.github.io/">Xinyun Chen</a>, Research Scientist, Google DeepMind</li>
  <li><a href="/2024-12-10/llm-agents-mooc.html">Kaiyu Yang</a>, Research Scientist, Meta FAIR</li>
</ul>

<h3 id="topics">Topics</h3>

<ul>
  <li>Inference-time techniques for reasoning</li>
  <li>Post-training methods for reasoning</li>
  <li>Search and planning</li>
  <li>Agentic workflow, tool use, and function calling</li>
  <li>LLMs for code generation and verification</li>
  <li>LLMs for mathematics: data curation, continual pretraining, and finetuning</li>
  <li>LLM agents for theorem proving and autoformalization</li>
</ul>

<h3 id="reading">Reading</h3>

<ul>
  <li>OpenAI blog <a href="https://openai.com/index/learning-to-reason-with-llms/">Learning to reason with LLMs</a></li>
  <li><a href="http://incompleteideas.net/IncIdeas/BitterLesson.html">The Bitter Lesson</a> by Rich Sutton</li>
</ul>

<h3 id="class-resources">Class Resources</h3>

<ul>
  <li><a href="https://discord.com/channels/1280234300012494859">LLM Agents Discord</a></li>
</ul>

<h2 id="lecture-1-inference-time-techniques-for-llm-reasoning">Lecture 1: <em>Inference-Time Techniques for LLM Reasoning</em></h2>

<p><a href="https://jungyhuk.github.io/">Xinyun Chen</a><br />
Google DeepMind</p>

<p><img src="../images/llm-agents/llm-agents.jpg" alt="Large language model agents" /></p>

<p>Solving real-world tasks typically involves a trial-and-error process. Leveraging external tools and retrieving external knowledge expand an LLM’s capabilities. Agentic workflows facilitate complex tasks:</p>

<ul>
  <li>Task decomposition</li>
  <li>Allocation of subtasks to specialized modules</li>
  <li>Division of labor for collaboration</li>
</ul>

<h3 id="lets-think-step-by-step">“Let’s think step by step”</h3>

<p>In chain-of-thought prompting, we allow the model to adapt the amount of computation to the difficulty of the problem. For complex questions, the model can use more reasoning steps. Models can be trained or instructed to use reasoning strategies like decomposition, planning, analogies, etc.</p>

<p>Models can automate prompt design and optimize prompts. They can generate their own exemplars, gaining the benefits of few-shot reasoning without the human effort of writing examples.</p>

<ul>
  <li><a href="https://arxiv.org/abs/2211.01910">Large Language Models Are Human-Level Prompt Engineers</a></li>
  <li><a href="https://arxiv.org/abs/2309.03409">Large Language Models As Optimizers</a></li>
  <li><a href="https://arxiv.org/abs/2310.01714">Large Language Models As Analogical Reasoners</a></li>
</ul>

<h3 id="explore-multiple-branches">Explore multiple branches</h3>

<p>We should not limit the LLM to only one solution per problem. Exploring multiple branches allows the LLM to recover from mistakes, generate multiple candidate solutions, or multiple next steps.</p>

<h4 id="self-consistency">Self-consistency</h4>

<p>Self-consistency is a simple and general principle in which we ask the model for several responses and select the response with the most consistent final answer. Consistency is highly correlated with accuracy.</p>

<ul>
  <li><a href="https://arxiv.org/abs/2203.11171">Self-Consistency Improves Chain of Thought Reasoning in Language Models</a></li>
</ul>
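<p>A minimal sketch of the voting step, assuming we have already sampled several chains of thought and extracted their final answers:</p>

```python
from collections import Counter

def self_consistency(answers: list[str]) -> str:
    """Marginalize over sampled reasoning paths by taking a
    majority vote on the final answers."""
    return Counter(answers).most_common(1)[0][0]

# e.g., five sampled chains of thought ended in these final answers:
samples = ["18", "18", "17", "18", "26"]
best = self_consistency(samples)
```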

<h4 id="sample-and-rank">Sample-and-rank</h4>

<p>Rather than counting, we can instead select the response with the highest log probability. This performs less well than self-consistency, unless the model has been specifically fine-tuned for this purpose.</p>

<p>She showed a nice example of clustering LLM output in the context of code generation.</p>

<p><img src="../images/llm-agents/llm-code-generation.jpg" alt="LLM Code Generations" /></p>

<h4 id="tree-of-thought">Tree of thought</h4>

<p>Using the LLM to compare or rank candidate solutions, or to prioritize exploration of more promising partial solutions, enables us to:</p>

<ul>
  <li>increase token budget for a single solution</li>
  <li>increase width to explore the solution space</li>
  <li>increase depth to refine the final solution over many steps</li>
</ul>
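<p>My sketch of the search skeleton: tree of thought is roughly beam search over partial solutions, where in practice both <code>propose</code> and <code>score</code> would be LLM calls (those names are mine, not from the lecture):</p>

```python
def tree_of_thought(root, propose, score, width=2, depth=3):
    """Beam search over partial solutions.
    propose(state) -> candidate next states (an LLM call in practice)
    score(state)   -> how promising a partial solution looks (also an LLM call)
    """
    beam = [root]
    for _ in range(depth):
        candidates = [nxt for state in beam for nxt in propose(state)]
        if not candidates:
            break
        # keep only the `width` most promising partial solutions
        beam = sorted(candidates, key=score, reverse=True)[:width]
    return max(beam, key=score)
```

With toy propose/score functions this behaves like ordinary beam search; the interesting part in practice is that the LLM does the proposing and the scoring.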

<h2 id="lecture-2-learning-to-self-improve--reason-with-llms">Lecture 2: <em>Learning to Self-Improve &amp; Reason with LLMs</em></h2>

<p>Jason Weston, Meta &amp; NYU</p>

<p>Goal: An AI that “trains” itself as much as possible</p>

<ul>
  <li>Creates new tasks to train on (challenges itself)</li>
  <li>Evaluates whether it gets them right (“self-rewarding”)</li>
  <li>Updates itself based on what it understood</li>
</ul>

<p><img src="../images/llm-agents/rlhf-vs-dpo.jpg" alt="RLHF vs DPO" /></p>

<p>Research question: can this help it become superhuman? Can an LLM improve itself by assigning rewards to its own outputs and optimizing?</p>

<h3 id="self-rewarding-lms">Self-rewarding LMs</h3>

<h4 id="observations">Observations:</h4>

<ul>
  <li>LLMs can improve if given good judgements on response quality</li>
  <li>LLMs can provide good judgements</li>
</ul>

<h4 id="train-a-self-rewarding-language-model-that">Train a self-rewarding language model that:</h4>

<ul>
  <li>Has instruction following capability</li>
  <li>Has evaluation capability, i.e., given a user instruction, one or more responses, can judge the quality of responses, aka LLM-as-judge</li>
</ul>

<p>…then this model can go through an iterative process of data creation/curation and training on the new data, getting better at both instruction following and evaluation in each cycle.</p>

<p><img src="../images/llm-agents/self-rewarding-language-models.jpg" alt="Self-rewarding language models" /></p>
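<p>As I understand it, one iteration of the data-creation loop looks roughly like this (a sketch; <code>model.generate</code> and <code>model.judge</code> are stand-ins for LLM calls):</p>

```python
def self_rewarding_iteration(model, prompts, n_samples=4):
    """One data-creation cycle of a self-rewarding LM (sketch).
    The model generates several responses per prompt, then judges its
    own outputs; a real system would preference-tune (e.g. DPO) on
    the resulting (chosen, rejected) pairs and repeat."""
    pairs = []
    for prompt in prompts:
        responses = [model.generate(prompt) for _ in range(n_samples)]
        ranked = sorted(responses, key=lambda r: model.judge(prompt, r))
        # lowest-judged response becomes "rejected", highest "chosen"
        pairs.append((prompt, ranked[-1], ranked[0]))
    return pairs
```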

<h2 id="lecture-3-on-reasoning-memory-and-planning-of-language-agents">Lecture 3: On Reasoning, Memory, and Planning of Language Agents</h2>

<p>Yu Su, Ohio State University</p>

<p><img src="../images/llm-agents/yu-su-language-agents-framework.jpg" alt="Language agent framework" /></p>

<h3 id="memory">Memory</h3>

<p>Memory is central to human learning. We recognize patterns and associations relevant to the current context. <a href="https://arxiv.org/abs/2405.14831">HippoRAG</a> uses a learned associative concept map as an index for non-parametric (in-context) learning, though large embedding models perform at least as well.</p>

<h3 id="reasoning">Reasoning</h3>

<p>Can LLMs learn compositional and comparative reasoning?</p>

<p>Examples:</p>

<ul>
  <li>Barack’s wife is Michelle. Michelle was born in 1964. When was Barack’s wife born?</li>
  <li>Trump is 78. Biden is 82. Who is younger?</li>
</ul>

<p><a href="https://arxiv.org/abs/2405.15071">Grokked transformers are implicit reasoners</a>: extended training far beyond overfitting enables reasoning without prompting or fine-tuning.</p>

<h3 id="planning">Planning</h3>

<p>Simplified definition: given a goal \(G\), decide a sequence of actions \((a_0, a_1, \ldots, a_n)\) that will lead to a state that passes the goal test \(g(\cdot)\).</p>

<p>General trends in planning settings for language agents:</p>

<ul>
  <li>Increasing expressiveness in goal specification, e.g., in natural language as opposed to formal language</li>
  <li>Substantially expanded or open-ended action space</li>
  <li>Increasing difficulty in automated goal test</li>
</ul>

<p><a href="https://language-agent-tutorial.github.io/">Language Agents tutorial</a></p>

<h2 id="lecture-4-open-training-recipes-for-reasoning-in-language-models">Lecture 4: Open Training Recipes for Reasoning in Language Models</h2>

<p><a href="https://homes.cs.washington.edu/~hannaneh/">Hanna Hajishirzi</a>, University of Washington, Allen AI Institute Ai2</p>

<p>It’s critical for research that there be open frontier models with a training process that is transparent and reproducible. Ai2 has produced open pretrained LLMs - OLMo, OLMo 2, OLMoE - and an open post-training process called Tulu. Her talk covered open recipes for training LLMs and reasoning models.</p>

<h3 id="overview-of-open-recipe-for-training-llms-and-reasoning-models">Overview of open recipe for training LLMs and reasoning models</h3>

<p><img src="../images/llm-agents/tulu-post-training-recipe.png" alt="Open training recipe" /></p>

<h3 id="data-curation">Data curation</h3>

<p>Training sets for proprietary models are kept secret, not least due to use of copyrighted material. Open models require a training set free of legal conflicts.</p>

<p><img src="../images/llm-agents/tulu-datasets-graphic.jpg" alt="Datasets graphics" /></p>

<p><img src="../images/llm-agents/tulu-datasets.jpg" alt="Datasets table" /></p>

<h3 id="open-post-training-recipe">Open post training recipe</h3>

<p>Post-training consists of three strategies: instruction fine-tuning, preference tuning (RLHF or RLAIF), and reinforcement learning with verifiable rewards (RLVR).</p>

<p><img src="../images/llm-agents/tulu-post-training-process.jpg" alt="Open post-training recipe" /></p>

<h3 id="preference-tuning">Preference tuning</h3>

<p>Preference tuning takes a base model oriented to document completion and improves its ability to follow instructions, hold a conversation, and perform reasoning tasks.</p>

<h4 id="proximal-policy-optimization">Proximal policy optimization</h4>

<p>Proximal Policy Optimization (PPO; Schulman et al., 2017) first trains a reward model and then uses RL to optimize the policy to maximize those rewards.</p>

\[\max_{\pi_{\theta}} \mathbb{E}_{x \sim \mathcal{D}, y \sim \pi_{\theta}(y \mid x)} \left[ r_{\phi}(x, y) \right] 
- \beta \mathbb{D}_{\text{KL}} \left[ \pi_{\theta}(y \mid x) \,||\, \pi_{\text{ref}}(y \mid x) \right]\]

<h4 id="direct-policy-optimization">Direct preference optimization</h4>

<p>Direct Preference Optimization (DPO; Rafailov et al., 2024) directly optimizes the policy on the preference dataset; no explicit reward model.</p>

\[\mathcal{L}_{\text{DPO}}(\pi_{\theta}; \pi_{\text{ref}}) = 
- \mathbb{E}_{(x, y_w, y_l) \sim \mathcal{D}} 
\left[ \log \sigma \left( \beta \log \frac{\pi_{\theta}(y_w \mid x)}{\pi_{\text{ref}}(y_w \mid x)} 
- \beta \log \frac{\pi_{\theta}(y_l \mid x)}{\pi_{\text{ref}}(y_l \mid x)} \right) \right]\]
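<p>Since the loss depends only on four log-probabilities, it is easy to compute directly. A sketch for a single preference pair:</p>

```python
import math

def dpo_loss(pi_w, pi_l, ref_w, ref_l, beta=0.1):
    """DPO loss for one (chosen, rejected) pair, given log-probabilities
    of the chosen (y_w) and rejected (y_l) responses under the policy
    (pi_*) and frozen reference model (ref_*):
      -log sigmoid(beta * ((pi_w - ref_w) - (pi_l - ref_l)))"""
    margin = beta * ((pi_w - ref_w) - (pi_l - ref_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

Sanity check: when the policy equals the reference, the margin is zero and the loss is log 2; raising the chosen response’s log-probability lowers the loss.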

<h4 id="simpo">SimPO</h4>

<p>SimPO (Meng et al., 2024) does not use a reference model.</p>

\[\mathcal{L}_{\text{SimPO}}(\pi_{\theta}) = 
- \mathbb{E}_{(x, y_w, y_l) \sim \mathcal{D}} 
\left[ \log \sigma \left( \frac{\beta}{|y_w|} \log \pi_{\theta}(y_w \mid x) 
- \frac{\beta}{|y_l|} \log \pi_{\theta}(y_l \mid x) - \gamma \right) \right]\]

<h4 id="grpo">GRPO</h4>

<p>Group relative policy optimization (GRPO) was introduced in the DeepSeekMath paper and used to train DeepSeek-R1. For each question \(q\), GRPO samples a group of outputs \(\{o_1, o_2, \ldots, o_G\}\) from the old policy \(\pi_{\theta_{\text{old}}}\) and then optimizes the policy model \(\pi_{\theta}\) by maximizing the following objective:</p>

\[\mathcal{J}_{GRPO}(\theta) = \mathbb{E} \left[ \frac{1}{G} \sum_{i=1}^{G} 
\min \left( \frac{\pi_{\theta}(o_i \mid q)}{\pi_{\theta_{\text{old}}}(o_i \mid q)} A_i, 
\text{clip} \left( \frac{\pi_{\theta}(o_i \mid q)}{\pi_{\theta_{\text{old}}}(o_i \mid q)}, 1 - \epsilon, 1 + \epsilon \right) A_i \right) 
\right] - \beta \mathbb{D}_{\text{KL}} \left( \pi_{\theta} \| \pi_{\text{ref}} \right)\]

\[A_i = \left( \frac{r_i - \bar{r}}{\sigma_r} \right)\]

<p>Or, as a loss to be more comparable to the other preference tuning losses:</p>

\[\mathcal{L}_{GRPO}(\theta) = - \mathbb{E} \left[ \frac{1}{G} \sum_{i=1}^{G} \hat{A_i} \right] + \beta \mathbb{D}_{\text{KL}} \left( \pi_{\theta} \| \pi_{\text{ref}} \right)\]

<p>…where \(\hat{A_i}\) is the <em>clipped policy advantage</em> of the <em>ith</em> output in the group.</p>

<p>Each output is assigned a reward \(r_i\), and the rewards are normalized within the group by subtracting the mean and dividing by the standard deviation. This group-based reward normalization is more efficient than PPO’s approach because it eliminates the need for a separate learned value function, and it appears effective in practice.</p>

<p>I’m still working on understanding GRPO, so don’t take this as gospel.</p>
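<p>The advantage computation itself is simple. Here is my sketch of it; details in the paper may differ (e.g., sample vs. population standard deviation, or how a zero-variance group is handled):</p>

```python
import statistics

def grpo_advantages(rewards):
    """Group-relative advantages: normalize each output's reward by the
    group's mean and standard deviation. No learned value function needed."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against zero std
    return [(r - mean) / std for r in rewards]
```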

<h3 id="mid-training">Mid-training</h3>

<p>Given the observation that post-training yields larger improvements on more capable base models, how do we improve the reasoning capabilities of base models? A mid-training stage can be inserted at the end of pre-training: next-token prediction on curated high-quality data, including human-curated reasoning traces and math and coding problems with verifiably correct answers.</p>

<h2 id="lecture-5-coding-agents-and-ai-for-vulnerability-detection">Lecture 5: Coding Agents and AI for Vulnerability Detection</h2>

<p><a href="https://homepages.inf.ed.ac.uk/csutton/">Charles Sutton</a>, Google DeepMind</p>

<h3 id="evaluating-coding-agents">Evaluating coding agents</h3>

<p><a href="https://www.swebench.com/">SWE-bench</a> is a dataset that tests systems’ ability to solve GitHub issues automatically. The dataset collects 2,294 Issue-Pull Request pairs from 12 popular Python repositories. Evaluation is performed by unit test verification using post-PR behavior as the reference solution.</p>

<p><a href="https://swe-agent.com/latest/">SWE-agent</a> lets your language model of choice (e.g. GPT-4o or Claude Sonnet 3.5) autonomously use tools to fix issues in real GitHub repositories, crack cybersecurity challenges, etc.</p>

<h3 id="the-react-loop">The ReAct loop</h3>

<p>Repeat until timeout, error, or success:</p>
<ul>
  <li>LLM generates text given current trajectory</li>
  <li>Run tools from LLM output</li>
  <li>Append tool output to trajectory</li>
</ul>
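<p>The loop above can be sketched in a few lines. Here <code>llm</code>, the tool registry, and the <code>CALL</code>/<code>FINISH</code> conventions are my own stand-ins, not SWE-agent’s actual protocol:</p>

```python
def react_loop(llm, tools, task, max_steps=10):
    """Minimal ReAct skeleton: the LLM alternates between generating
    thoughts/actions and observing tool output, with every step
    appended to the growing trajectory."""
    trajectory = [task]
    for _ in range(max_steps):
        step = llm("\n".join(trajectory))        # generate given trajectory
        trajectory.append(step)
        if step.startswith("FINISH"):
            return trajectory
        if step.startswith("CALL "):
            name, arg = step[5:].split(":", 1)
            trajectory.append(tools[name](arg))  # append tool output
    return trajectory
```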

<p><a href="https://arxiv.org/abs/2501.07531v1">Passerine</a> is Google’s coding agent for bug fixes.</p>

<h3 id="ai-for-computer-security">AI For computer security</h3>

<p>LLMs playing capture the flag. Agentic techniques seem a particularly natural fit. He listed datasets and a few challenge competitions.</p>

<h2 id="lecture-6-web-agents">Lecture 6: Web Agents</h2>

<p>Ruslan Salakhutdinov, CMU/Meta</p>

<p>Rus spoke on multimodal autonomous AI agents for performing tasks on the web. The agents were trained and evaluated in mock environments, with mock shopping, search, etc.</p>

<p>Tree search benefits accuracy in these contexts, but is slow and suffers from hard-to-undo actions. Synthetic data can also help - use Llama to generate and verify synthetic agentic tasks.</p>

<h3 id="common-vs-rare-cases">Common vs rare cases</h3>

<p>A general principle, maybe: training techniques that work well for common cases, where there are lots of labeled examples to train on, differ from techniques that work well in rare cases.</p>

<p>Examples:</p>

<ul>
  <li>Synthetic data can cover the whole problem space evenly. Human-curated real-world data covers limited parts of the problem space.</li>
  <li>CBOW vs skip-grams in word2vec: skip-grams seem to be helpful for rare words.</li>
</ul>

<h3 id="multistep-tasks">Multistep tasks</h3>

<p>AI agents are brittle on multistep tasks due to compounding errors. Once they go wrong, they have difficulty recovering. Work is needed on self-correction and recovery.</p>

<h3 id="common-failure-modes">Common failure modes</h3>

<ul>
  <li>long horizon reasoning and planning</li>
  <li>getting stuck looping or oscillating</li>
  <li>correctly performing tasks then undoing them</li>
  <li>stopping exploration or execution too early</li>
</ul>

<h3 id="attacks">Attacks</h3>

<p>One attack on web agents is to hide instructions in HTML comments, figure captions, etc.</p>
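<p>One simple (and by itself insufficient) mitigation is sanitizing retrieved content before it reaches the model, e.g. stripping the HTML comments where such instructions can hide. A sketch:</p>

```python
import re

def strip_html_comments(html: str) -> str:
    """Remove HTML comments, one channel for indirect prompt injection.
    Illustrative only; a real defense needs defense-in-depth, not a
    single regex filter."""
    return re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
```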

<h3 id="reinforcement-learning-framing">Reinforcement learning framing</h3>

<p>The web agent setting can be framed as a POMDP - a partially observable Markov decision process:</p>

\[\mathcal{E} = \{\mathcal{S}, \mathcal{A}, \mathcal{O}, \mathcal{T}\}\]

<p>…where \(\mathcal{S}\) is the set of states, \(\mathcal{A}\) the set of actions, \(\mathcal{O}\) the set of observations, and \(\mathcal{T}: \mathcal{S} \times \mathcal{A} \rightarrow \mathcal{S}\) the transition function.</p>
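<p>As a toy example of the transition function, with states and actions of my own invention:</p>

```python
# Deterministic transition function T: S x A -> S for a two-page "web".
T = {
    ("home", "click_search"): "results",
    ("results", "click_back"): "home",
}

def step(state, action):
    """Apply an action; invalid actions leave the state unchanged."""
    return T.get((state, action), state)
```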

<h2 id="lecture-7-multimodal-agents">Lecture 7: Multimodal Agents</h2>

<p>Caiming Xiong, Salesforce AI Research</p>

<p>Another talk on multimodal agents for doing computer-based tasks. <strong>OSWorld</strong> is a virtual machine environment in which agents can perform tasks like coding, data analysis, making slides, manipulating images, and editing documents. Tasks come with evaluation scripts, a process they call execution-based evaluation.</p>

<h3 id="data-challenges-for-agent-training">Data Challenges for Agent Training</h3>

<ul>
  <li>Agent models require expensive human annotation to collect agent trajectory data.</li>
  <li>This contrasts with LLMs, which leverage existing text corpora.</li>
  <li>Human annotation is time-consuming, costly, and limits scalability.</li>
  <li>The cost and complexity of human annotation make it difficult to collect diverse and large-scale agent trajectory data.</li>
</ul>

<p>Why not let models synthesize their own training data? Their approach is agent trajectory synthesis via guided replay of web tutorials.</p>

<p>CoTA: Chains-of-Thought-and-Action</p>

<h2 id="lecture-8-alphaproof---when-rl-meets-formal-maths">Lecture 8: AlphaProof - When RL meets Formal Maths</h2>

<p>Thomas Hubert, Google DeepMind</p>

<p>Mathematics is a root node of intelligence. Being good at math requires:</p>

<ul>
  <li>Reasoning &amp; Planning</li>
  <li>Generalisation &amp; Abstraction</li>
  <li>Knowledge &amp; Creativity</li>
  <li>Open ended &amp; Unbounded complexity</li>
  <li>Even an eye for beauty</li>
</ul>

<p>Formalisation in Mathematics provides:</p>

<ul>
  <li>Rigor and clarity</li>
  <li>Efficiency and Communication</li>
  <li>Abstraction and Generalisation</li>
  <li>Unification</li>
  <li>The creation of new fields</li>
</ul>

<h3 id="lean">Lean</h3>

<p>Lean is a programming language, theorem prover, and interactive proof assistant. It is a digital programmable formalism for mathematics that aims to be general and unified.</p>

<h3 id="reinforcement-learning">Reinforcement learning</h3>

<p>Following in the footsteps of AlphaGo and AlphaFold, they hope to RL their way to superhuman math ability.</p>

<p>“If an agent can learn to master an environment tabula rasa, then you demonstrably have a system that is discovering and learning new knowledge by itself.”</p>

<p>He gives the following recipe for making a system superhuman:</p>

<ul>
  <li>Scaled up trial and error</li>
  <li>Grounded feedback signal</li>
  <li>Search</li>
  <li>Curriculum</li>
</ul>

<p>Lean gives us an environment in which to scale up trial and error with a grounded feedback signal - perfect proof verification.</p>

<p>Given a hard problem, they generate many variants of the problem, some of which will be easier to solve. They use “test-time RL” to iterate from easier variants of the problem towards the original hard problem. Not sure I got this part.</p>

<p>The slides have a couple of very cool demos, which he skipped over. Maybe I can find those online somewhere.</p>

<h2 id="lecture-9-language-models-for-autoformalization-and-theorem-proving">Lecture 9: Language models for autoformalization and theorem proving</h2>

<p>Kaiyu Yang, Meta FAIR</p>

<p>LLMs are frequently evaluated on math and coding tasks because they are important examples of reasoning and are relatively easy to evaluate. Math and code are deeply connected. See: <a href="https://epoch.ai/frontiermath">Epoch AI’s FrontierMath</a> benchmark.</p>

<h3 id="training-llms-for-math">Training LLMs for math</h3>

<p><img src="../images/llm-agents/training-math-llms.jpg" alt="SFT for math models" />
<em><a href="https://huggingface.co/blog/winning-aimo-progress-prize">How NuminaMath Won the 1st AIMO Progress Prize</a>, Fleureau et al, 2024</em></p>

<ul>
  <li>Supervised finetuning on mathematical data - MathOverflow pages or papers from arXiv.</li>
  <li>SFT on problems with step-by-step solutions.</li>
  <li>SFT on problems with tool-integrated solutions.</li>
  <li>RL on problems with verifiable solutions but no intermediate steps.</li>
</ul>

<p>See paper: <a href="https://arxiv.org/abs/2412.16075v1">Formal Mathematical Reasoning: A New Frontier in AI</a></p>

<h3 id="lean-dojo">Lean Dojo</h3>

<p>Kaiyu Yang was first author on <a href="https://arxiv.org/abs/2306.15626">LeanDojo: Theorem Proving with Retrieval-Augmented Language Models</a> (2023), in which they use vector similarity retrieval from a library of lemmas.</p>

<p><img src="../images/llm-agents/proof-search.jpg" alt="Proof search" /></p>
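<p>The retrieval step is essentially nearest-neighbor search over lemma embeddings. A sketch with made-up lemma names, using fixed vectors as a stand-in for a real embedding model:</p>

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def retrieve_lemmas(goal_vec, lemma_vecs, k=2):
    """Return names of the k lemmas whose embeddings are most
    similar to the embedding of the current proof goal."""
    ranked = sorted(lemma_vecs,
                    key=lambda name: cosine(goal_vec, lemma_vecs[name]),
                    reverse=True)
    return ranked[:k]
```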

<h3 id="autoformalization">Autoformalization</h3>

<p>Bridging between informal and formal mathematical expressions. See: <a href="https://deepmind.google/discover/blog/alphageometry-an-olympiad-level-ai-system-for-geometry/">AlphaGeometry: An Olympiad-level AI system for geometry</a> and <a href="https://huggingface.co/blog/winning-aimo-progress-prize">Autoformalizing Euclidean Geometry</a> in which they translate 48 problems from Book 1 of Euclid’s Elements into Lean.</p>

<p>Challenges in autoformalization:</p>

<ul>
  <li>Human written proofs are full of holes - reasoning gaps</li>
  <li>Evaluation is difficult</li>
</ul>

<p>These are doable within limited domains, but how do you generalize across domains?</p>

<h2 id="lecture-10-bridging-informal-and-formal-mathematical-reasoning">Lecture 10: Bridging Informal and Formal Mathematical Reasoning</h2>

<p>Sean Welleck, CMU</p>

<p>Sean introduced Lean-STaR, an RL system that trains LMs to produce mathematical proofs by interleaving informal thoughts with steps of the proof, sampling lots of continuations of the proof, and getting a reward signal from Lean’s proof verification.</p>

<p><img src="../images/llm-agents/automated-theorem-proving.jpg" alt="Automated Theorem Prover workflow" /></p>

<p>Two approaches:</p>

<ul>
  <li>Draft-sketch-prove</li>
  <li>LeanHammer</li>
</ul>

<p>A hammer theorem prover is a tool that uses external automated theorem provers (ATPs) to help find proofs within a larger proof assistant system.</p>

<p>Related:</p>

<ul>
  <li><a href="https://terrytao.wordpress.com/2023/11/18/formalizing-the-proof-of-pfr-in-lean4-using-blueprint-a-short-tour/">Formalizing the proof of PFR in Lean4 using Blueprint</a> by Terence Tao</li>
  <li><a href="https://github.com/cmu-l3">CMU Learning Language and Logic lab on github</a></li>
</ul>

<h2 id="lecture-11-abstraction-and-discovery-with-large-language-model-agents">Lecture 11: Abstraction and Discovery with Large Language Model Agents</h2>

<p>Swarat Chaudhuri, UT Austin</p>

<h3 id="mathematical-discovery-with-llm-agents">Mathematical Discovery with LLM Agents</h3>

<h4 id="challenges-with-nns-for-proof-generation">Challenges with NNs for proof generation</h4>

<p><strong>Data scarcity</strong></p>
<ul>
  <li>Need traces or reward functions that enable rigorous mathematical reasoning</li>
  <li>This is difficult beyond high-school or competition settings.</li>
</ul>

<p><strong>Lack of verifiability</strong></p>
<ul>
  <li>Natural-language reasoning is hard to verify</li>
  <li>In applications like system verification, edge cases are especially critical.</li>
</ul>

<h4 id="in-context-learning-for-theorem-proving">In-context learning for theorem proving</h4>

<p><img src="../images/llm-agents/copra.jpg" alt="Copra - In-context learning for theorem proving" /></p>

<ul>
  <li>LLMs and in-context learning are powerful tools for mathematical reasoning.
    <ul>
      <li>RAG over lemmas.</li>
      <li>Feedback from theorem prover, Coq or Lean.</li>
    </ul>
  </li>
  <li>Similar techniques work for program verification.</li>
</ul>

<h3 id="ai-for-scientific-discovery">AI for Scientific Discovery</h3>

<p><img src="../images/llm-agents/scientific-progress.jpg" alt="Scientific progress" /></p>

<h4 id="symbolic-regression---llm-agents-for-empirical-discovery">Symbolic Regression - LLM Agents for Empirical Discovery</h4>

<p>How do we derive a physical law from observed data? Genetic algorithms are often used for symbolic regression. But what if we use an LLM to generate candidate programs, playing the role of the cross-over step in evolutionary algorithms?</p>

<p><img src="../images/llm-agents/keplers-third-law.jpg" alt="Kepler’s third law" /></p>

<ul>
  <li><strong>LLM-directed evolution</strong> is a powerful tool for empirical scientific discovery.</li>
  <li>Frontier LLMs inject prior world knowledge into mutation/crossover operators.</li>
  <li>LLMs can be used to learn abstract concepts that accelerate evolution.</li>
  <li>All this can be applied to settings with visual inputs as well.</li>
</ul>
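<p>A toy version of the evolutionary loop, recovering the exponent in a Kepler-style power law. In LLM-directed evolution, an LLM would play the role of <code>mutate</code> (here just a dumb increment); everything below is my own illustration, not from the lecture:</p>

```python
def evolve(candidates, mutate, fitness, generations=10, keep=2):
    """Toy evolutionary loop for symbolic regression: propose variants
    of the current candidates, keep the best fits (lower error wins)."""
    for _ in range(generations):
        pool = candidates + [mutate(c) for c in candidates]
        candidates = sorted(pool, key=fitness)[:keep]
    return candidates[0]

# Recover p in T^2 = a^p from (a, T) data generated with p = 3,
# i.e. Kepler's third law: T = a**1.5.
data = [(1.0, 1.0), (4.0, 8.0)]

def fitness(p):
    """Squared error of the candidate exponent against the data."""
    return sum((t * t - a ** p) ** 2 for a, t in data)

best = evolve([1.0, 2.0], lambda p: p + 0.5, fitness)
```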

<p>Concluding notes: Combinations of agentic LLMs with other machinery are a very powerful tool for discovery.</p>

<h2 id="lecture-12-towards-building-safe-and-secure-agentic-ai">Lecture 12: Towards building safe and secure agentic AI</h2>

<p>Dawn Song, UC Berkeley</p>

<h3 id="attacks-on-agentic-systems">Attacks on Agentic Systems</h3>

<ul>
  <li>SQL injection using LLM</li>
  <li>Remote code execution (RCE) using LLM</li>
  <li>Direct/Indirect Prompt Injection</li>
  <li>Backdoor</li>
</ul>

<h3 id="prompt-injection-attack-surface">Prompt Injection Attack Surface</h3>

<ul>
  <li>Manipulated user input</li>
  <li>Data poisoning: open datasets, documents on public internet</li>
</ul>

<p><img src="../images/llm-agents/indirect-prompt-injection.jpg" alt="Prompt injection attack" /></p>

<p><a href="https://decodingtrust.github.io/">DecodingTrust</a> - Comprehensive Assessment of Trustworthiness in GPT Models</p>

<h3 id="defense-principles">Defense Principles</h3>

<ul>
  <li>Defense-in-depth</li>
  <li>Least privilege &amp; privilege separation</li>
  <li>Safe-by-design, secure-by-design, provably secure</li>
</ul>

<h3 id="defense-mechanisms">Defense Mechanisms</h3>

<ul>
  <li>Harden models</li>
  <li>Guardrail for input sanitization</li>
  <li>Policy enforcement on actions</li>
  <li>Privilege management</li>
  <li>Privilege separation</li>
  <li>Monitoring and detection</li>
  <li>Information flow tracking</li>
  <li>Secure-by-design and formal verification</li>
</ul>]]></content><author><name>J. Christopher Bare</name></author><category term="AI, Machine Learning, Deep Learning" /><category term="Learning" /><category term="AI" /><category term="LLMs" /><category term="Classes" /><summary type="html"><![CDATA[Image by jngai58]]></summary></entry><entry><title type="html">Scenius</title><link href="/2025-02-15/scenius.html" rel="alternate" type="text/html" title="Scenius" /><published>2025-02-15T00:00:00+00:00</published><updated>2025-02-15T00:00:00+00:00</updated><id>/2025-02-15/scenius</id><content type="html" xml:base="/2025-02-15/scenius.html"><![CDATA[<p>Now and then, the right mix of creative people come together in favorable conditions and something special happens. They begin riffing off each other, learning from and building on each other’s work. When everything lines up just right, the result is an intellectual and artistic efflorescence, humans at their most creative.</p>

<p><a href="https://www.brian-eno.net/about/">Brian Eno</a> coined a term for this: “Scenius stands for the intelligence and the intuition of a whole cultural scene. It is the communal form of the concept of the genius.”</p>

<p><img src="../images/illustrations/network.jpg" alt="Network" /></p>

<p><a href="https://kk.org/thetechnium/scenius-or-comm/">Kevin Kelly writes</a>, the geography of scenius is nurtured by several factors:</p>

<ul>
  <li><strong>Mutual appreciation</strong> — Risky moves are applauded by the group, subtlety is appreciated, and friendly competition goads the shy. Scenius can be thought of as the best of peer pressure.</li>
  <li><strong>Rapid exchange of tools and techniques</strong> — As soon as something is invented, it is flaunted and then shared. Ideas flow quickly because they are flowing inside a common language and sensibility.</li>
  <li><strong>Network effects of success</strong> — When a record is broken, a hit happens, or breakthrough erupts, the success is claimed by the entire scene. This empowers the scene to further success.</li>
  <li><strong>Local tolerance for novelties</strong> — The local “outside” does not push back too hard against the transgressions of the scene. The renegades and mavericks are protected by this buffer zone.</li>
</ul>

<p>When I was first working in the Bay Area, I found that there were two types of smart people. There were smart people that made you feel dumb and useless. And then, there were smart people that, if only for a moment, lifted you up to their level. You could see further and do more in their proximity. I decided, as much as possible, to gravitate toward the latter type.</p>

<p>A few key individuals of exceptional talent and the right mindset can anchor a scene that grows and sustains itself. In a small way, all you need is a few people being in flow together. The key is that it’s a shared experience. “Genius is individual, scenius is communal.”</p>

<p>Where do you see creative things happening around you that need support or nurturing? We should all look for ways to help virtuous cycles start spinning.</p>

<h2 id="famous-examples">Famous examples</h2>

<ul>
  <li>New York City jazz scene of the 1950s</li>
  <li>Paris in the 1920s</li>
  <li>Mathematics at the University of Göttingen</li>
  <li>Physics in the early 20th century</li>
  <li>The Lunar Society</li>
  <li>The Viennese Classical Period</li>
  <li>Rock climbers at Camp 4 in Yosemite</li>
  <li>The Algonquin Round Table</li>
  <li>The Harlem Renaissance</li>
  <li>Bell Labs</li>
  <li>Xerox PARC</li>
  <li>Silicon Valley</li>
  <li>Florence during the Renaissance</li>
  <li>Baghdad’s House of Wisdom</li>
</ul>]]></content><author><name>J. Christopher Bare</name></author><category term="Ideas" /><summary type="html"><![CDATA[Now and then, the right mix of creative people come together in favorable conditions and something special happens. They begin riffing off each other, learning from and building on each other’s work. When everything lines up just right, the result is an intellectual and artistic efflorescence, humans at their most creative.]]></summary></entry><entry><title type="html">Techniques of Suppression and Counterstrategies</title><link href="/2025-02-02/techniques-of-suppression.html" rel="alternate" type="text/html" title="Techniques of Suppression and Counterstrategies" /><published>2025-02-02T00:00:00+00:00</published><updated>2025-02-02T00:00:00+00:00</updated><id>/2025-02-02/techniques-of-suppression</id><content type="html" xml:base="/2025-02-02/techniques-of-suppression.html"><![CDATA[<p>I’ve worked in lots of organizations with fairly typical levels of dysfunction and in one that could in fairness be described as toxic. From that, I took away a personal appreciation for the effect, positive or negative, the social climate of a workplace has on its members.</p>

<p>Human beings are a tribal animal and as such our instincts are to form groups and establish hierarchy within them. Those instincts can work in our favor when they appeal to our better nature or can be exploited to manipulate us and work against our interests.</p>

<figure>
    <img src="/images/illustrations/wisdom-of-the-universe-christi-belcourt.png" alt="Wisdom of the Universe by Christi Belcourt" />
    <figcaption style="font-size: small; font-style: italic; text-align: right; margin-right: 2em;">
      Wisdom of the Universe by Christi Belcourt
    </figcaption>
</figure>

<p>How do you identify and assess toxicity in a workplace or any other group? How and why is that kind of environment created and maintained? As individuals, can we actively contribute to a better environment? If <em>the man</em> is keeping us down, we don’t have to passively accept it. We can observe how it works, push back, and try to improve the system in which we operate.</p>

<h2 id="addressing-power-structures-and-changing-social-climates">Addressing Power Structures and Changing Social Climates</h2>

<p>The term “<a href="https://en.wikipedia.org/wiki/Master_suppression_techniques">master suppression techniques</a>” was introduced by Norwegian psychologists Ingjald Nissen and Berit Ås to describe strategies of social manipulation by which a dominant group maintains their position by undermining others.</p>

<p>To some degree, these domination techniques are default human behaviors that we fall into without thinking. It takes conscious effort to push back against the worst in our natures. When leaders practice these dark patterns of behavior and, by example, give others license to do the same, things get ugly.</p>

<p>It’s important to realize that the same environment can be experienced differently by different people. If you’re in the <em>in</em> group, you’re having a different experience from those in the <em>out</em> group and vice versa. Your mileage may vary along dimensions of experience, skillset, gender, ethnicity, personality, etc. An environment that’s great for seniors might be bad for juniors, great for engineering but bad for data science.</p>

<p>The <em>Empowerment Network at Stockholm University</em> proposes <strong>counterstrategies</strong> to combat dominance techniques and <strong>affirmation techniques</strong> to promote healthier social climates. Their work was done in the context of women in the workplace, but really applies to any group of humans. Below is a translation (thanks GPT4o) and summary of their paper <em><a href="https://arligttalat.nu/wp-content/uploads/2024/08/ENSU-bekraftartekniker.pdf">Bekräftartekniker och motstrategier - sätt att bemöta maktstrukturer och förändra sociala klimat</a></em>, lightly edited by me.</p>

<h3 id="dominance-techniques-and-counterstrategies">Dominance Techniques and Counterstrategies</h3>

<ol>
  <li>
    <p><strong>Invisibility</strong></p>

    <ul>
      <li><em>Technique</em>: Marginalizing people by ignoring or diminishing their contributions, not acknowledging their name or ideas, talking over someone in a meeting.</li>
      <li><em>Counterstrategy</em>: <strong>Take Space</strong> – Assert yourself calmly and confidently, address the behavior directly, and demand recognition.</li>
      <li><em>Affirmation Technique</em>: <strong>Visibility</strong> – Actively acknowledge others by listening, responding, and being constructive.</li>
    </ul>
  </li>
  <li>
    <p><strong>Ridicule</strong></p>

    <ul>
      <li><em>Technique</em>: Mocking or belittling.</li>
      <li><em>Counterstrategy</em>: <strong>Question</strong> – Challenge ridicule with calm, logical responses, and directly address demeaning remarks.</li>
      <li><em>Affirmation Technique</em>: <strong>Respect</strong> – Take others seriously. Be supportive and make everyone feel valued.</li>
    </ul>
  </li>
  <li>
    <p><strong>Withholding Information</strong></p>

    <ul>
      <li><em>Technique</em>: Deliberately excluding individuals from access to important information or decision-making processes, limiting their ability to act effectively, e.g., not inviting to important meetings.</li>
      <li><em>Counterstrategy</em>: <strong>Demand Transparency</strong> – Call out patterns of exclusion.</li>
      <li><em>Affirmation Technique</em>: <strong>Transparency</strong> – Open and inclusive decision-making processes.</li>
    </ul>
  </li>
  <li>
    <p><strong>Double-bind</strong></p>

    <ul>
      <li><em>Technique</em>: Criticizing individuals no matter their choice. Damned if you do, damned if you don’t.</li>
      <li><em>Counterstrategy</em>: <strong>Break the Pattern</strong> – Own your choices without internalizing the imposed conflict, acknowledge trade-offs, define priorities.</li>
      <li><em>Affirmation Technique</em>: <strong>Double Reward</strong> – Support individual choices. Open many paths to success.</li>
    </ul>
  </li>
  <li>
    <p><strong>Blame and Shame</strong></p>

    <ul>
      <li><em>Technique</em>: Making individuals feel responsible for situations beyond their control, internalizing blame and undermining self-esteem. Public criticism.</li>
      <li><em>Counterstrategy</em>: <strong>Intellectualize</strong> – Reflect on guilt and shame, identifying external sources. Don’t internalize them. Set your own standards and live by them.</li>
      <li><em>Affirmation Technique</em>: <strong>Affirm Yourself and Others</strong> – Be positive and supportive. Emphasize validation and mutual respect. Own success and failure collectively.</li>
    </ul>
  </li>
</ol>

<h3 id="conclusions">Conclusions</h3>

<p>By identifying and countering dominance techniques, individuals can take control of their circumstances and contribute to establishing a healthier workplace. These methods are practical tools to improve interaction patterns and foster more positive norms.</p>

<p>These oppressive strategies exist because they benefit someone. Prestige is a zero-sum game, at least in its shallow forms. Prizes, awards, and titles mean little if everyone gets one. For every winner, there have to be losers - the also-rans, the red shirts. You can stand out by diminishing those around you. But, in doing so, you pollute the environment around you and drive away the best contributors.</p>

<p>In a better form, social reinforcement can be more of a virtuous cycle. The best bring up the rest. Others build on our successes and we on theirs. This kind of shared flourishing results in something deeper than a charade of prestige. We influence and are influenced by those around us. When a group guides its members toward being their best selves, that <a href="https://www.nytimes.com/2025/01/09/opinion/character-building-education.html">character building</a> lifts everyone. In the rare best case, everyone is riffing off of each other like jazz musicians in 1950s New York or mathematicians at Göttingen. This is what Brian Eno calls <a href="/2025-02-15/scenius.html">scenius</a>, which he defines as “the intelligence and intuition of a whole cultural scene.”</p>

<p>If you find yourself in a toxic environment, do yourself a favor and get out. But, every organization has some level of dysfunction. Creating the kind of working life we want takes effort. If we want to work in a good environment, we have to play a part in making it good. Falling into negative behaviors is easy and contagious. With enlightened self-interest, we have to choose to interact in a positive way and expect the same of those around us.</p>]]></content><author><name>J. Christopher Bare</name></author><category term="Working in Tech" /><summary type="html"><![CDATA[I’ve worked in lots organizations with fairly typical levels of dysfunction and in one that could in fairness be described as toxic. From that, I took away a personal appreciation for the effect, positive or negative, the social climate of a workplace has on its members.]]></summary></entry><entry><title type="html">Orbital by Samantha Harvey</title><link href="/2025-01-31/orbital.html" rel="alternate" type="text/html" title="Orbital by Samantha Harvey" /><published>2025-01-31T00:00:00+00:00</published><updated>2025-01-31T00:00:00+00:00</updated><id>/2025-01-31/orbital</id><content type="html" xml:base="/2025-01-31/orbital.html"><![CDATA[<p><img src="../images/books/orbital.jpeg" alt="Orbital by Samantha Harvey" /></p>

<p><em>“All your dreams of adventure and freedom and discovery culminate…and then you go nowhere but round and round with the same old thoughts going round and round with you.”</em></p>

<p><em>Orbital</em> by Samantha Harvey is a beautiful and contemplative book. It’s part of a growing subgenre of books in which nothing much happens, but takes it to another level by coming to a climax when all the characters are asleep. Even so, the book covers a lot of territory - birth, death, religion, new love blossoming, old love fading, climate change, the insignificance of individuals and even of humanity in general, political division, simple bonds of shared experience… <em>all</em> the important topics.</p>

<p><em>“Our lives here are inexpressibly trivial and momentous at once…”</em></p>

<p><em>“A human being was not made to stand still.”</em></p>

<p>Those looking for a space book brimming with technical detail will not find it here. The focus is on the interior, an examination of our world from the outside that is inside our heads.</p>

<p>For those wondering where the plot is, it’s right here:</p>

<p><img src="../images/books/orbital-plot.png" alt="24 hours of Earth orbits" /></p>

<h2 id="a-book-of-vivid-images">A book of vivid images</h2>

<p>There are a few images that play a role in the book. The first is a 1656 painting by Diego Velázquez called <em>Las Meninas</em>, which is used to discuss perspective.</p>

<p><img src="../images/books/velazquez-las-meninas.png" alt="Las Meninas by Diego Velázquez" /></p>

<p>Michael Collins, the astronaut who took this photo of the Lunar Module from the Apollo 11 mission in 1969, is the only human, alive or dead, who isn’t in the frame of this picture.</p>

<p><img src="../images/books/everyone-except-michael-collins.jpg" alt="Lunar Module from the Apollo 11 mission" /></p>

<p>The Earth from the International Space Station.</p>

<p><img src="../images/books/view-from-iss.jpg" alt="View from the International Space Station" /></p>

<p><em>Orbital</em> is a love letter to our home planet, “a celebration of the Earth […] with a pang of loss.” It’s the change of perspective that many of us need right now.</p>

<h2 id="more-about-orbital">More about <em>Orbital</em></h2>

<ul>
  <li><a href="https://thebookerprizes.com/the-booker-library/features/everything-you-need-to-know-about-orbital-booker-prize-2024-winner">Everything you need to know about Orbital by Samantha Harvey, winner of the Booker Prize 2024</a></li>
  <li><a href="https://www.nytimes.com/2023/12/05/books/review/orbital-samantha-harvey.html">It’s Harder to See the World’s Problems From 250 Miles Up</a>, NYTimes</li>
  <li><a href="https://www.theguardian.com/books/2024/nov/13/samantha-harvey-interview-booker-winning-novel-orbital">‘I’m so not an astronaut!’ Samantha Harvey on her Booker-winning space novel – and the anxiety that drove it</a>, The Guardian</li>
  <li><a href="https://www.nytimes.com/2025/01/31/books/review/orbital-samantha-harvey-book-club.html">NYTimes Book Club</a></li>
</ul>]]></content><author><name>J. Christopher Bare</name></author><category term="Books" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">Music 2024</title><link href="/2025-01-26/music-2024.html" rel="alternate" type="text/html" title="Music 2024" /><published>2025-01-26T00:00:00+00:00</published><updated>2025-01-26T00:00:00+00:00</updated><id>/2025-01-26/music-2024</id><content type="html" xml:base="/2025-01-26/music-2024.html"><![CDATA[<figure>
  <img src="/images/illuminated-headphone-guy.png" alt="Tracks listened in 2024" />
  <figcaption style="font-size: small; font-style: italic;">Although few could tell the difference, this is not actually an image of me enjoying some tunes in my study, but rather a nefarious piece of AI slop! Specifically, “A scholarly older gentleman in the style of a medieval illuminated manuscript wearing headphones.” (by DALL·E)</figcaption>
</figure>

<p>What a great year for music, with an abundance of new releases and newly discovered goodies. We recommend queuing up the <a href="https://music.apple.com/us/playlist/2024-playlist/pl.u-76oNl2mTNqRGrx">2024 playlist</a> before reading further.</p>

<figure style="float: right; margin-left: 2em; margin-bottom: 2em; height: 50%; width: 50%; ">
  <a href="https://www.last.fm/user/cbare/library/albums?from=2024-01-01&amp;rangetype=year" target="_blank">
    <img src="/images/music/music-2024-counts.png" alt="Tracks listened in 2024" />
  </a>
</figure>

<p>Streaming services are so good for exploratory listening. Following the threads of “if you like this, you’ll probably like that…” will lead you down some very enjoyable rabbit holes.</p>

<p>On the other hand, it remains to be seen whether streaming music is sustainable for musicians or even technology companies. <a href="https://www.ft.com/content/6fb1602d-a08b-4a8c-bac0-047b7d64aba5">Enshittification</a> is taking hold. Platforms are being flooded with <a href="https://harpers.org/archive/2025/01/the-ghosts-in-the-machine-liz-pelly-spotify-musicians/">ghost music</a>, <a href="https://www.honest-broker.com/p/the-fake-artists-problem-is-much">fake artists</a>, and AI-generated dreck. In the race to the bottom, <a href="https://www.vulture.com/article/spotify-mood-music-review.html">you can’t outrun Spotify</a>. Funny how tech folks, myself included, spent years preaching “disintermediation” only to devolve into online middlemen and virtual robber barons - Amazon, Spotify, Uber, etc. One day, all will be replaced by open protocols and optimized for human flourishing rather than “engagement”, which is a proxy for addiction and ad revenue.</p>

<p>For now, like ocean currents washing up pretty shells onto golden beaches, the algorithms surface plenty of organic hand-crafted tunes and I am grateful to our digital overlords for that.</p>

<h2 id="live-music">Live music</h2>

<p>The best way to be sure you’re getting real human-generated music? See it live. Showing up and dropping a bit of disposable income on tickets, drinks, and snacks helps keep the scene alive and lets creative folks know you value what they do. I’m old and lame and don’t get out much, but I saw some good shows this year.</p>

<p>There were a few pleasant Sunday afternoons spent taking in a jazz set at <a href="https://www.undercurrent.nz/">Undercurrent Books</a>. Esperanza Spalding graced the 2024 Wellington Jazz Festival. Japanese math-rock legends <em>Toe</em> closed out their APAC tour to a packed house at Meow. Louisa Williamson’s <em>Heavy Flow</em> series of shows features a rotating cast of very talented Wellington musicians having fun. I finally got out to the last one of 2024.</p>

<p><img src="../images/music/toe-at-meow.png" alt="Toe at Meow" /></p>

<p>While visiting <a href="https://cbare.github.io/2024-05-11/toronto.html">Toronto</a>, I briefly became a regular at <a href="https://www.therex.ca/">the Rex</a>. I grooved to a couple of piano trios, one consisting of Duncan Wilson (p.), Chris Parnis (b.), and Petros Anagnostakos (d.), and a second trio led by bassist <a href="https://jessedietschi.com/">Jesse Dietschi</a>. I keep coming across solid bassist-led jazz ensembles, for instance <a href="https://benwolfe.com/">Ben Wolfe</a>, <a href="https://www.michaeljanisch.com/">Michael Janisch</a>, and <a href="https://www.mfbass.com/">Michael Feinberg</a>. You can hear the Jesse Dietschi Trio’s thoughtful chamber jazz on their 2023 album <em>Gradient</em>, which I highly recommend.</p>

<p>In Toronto’s lovely Koerner Hall, I took in a recital of Handel, Beethoven, and Prokofiev performed by Vadym Kholodenko, sort of a journey through time from baroque to high classical to modern.</p>

<h2 id="hyperion-records">Hyperion Records</h2>

<p><img src="../images/music/hyperion.png" alt="Hyperion Records" /></p>

<p>Speaking of classical, <a href="https://www.hyperion-records.co.uk/">Hyperion Records</a> was one of the last holdouts against streaming. In February of 2023, Hyperion was acquired by the Universal Music Group and in the latter half of that year their deep and wonderful catalog started appearing on streaming platforms.</p>

<p>It’s hard to do justice to this trove of music, but, for me, one highlight is Angela Hewitt’s gorgeous piano playing. She’s known for Bach, but there’s so much more, in particular her sweetening of Ravel’s sometimes harsh piano works. Stephen Hough’s playing of Mompou and Lívia Rév’s of Debussy are also top notch.</p>

<p>Are Universal and the streaming platforms strip-mining a lovingly crafted body of work, impoverishing the creators and curators in the bargain? I truly hope an economic model emerges that rewards the best in human creativity and enables ‘musician’ to be a viable profession while ‘music industry executive’ follows the cooper and wheelwright into obsolescence. In the meantime, hearing this music has certainly enriched my small corner of the world.</p>

<h2 id="labels">Labels</h2>

<p>No streaming platform that I know of does a good job of representing record labels, which is a shame. But, <a href="https://www.hyperion-records.co.uk/">Hyperion</a> isn’t the only label putting out great stuff. A good label curates a stable of musicians that cohere. I found drummer, saxophonist, and bandleader Karl-Henrik Ousbäck through <a href="https://www.freshsoundrecords.com/">Fresh Sound Records</a>. They’ve also put out music by Ari Hoenig, Tom Ollendorff, George Colligan, and Simon Moullier among others. Other labels worth checking out include <a href="https://ecmrecords.com/">ECM</a>, <a href="https://www.bluenote.com/">Blue Note</a>, <a href="https://www.crisscrossjazz.com/">Criss Cross</a>, <a href="https://www.sunnysiderecords.com/">Sunnyside Records</a>, <a href="https://originarts.com/about_origin.php">Origin Records</a>, <a href="https://editionrecords.com/">Edition Records</a>, and <a href="https://www.actmusic.com/en/">ACT</a>. If the algorithms don’t serve up the good stuff, go to the source.</p>

<h2 id="new-releases">New releases</h2>

<p>A lot of great music came out this past year. Below is a list of those that, for whatever reason, resonated with me. If an AI apocalypse is coming to destroy music, it hasn’t hit just yet. If we discount the possibility that we’re all living in a simulation, I’m pretty sure all of the following are real live human musicians.</p>

<ul>
  <li><em>Living 12 Strings</em> - Anders Miolin</li>
  <li><em>Mid Spiral</em> - BADBADNOTGOOD</li>
  <li><em>The Understated</em> - Ben Wolfe</li>
  <li><em>HOPE</em> - Bill Laurance</li>
  <li><em>Bloom</em> - Bill Laurance, The Untold Orchestra &amp; Rory Storm</li>
  <li><em>Keeping Company</em> - Bill Laurance &amp; Michael League</li>
  <li><em>Après Fauré</em> - Brad Mehldau</li>
  <li><em>Call For Winter II: Resonance</em> - Daniel Herskedal</li>
  <li><em>Echoes of Solitude</em> - Daniel Herskedal, Mattia Vlad Morleo</li>
  <li><em>To the APhEX</em> - Dorian Dumont</li>
  <li><em>Spirit Box</em> - Flying Lotus</li>
  <li><em>From the North</em> - GoGo Penguin Live in Manchester</li>
  <li><em>The Bounding Line</em> - Greg Reitan</li>
  <li><em>In A Landscape</em> - Max Richter</li>
  <li><em>Moodial</em> - Pat Metheny</li>
  <li><em>Code Derivation</em> - Robert Glasper: <em>Wake Up</em> is a great song.</li>
  <li><em>Fruit Galaxy</em> - standards</li>
  <li><em>Complex Emotions</em> - The Bad Plus: I especially like the song <em>French Horns</em> from the new guitar-driven Bad Plus.</li>
  <li><em>Now I See The Light</em> - Toe</li>
  <li><em>Continuum</em> - Víkingur Ólafsson: Icelandic pianist Víkingur Ólafsson followed up a 2023 recording of Bach’s <em>Goldberg Variations</em> with a 6-song EP of selected Bach pieces. To all who caught Ólafsson’s tour last year, I am deeply envious.</li>
  <li><em>Standards</em> - Yotam Silberstein</li>
</ul>

<h3 id="dorian-dumont">Dorian Dumont</h3>

<p>If you’ve had a years-long obsession with Aphex Twin covers on acoustic instruments (<em>and who hasn’t?</em>) you might have been a bit giddy when Brussels-based pianist <a href="https://ebbmusic.eu/artists/dorian-dumont/">Dorian Dumont</a> released a <em>second</em> album of Aphex Twin covers. Dumont’s lush and meditative version of Rhubarb is a highlight, infused with scattered raindrops of dissonance precipitated from Bill Evans’s Peace Piece. Windowlicker morphs into a banger of a piano prelude. Though I’m a dedicated tech enthusiast, there’s something satisfying about reclaiming electronic music for the organic. Humans can still hold our own with the machines.</p>

<h2 id="following-threads-new-discoveries">Following threads, new discoveries</h2>

<p>Tracing connections to discover new music is fun—even if it’s just new to me. For example, Seattle-native pianist Aaron Parks played on saxophonist Dayna Stephens’ 2015 album <em>Reminiscent</em>. Stephens, in turn, plays on the tracks <em>Minority</em> and <em>Infant Eyes</em> on Victor Gould’s <a href="https://www.victorgouldmusic.com/product/In-Our-Time">In Our Time</a>, an album that keeps growing on me. Victor Gould is in Black Art Jazz Collective, who released the very solid <em>Truth to Power</em> in 2024.</p>

<p>Robert Glasper and Nicholas Britell collaborated on “The Winning Time Sessions”, the soundtrack to an HBO series. Apparently, the show is about basketball, a sport I was so bad at that my middle school coach told me to go back to my skateboard. These bite-sized, groove-oriented tracks will help you bang out some code. A short click away, you’ll find Britell’s tender soundtrack for “If Beale Street Could Talk”.</p>

<p>I’ve been a math rock nerd for some time, falling hard for Tricot, Rega, Toe, Covet, Piglet, Floral, Don Caballero, Tortoise, Fago Sepia… So, I was a bit flummoxed as to how I missed Japanese math rockers Hyakkei. Luckily, my 17-year-old is developing an ear for good music and said I would like them. Yep, I do. Score one for the human recommender system.</p>

<h3 id="ohiggins--luft">O’Higgins &amp; Luft</h3>

<p>Saxophonist Dave O’Higgins &amp; guitarist Rob Luft recorded an excellent album of Monk and Coltrane covers in 2018 and followed that up with the equally good <em>Pluto</em> in 2022.</p>

<p><img src="../images/music/pluto-ohiggins-and-luft.png" alt="Pluto by O'Higgins &amp; Luft" /></p>

<h3 id="chosen-jazz">Chosen jazz</h3>

<p>Early in the year, guitarist Gilad Hekselman put out an excellent cover of Coltrane’s <em>Equinox</em>. Following the “you might also likes” led to a whole slew of incredible Israeli jazz musicians including guitarist Yotam Silberstein and pianists Yaron Herman, Nitai Hershkovits, and Anat Fort. Fort’s piece “First Rays” from the 2016 release <a href="https://www.anatfort.com/album/birdwatching/"><em>Birdwatching</em> by Anat Fort Trio with Gianluigi Trovesi</a> is quite good.</p>

<p>The <em>Garden Suite</em> by Nitai Hershkovits &amp; Daniel Dor is a unique piece of music. If Frank Zappa and Aphex Twin had collaborated on the soundtrack to a Studio Ghibli film, it might sound something like this.</p>

<h2 id="analysis-paralysis">Analysis paralysis</h2>

<p><a href="https://www.last.fm/user/cbare">Last.fm</a> gives some nice end-of-year listening analysis. Apparently, I had poor listening stamina in January through February and again in July. Next year, I’ll do better.</p>

<p><img src="../images/music/listening-2024.png" alt="Listening 2024" /></p>

<p>Glad to see that I was 9% more obscure in 2024 than my already obscure self of 2023. I just keep moving out into that long tail of “nobody gets that weird shit you listen to.”</p>

<p><img src="../images/music/obscurity-2024.png" alt="The good, the bad, and the obscure" /></p>

<h2 id="happy-2025">Happy 2025</h2>

<p>Whether the coming year finds us hiding underground from the all seeing eye of Skynet or transcending our Earthly bonds to become beings of light in rapturous singularity, happy listening in 2025.</p>]]></content><author><name>J. Christopher Bare</name></author><category term="Music" /><category term="music" /><summary type="html"><![CDATA[Although few could tell the difference, this is not actually an image of me enjoying some tunes in my study, but rather a nefarious piece of AI slop! Specifically, “A scholarly older gentleman in the style of a medieval illuminated manuscript wearing headphones.” (by DALL·E)]]></summary></entry><entry><title type="html">Tongariro Alpine Crossing</title><link href="/2025-01-14/tongariro-alpine-crossing.html" rel="alternate" type="text/html" title="Tongariro Alpine Crossing" /><published>2025-01-14T00:00:00+00:00</published><updated>2025-01-14T00:00:00+00:00</updated><id>/2025-01-14/tongariro-alpine-crossing</id><content type="html" xml:base="/2025-01-14/tongariro-alpine-crossing.html"><![CDATA[<p>The <a href="https://www.doc.govt.nz/parks-and-recreation/places-to-go/central-north-island/places/tongariro-national-park/things-to-do/tracks/tongariro-alpine-crossing/">Tongariro Alpine Crossing</a> is a challenging 20km one-way hike through volcanic landscape. The trail climbs 765m (~2510 ft.) from the trail head at the end of Mangatepopo road up to the Red Crater summit and then down over 1000m to the pickup point at Ketetahi.</p>

<p><img src="../images/tongariro/tongariro-trail-map.png" alt="Tongariro crossing trail map" /></p>

<p>Three volcanoes sit at the center of New Zealand’s North Island. <em>Tongariro</em> is farthest north. <em>Ruapehu</em> to the south is the highest at 2,797m (9176 ft.). Between them is conical <em>Ngauruhoe</em>, also known as Mount Doom. See <a href="https://www.google.com/maps/@-39.1634219,175.5839858,83111m/data=!3m1!1e3?entry=ttu&amp;g_ep=EgoyMDI1MDEwMS4wIKXMDSoASAFQAw%3D%3D">Google Maps</a>.</p>

<p><img src="../images/tongariro/ruapehu.jpeg" alt="Ruapehu" /></p>

<p>We took <a href="https://www.dempseybuses.co.nz/Services/Tongariro+Crossing.html">Dempsey Buses</a>, which is a great way to get to and from the trail - much better than having to drive and navigate while exhausted. They’ll pick you up bright and early, aiming to have you on the trail by 7am.</p>

<p>We picked a blue-sky day. You’ll want to check the <a href="https://weather.niwa.co.nz/parks/Tongariro%20National%20Park/Red%20Crater">NIWA weather forecast for the Red Crater summit</a>. My kiddos, aged (almost) 12 and 17, easily outpaced their tired old man. The driver said there were about 1500 people on the trail that day, while a really busy day might see up to 3000, so don’t expect splendid isolation.</p>

<p><img src="../images/tongariro/red-crater-summit-and-ngauruhoe.jpeg" alt="Red Crater Summit with Ngauruhoe in the background" /></p>

<p>The trail is tough. It’s steep going up and coming down. The lower sections have lots of wood-framed steps. The upper section of the trail that leads up to the Red Crater summit, at 1868m (6129 ft.), is crumbly gravel and fine powder dust. My Fitbit said I took over 35,000 steps.</p>

<p><img src="../images/tongariro/elevation.png" alt="Elevation" /></p>

<p>At Ketetahi, there’s a shady shelter to wait in. The day I was there, buses came to the pickup at 2:30, 4, and sometime later for the stragglers. If we made the 4 o’clock, you can too. You’ll be showered and sitting down to dinner at a very civilized hour.</p>

<p><img src="../images/tongariro/lake-taupo.jpeg" alt="The descent looking towards Lake Taupo" /></p>

<h2 id="ohakune">Ohakune</h2>

<p>Ohakune is a cool little town. It has great views of snow-capped Ruapehu and a few restaurants - we enjoyed <em>Osteria</em>. The i-SITE tourist office is friendly and helpful. If you’re not worn out yet, there’s a great little rock wall with autobelays at <a href="https://www.vertigoclimbing.co.nz/">Vertigo Climbing Ohakune</a>.</p>

<p><img src="../images/tongariro/welcome.jpeg" alt="Nau mai, Haere mai ki Tongariro" /></p>]]></content><author><name>J. Christopher Bare</name></author><category term="Travel and Adventures" /><summary type="html"><![CDATA[The Tongariro Alpine Crossing is a challenging 20km one-way hike through volcanic landscape. The trail climbs 765m (~2510 ft.) from the trail head at the end of Mangatepopo road up to the Red Crater summit and then down over 1000m to the pickup point at Ketetahi.]]></summary></entry><entry><title type="html">Books 2024</title><link href="/2025-01-04/books-2024.html" rel="alternate" type="text/html" title="Books 2024" /><published>2025-01-04T00:00:00+00:00</published><updated>2025-01-04T00:00:00+00:00</updated><id>/2025-01-04/books-2024</id><content type="html" xml:base="/2025-01-04/books-2024.html"><![CDATA[<p><img src="/images/henner-liseuse-RF1839-lewandowski.webp" alt="La Liseuse by Jean-Jacques Henner" /></p>

<p>Seneca said, “<em>distringit librorum multitudo</em>”, meaning “the abundance of books is distraction”. But, what a pleasant distraction.</p>

<ul>
  <li>Wildwood - Colin Meloy (parts 2 &amp; 3) (with Henry)</li>
  <li>Liberation Day - George Saunders</li>
  <li>God, Human, Animal, Machine - Meghan O’Gieblyn</li>
  <li>Starter Villain - John Scalzi</li>
  <li>Designing Machine Learning Systems - Chip Huyen</li>
  <li><a href="/2024-05-01/stoner.html">Stoner</a> - John Williams</li>
  <li>She’s a Killer - Kirsten McDougall</li>
  <li>Birnam Wood - Eleanor Catton</li>
  <li>Deep Learning Book - Ian Goodfellow (Ch. 1-5)</li>
  <li>Other Minds - Peter Godfrey-Smith</li>
  <li>Coming Through Slaughter - Michael Ondaatje</li>
  <li><a href="/2025-01-02/wide-wide-sea.html">The Wide Wide Sea</a> - Hampton Sides</li>
  <li>3 Shades of Blue - James Kaplan (started)</li>
  <li>Babel - R.F. Kuang (started)</li>
</ul>

<p>See also, <a href="/2024-01-01/books-2023">Books 2023</a> and <a href="https://www.goodreads.com/user/show/22238686-christopher-bare">GoodReads</a></p>]]></content><author><name>J. Christopher Bare</name></author><category term="Books" /><summary type="html"><![CDATA[]]></summary></entry></feed>