Sarah Liaw
Hi! I am a first-year PhD student in the ML Foundations group at Harvard University, advised by David Alvarez-Melis and Yilun Du. My research focuses on structured representation learning and generative modeling for scientific and robotics applications; I also study training dynamics. I am grateful to be supported by a Kempner Institute graduate fellowship.
Previously, I completed my undergrad at Caltech, where I majored in Computer Science. There, I worked with Ricardo Baptista, Adam Wierman, and Anima Anandkumar on measure transport, mean-field theory, and AI4PDEs. I also worked on AI safety at CHAI with Stuart Russell and Benjamin Plaut.
I am always eager to discuss ideas and welcome comments or questions about my research interests.
Selected Publications
* denotes equal contribution
Learning When Not to Learn: Risk-Sensitive Abstention in Bandits with Unbounded Rewards
Sarah Liaw*, Benjamin Plaut*
AISTATS, 2026
abstract /
arxiv /
TLDR: Cautious contextual bandit algorithm with abstain option for high-stakes environments with unbounded negative rewards.
In high-stakes AI applications, even a single action can cause irreparable damage. However, nearly all of sequential decision-making theory assumes that all errors are recoverable (e.g., by bounding rewards). Standard bandit algorithms that explore aggressively may cause irreparable damage when this assumption fails. Some prior work avoids irreparable errors by asking for help from a mentor, but a mentor may not always be available. In this work, we formalize a model of learning with unbounded rewards without a mentor as a two-action contextual bandit with an abstain option: at each round the agent observes an input and chooses either to abstain (always 0 reward) or to commit (execute a preexisting task policy). Committing yields rewards that are upper-bounded but can be arbitrarily negative, and the commit reward is assumed Lipschitz in the input. We propose a caution-based algorithm that learns when not to learn: it chooses a trusted region and commits only where the available evidence does not already certify harm. Under these conditions and i.i.d. inputs, we establish sublinear regret guarantees, theoretically demonstrating the effectiveness of cautious exploration for deploying learning agents safely in high-stakes environments.
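A minimal sketch of the abstain-or-commit rule described above. This is illustrative only, not the paper's algorithm: the class name, threshold, and trusted-region construction here are made up; the point is how a Lipschitz bound on the commit reward lets the agent certify harm from past observations and abstain there.

```python
class CautiousAbstainBandit:
    """Toy two-action (abstain/commit) contextual bandit on a 1-D context space.

    Illustrative sketch only: the paper's actual algorithm and guarantees
    are more involved; names and thresholds here are hypothetical.
    """

    def __init__(self, lipschitz_const):
        self.L = lipschitz_const   # assumed Lipschitz constant of the commit reward
        self.observations = []     # list of (context, observed commit reward)

    def certified_upper_bound(self, x):
        """Best provable upper bound on the commit reward at context x,
        via Lipschitz continuity: r(x) <= r(x_i) + L * |x - x_i|."""
        if not self.observations:
            return float("inf")    # no evidence yet, nothing is certified
        return min(r + self.L * abs(x - xi) for xi, r in self.observations)

    def act(self, x):
        """Commit unless the evidence already certifies the reward is negative."""
        if self.certified_upper_bound(x) < 0.0:
            return "abstain"       # abstaining always yields reward 0
        return "commit"

    def update(self, x, reward):
        self.observations.append((x, reward))
```

Note the asymmetry: the agent never needs to certify safety before committing, only to avoid regions where the data already proves harm, which is what lets it keep learning.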
FOL-Pretrain: A complexity annotated corpus of first-order logic
Isabelle Lee, Sarah Liaw, Dani Yogatama
EACL, 2026
abstract /
arxiv /
code /
dataset /
TLDR: Large-scale, complexity-annotated dataset of FOL reasoning traces for algorithmic reasoning in LLMs.
Transformer-based large language models (LLMs) have demonstrated remarkable reasoning capabilities, ranging from coding and mathematical problem solving to commonsense inference. While these tasks vary in complexity, they all require models to integrate and compute over structured information. Despite recent efforts to reverse-engineer LLM behavior through controlled experiments, our understanding of how these models internalize and execute complex algorithms remains limited. Progress has largely been confined to small-scale studies or shallow tasks such as basic arithmetic and grammatical pattern matching. One barrier to deeper understanding is the nature of pretraining data: vast, heterogeneous, and often poorly annotated, which makes it difficult to isolate mechanisms of reasoning. To bridge this gap, we introduce a large-scale, fully open, complexity-annotated dataset of first-order logic reasoning traces, designed to probe and analyze algorithmic reasoning in LLMs. The dataset consists of 3.5 billion tokens, including 8.8 million LLM-augmented, human-annotated examples and 7.5 million synthetically generated examples. Each synthetic example is verifiably correct, produced by a custom automated theorem solver, and accompanied by metadata tracing its algorithmic provenance. We aim to provide a scalable, interpretable artifact for studying how LLMs learn and generalize symbolic reasoning processes, paving the way for more transparent and targeted investigations into the algorithmic capabilities of modern models.
Feel-Good Thompson Sampling for Contextual Bandits: a Markov Chain Monte Carlo Showdown
Emile Anand*, Sarah Liaw*
NeurIPS, 2025
abstract /
arxiv /
code /
pypi /
TLDR: Benchmark Feel-Good Thompson Sampling across various bandit settings.
Thompson Sampling (TS) is widely used to address the exploration/exploitation tradeoff in contextual bandits, yet recent theory shows that it does not explore aggressively enough in high-dimensional problems. Feel-Good Thompson Sampling (FG-TS) addresses this by adding an optimism bonus that biases toward high-reward models, and it achieves asymptotically minimax-optimal regret in the linear setting when posteriors are exact. However, its performance with approximate posteriors, which are common in large-scale or neural problems, has not been benchmarked. We provide the first systematic study of FG-TS and its smoothed variant (SFG-TS) across eleven real-world and synthetic benchmarks. To evaluate their robustness, we compare performance across settings with exact posteriors (linear and logistic bandits) to approximate regimes produced by fast but coarse stochastic-gradient samplers. Ablations over preconditioning, bonus scale, and prior strength reveal a trade-off: larger bonuses help when posterior samples are accurate, but hurt when sampling noise dominates. FG-TS generally outperforms vanilla TS in linear and logistic bandits, but tends to be weaker in neural bandits. Nevertheless, because FG-TS and its variants are competitive and easy to use, we recommend them as baselines in modern contextual-bandit benchmarks.
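A toy sketch of the feel-good idea. This is not the benchmark's implementation: the paper studies MCMC and stochastic-gradient samplers, whereas here the feel-good-tilted posterior is sampled exactly over a small discretized parameter grid, and the function name and defaults are hypothetical. The optimism bonus tilts the posterior toward models that predict a high best-arm reward, capped at `b`.

```python
import numpy as np

def fg_ts_choose(contexts, history, theta_grid, eta=1.0, lam=0.1, b=1.0, rng=None):
    """One round of Feel-Good Thompson Sampling over a finite model grid.

    Illustrative sketch only. `theta_grid` is (num_models, d); `contexts`
    is (num_arms, d) for the current round; `history` holds
    (arm_features, chosen_arm, reward) triples from past rounds.
    """
    rng = rng or np.random.default_rng()
    log_w = np.zeros(len(theta_grid))
    for X, a, r in history:
        preds = theta_grid @ X[a]
        log_w += -eta * (preds - r) ** 2   # squared-error log-likelihood
        # Feel-good bonus: favor models that predict a high best-arm
        # reward, capped at b so optimism cannot dominate the likelihood.
        log_w += lam * np.minimum(b, (theta_grid @ X.T).max(axis=1))
    log_w -= log_w.max()                   # stabilize before exponentiating
    probs = np.exp(log_w)
    probs /= probs.sum()
    theta = theta_grid[rng.choice(len(theta_grid), p=probs)]
    return int(np.argmax(contexts @ theta))  # greedy arm under the sampled model
```

Setting `lam=0` recovers vanilla TS on the same grid, which makes the bonus-scale trade-off discussed above easy to probe.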
A Renormalization Group Framework for Scale Invariant Feature Learning in Deep Neural Networks
Sarah Liaw
AAAI, 2025
abstract /
paper /
TLDR: Renormalization group theory to analyze and optimize scale-invariant feature learning in NNs.
We propose a framework that uses renormalization group (RG) theory from statistical physics to analyze and optimize the hierarchical feature learning process in deep neural networks. Here, the layer-wise transformations in deep networks can be viewed as analogous to RG transformations, with each layer implementing a coarse-graining operation that extracts increasingly abstract features. We propose an approach to enforce scale invariance in neural networks, introduce scale-aware activation functions, and derive RG flow equations for network parameters. We show that our approach leads to fixed points corresponding to scale-invariant feature representations. Finally, we propose an RG-guided training procedure that converges to these fixed points while minimizing the loss function.
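The paper's RG flow equations and scale-aware activations are not reproduced here, but the scale-covariance property such an activation must satisfy is easy to illustrate: a power-law nonlinearity is homogeneous, f(s*x) = s^p * f(x) for s > 0, so rescaling the input only rescales the output. The function below is a hypothetical example of this property, not the paper's construction.

```python
import numpy as np

def power_activation(x, p=1.5):
    """Homogeneous (scale-covariant) activation: f(s*x) = s**p * f(x) for s > 0.

    Illustrative only; the paper's scale-aware activations may differ.
    """
    return np.sign(x) * np.abs(x) ** p
```

A fixed point of an RG-style coarse-graining built from such maps is one where rescaling the input leaves the learned features unchanged up to this deterministic factor.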
Learning local neighborhoods of non-Gaussian graphical models
Sarah Liaw, Rebecca Morrison, Youssef Marzouk, Ricardo Baptista
AAAI, 2025
abstract /
arxiv /
code /
TLDR: Scalable algorithm to infer conditional independence in high-dimensional non-Gaussian graphical models.
Identifying the Markov properties or conditional independencies of a collection of random variables is a fundamental task in statistics for modeling and inference. Existing approaches often learn the structure of a probabilistic graph, which encodes these dependencies, by assuming that the variables follow a distribution with a simple parametric form. Moreover, the computational cost of many algorithms scales poorly for high-dimensional distributions, as they need to estimate all the edges in the graph simultaneously. In this work, we propose a scalable algorithm to infer the conditional independence relationships of each variable by exploiting the local Markov property. The proposed method, named Localized Sparsity Identification for Non-Gaussian Distributions (L-SING), estimates the graph by using flexible classes of transport maps to represent the conditional distribution for each variable. We show that L-SING includes existing approaches, such as neighborhood selection with Lasso, as a special case. We demonstrate the effectiveness of our algorithm in both Gaussian and non-Gaussian settings by comparing it to existing methods. Lastly, we show the scalability of the proposed approach by applying it to high-dimensional non-Gaussian examples, including a biological dataset with more than 150 variables.
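The Gaussian special case mentioned above, neighborhood selection with Lasso, can be sketched directly: regress each variable on all the others with an L1 penalty and read the neighborhood off the nonzero coefficients. This is a plain coordinate-descent Lasso for illustration (a library solver would normally be used), not the L-SING transport-map estimator itself; the function name and threshold are hypothetical.

```python
import numpy as np

def lasso_neighborhood(X, j, alpha=0.1, n_iter=200):
    """Estimate the Markov neighborhood of variable j by sparse regression
    of X[:, j] on the remaining variables (Meinshausen-Buhlmann style
    neighborhood selection, the Gaussian special case of L-SING).
    """
    n, d = X.shape
    y = X[:, j] - X[:, j].mean()
    Z = np.delete(X, j, axis=1)
    Z = (Z - Z.mean(0)) / Z.std(0)          # standardize regressors
    beta = np.zeros(d - 1)
    col_sq = (Z ** 2).sum(0)
    for _ in range(n_iter):                  # coordinate descent
        for k in range(d - 1):
            r = y - Z @ beta + Z[:, k] * beta[k]       # partial residual
            rho = Z[:, k] @ r
            beta[k] = np.sign(rho) * max(abs(rho) - n * alpha, 0) / col_sq[k]
    others = [i for i in range(d) if i != j]
    return {others[k] for k in range(d - 1) if abs(beta[k]) > 1e-8}
```

L-SING replaces the linear regression with a flexible transport map per variable, which is what lets it recover conditional independencies beyond the Gaussian case while keeping the same local, per-variable structure.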