Divyat Mahajan

I am a final-year Ph.D. candidate at Mila & Université de Montréal, advised by Ioannis Mitliagkas. Most recently, I was a visiting researcher at Meta Super Intelligence Labs (FAIR) working on pretraining large language models with the memory and generalization team.

The central theme of my research is understanding and improving how machine learning systems generalize to novel tasks and environments. My work spans causal representation learning for robustness under distribution shifts (1), as well as modern paradigms such as in-context learning (2) and large-scale pretraining (3). Going forward, I am broadly interested in the following research directions for improving the capabilities and reliability of foundation models.

Novel approaches for pretraining. I am interested in pretraining strategies that help language models learn richer representations and improve long-horizon reasoning & planning. A direction I find especially promising is data-constrained pretraining, where better objectives, architectures, and synthetic data may become increasingly important as compute scales faster than the supply of high-quality data.
Reusable skills for continual learning. I am interested in approaches for discovering reusable skills/strategies from reasoning traces that can help "amortize" the reasoning process. I am especially interested in exploring how to consolidate skills over time, enabling efficient adaptation to new tasks and self-improvement.
Causal approaches for alignment and safety. I am interested in alignment methods that move beyond spurious correlations and better capture the underlying intent and causal structure. In particular, I am excited by causal approaches for reward design and concept learning that enable better understanding and reliable steering of LLM behavior.

My research is supported by the FRQNT doctoral fellowship, and I am deeply grateful for the amazing collaborations that have enriched my Ph.D. journey. I was advised by Kartik Ahuja and Pascal Vincent under the Meta AIM Program, and also did a summer internship at Microsoft Research Cambridge with Cheng Zhang and Meyer Scetbon. Further, I worked with Vasilis Syrgkanis at Stanford, and prior to my Ph.D., I was a research fellow at Microsoft Research India with Amit Sharma.

Select Publications & Preprints

Beyond Multi-Token Prediction: Pretraining LLMs with Future Summaries Divyat Mahajan, Sachin Goyal, Badr Youbi Idrissi, Mohammad Pezeshki, Ioannis Mitliagkas, David Lopez-Paz, Kartik Ahuja
ICLR 2026 [arxiv] [twitter]
Amortized Inference of Causal Models via Conditional Fixed-Point Iterations Divyat Mahajan*, Jannes Gladrow, Agrin Hilmkil, Cheng Zhang, Meyer Scetbon*
TMLR 2025 (J2C Certification), ICLR 2026 [arxiv] [code]
Compositional Risk Minimization Divyat Mahajan, Mohammad Pezeshki, Charles Arnal, Ioannis Mitliagkas, Kartik Ahuja, Pascal Vincent
ICML 2025 [arxiv] [code] [presentation] [poster] [twitter]
Empirical Analysis of Model Selection for Heterogeneous Causal Effect Estimation Divyat Mahajan, Ioannis Mitliagkas, Brady Neal, Vasilis Syrgkanis
ICLR 2024 (Spotlight)
[arxiv] [code] [presentation] [poster] [twitter]
Additive Decoders for Latent Variables Identification and Cartesian-Product Extrapolation Sébastien Lachapelle*, Divyat Mahajan*, Ioannis Mitliagkas, Simon Lacoste-Julien
NeurIPS 2023 (Oral)
[arxiv] [code] [blog] [talk(conference)] [talk(reading group)] [presentation] [poster]
Interventional Causal Representation Learning Kartik Ahuja, Divyat Mahajan, Yixin Wang, Yoshua Bengio
ICML 2023 (Oral)
[arxiv] [code] [talk] [presentation] [poster]
Towards efficient representation identification in supervised learning Kartik Ahuja*, Divyat Mahajan*, Vasilis Syrgkanis, Ioannis Mitliagkas
CLeaR 2022
[arxiv] [code] [talk] [presentation] [poster]
Domain Generalization using Causal Matching Divyat Mahajan, Shruti Tople, Amit Sharma
ICML 2021 (Oral) [arxiv] [code] [talk] [presentation] [poster]
Preserving Causal Constraints in Counterfactual Explanations for Machine Learning Classifiers Divyat Mahajan, Chenhao Tan, Amit Sharma
CausalML@NeurIPS 2019 (Oral) [arxiv] [code] [talk] [presentation] [poster]

Select Awards & Honours

Outstanding Reviewer: ICML 2022, ML RC 2021, ML RC 2022
Top Reviewer: NeurIPS 2022, NeurIPS 2024
FRQNT Doctoral Scholarship: Competition 2024-25
Academic Excellence Award, IIT Kanpur: Session 2017-18

Software

RobustDG Toolkit for Building Robust ML models that generalize to unseen domains | Github | Microsoft

Diverse Counterfactual Explanations (DiCE) for ML Toolkit to generate truthful explanations for machine learning models | Github | InterpretML