Abstract
Training deep neural networks (DNNs) is predominantly carried out using stochastic gradient descent and its variants.
While these methods are robust and widely applicable, their convergence often deteriorates for large-scale, ill-conditioned, or stiff problems commonly encountered in scientific machine learning.
This has motivated the development of more advanced training strategies that can accelerate convergence, offer better parallelism, enable convergence control, and facilitate the automatic tuning of hyperparameters.
To this end, we introduce a novel training framework for DNNs inspired by nonlinear multilevel and domain-decomposition (ML-DD) methods.
Starting from deterministic ML-DD algorithms [1, 2], we will discuss how to ensure convergence in the presence of subsampling noise.
Moreover, we will present several strategies for constructing a hierarchy of subspaces by exploiting the properties of the network architecture, the data representation, and the loss function.
The performance of the proposed ML-DD training algorithms will be demonstrated through a series of numerical experiments from scientific machine learning, including physics-informed neural networks and operator learning.
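The parameter-space decomposition underlying such methods can be illustrated with a deliberately simplified toy sketch. The NumPy code below is a hypothetical illustration, not the algorithms of [1] or [2]: it minimizes a convex quadratic surrogate loss by splitting the parameters into two blocks ("subdomains"), taking a few local gradient steps per block with the other block frozen, and averaging the corrections (a damped additive update, which preserves descent for convex losses).

```python
import numpy as np

def dd_sweep(A, b, w, blocks, lr, local_iters=5):
    """One damped additive domain-decomposition sweep for the quadratic
    loss f(w) = 0.5 w^T A w - b^T w (a toy stand-in for a training loss).
    Each parameter block takes a few local gradient steps with the other
    blocks frozen; averaging the corrections keeps f non-increasing."""
    corrections = np.zeros_like(w)
    for idx in blocks:
        w_loc = w.copy()
        for _ in range(local_iters):
            grad = A @ w_loc - b           # gradient of the quadratic loss
            w_loc[idx] -= lr * grad[idx]   # update only this subdomain
        corrections[idx] = w_loc[idx] - w[idx]
    return w + corrections / len(blocks)   # damping by the block count

rng = np.random.default_rng(0)
n = 8
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)                # SPD toy "Hessian"
b = rng.standard_normal(n)
f = lambda v: 0.5 * v @ A @ v - b @ v

lr = 1.0 / np.linalg.eigvalsh(A)[-1]       # stable local step size
blocks = [np.arange(0, 4), np.arange(4, 8)]  # two parameter subdomains
w = np.zeros(n)
for _ in range(50):
    w = dd_sweep(A, b, w, blocks, lr)
# w approaches the minimizer of f
```

The actual ML-DD framework operates on nonconvex, subsampled network losses with a multilevel hierarchy of such decompositions; this sketch only conveys the block-wise, parallelizable structure of the updates.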
[1] Gratton, S., Kopaničáková, A., & Toint, P. L. (2023). Multilevel objective-function-free optimization with an application to neural networks training. SIAM Journal on Optimization, 33(4), 2772-2800.
[2] Gratton, S., Kopaničáková, A., & Toint, P. L. (2025). Recursive bound-constrained AdaGrad with applications to multilevel and domain decomposition minimization. arXiv preprint arXiv:2507.11513.