Sheng Zha

Head of Model Architecture & Training Advancement, Amazon AGI

I build foundation models, open-source AI frameworks, and the teams behind them.


About

I lead the pretraining research team at Amazon AGI that supports Amazon Nova, focusing on principled scaling, model architecture, optimization, and novel pretraining objectives. My work centers on the co-evolution of algorithms and systems, the critical intersection that makes AI more capable, efficient, and accessible. My team also built the models behind Amazon Q and Titan, as well as the distributed training infrastructure powering Amazon Bedrock and SageMaker HyperPod.

I built this team from zero, starting in 2018 with a focus on distributed training and shared representations. What started as a small group of tech leaders and hackers grew into the engine behind foundation models serving millions through AWS.

Before that, I shaped the open-source AI ecosystem as VP and PMC Chair of Apache MXNet, where I co-authored the Gluon interface. I founded GluonNLP—the first toolkit to reproduce BERT with record-setting training speeds. I served on the ONNX Steering Committee and co-founded the Python Data API Standards Consortium. I believe accessible tools and open standards are essential for an AI future that benefits everyone.

Throughout this journey, I’ve maintained a core belief: AI should amplify human agency and ingenuity, not replace it. This principle guides my approach to both research and leadership. My leadership philosophy centers on coaching and enabling team members to grow into leaders themselves, creating a multiplier effect that has accelerated our innovation.

I hold an MS in Computer Science from the University of Maryland and a BS from Shanghai Jiao Tong University.


What I’ve Built

Amazon Nova & Foundation Model Stack

Amazon AGI, 2024–present

Leading the pretraining research team that supports Amazon Nova. The team drives principled scaling, model architecture, optimization, and novel pretraining objectives, co-designing algorithms and systems to reduce the cost of intelligence across the stack.

Foundation Models for AWS AI Services

AWS, 2018–2024

Built the team and its models from zero. Developed and deployed foundation models underpinning Amazon Q (formerly CodeWhisperer), Titan, Lex, Comprehend, and Kendra.

Distributed Training Infrastructure

AWS, 2018–2023

Contributed core technology for scalable, fault-resilient training infrastructure, including SageMaker HyperPod, with a focus on efficient resource utilization when training at scale.

Apache MXNet

VP & PMC Chair, 2016–2023

Co-authored the Gluon API, an imperative, Pythonic interface that enabled just-in-time compilation for high performance without sacrificing usability. As VP and PMC Chair, led releases, project maintenance, and community engagement.

20.8k GitHub stars
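For the unfamiliar, a minimal sketch of the Gluon idea against the MXNet 1.x API: write the model imperatively, then call hybridize() to compile it into a static graph for speed.

```python
import mxnet as mx
from mxnet.gluon import nn

# Define the network imperatively, like plain Python.
net = nn.HybridSequential()
net.add(nn.Dense(128, activation='relu'),
        nn.Dense(10))
net.initialize()

# hybridize() compiles the imperative definition into a static
# computation graph for speed, without changing how the model is written.
net.hybridize()

out = net(mx.nd.random.uniform(shape=(4, 784)))
print(out.shape)  # (4, 10)
```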

GluonNLP

Founded, 2018–2023

Created a deep-learning NLP toolkit built on the Gluon interface. First toolkit to reproduce BERT, with record-setting training speed, accelerating NLP research across the community.

2.5k GitHub stars
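As a taste of the toolkit, a sketch against the GluonNLP 0.x interface: fetching a pretrained BERT and its matching vocabulary was a single call.

```python
import gluonnlp as nlp

# Download a pretrained 12-layer BERT base model together with the
# vocabulary it was trained on (GluonNLP 0.x API).
model, vocab = nlp.model.get_model(
    'bert_12_768_12',
    dataset_name='book_corpus_wiki_en_uncased',
    pretrained=True,
    use_decoder=False,
    use_classifier=False)
```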

ONNX

Steering Committee

An open standard for ML model interoperability and a graduated LF AI & Data project, enabling framework-agnostic model deployment with hardware-optimized runtimes.

20.5k GitHub stars
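A minimal sketch of that interoperability, using PyTorch's exporter and onnxruntime as one illustrative pairing (any compliant framework and runtime would do):

```python
import torch
import onnxruntime as ort

# Export a model from one framework into the neutral ONNX format.
model = torch.nn.Linear(4, 2)
dummy = torch.randn(1, 4)
torch.onnx.export(model, dummy, "model.onnx")

# Run it with a hardware-optimized runtime, no PyTorch required.
session = ort.InferenceSession("model.onnx")
input_name = session.get_inputs()[0].name
(result,) = session.run(None, {input_name: dummy.numpy()})
print(result.shape)  # (1, 2)
```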

Python Data API Standards

Founding Member

Consortium standardizing array APIs across NumPy, PyTorch, JAX, TensorFlow, CuPy, and Dask. Demonstrated a 45x GPU speedup for scikit-learn via interoperability; published at SciPy 2023.
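A sketch of what the standard enables: code written once against the standard namespace runs unchanged on any compliant backend. The __array_namespace__() protocol is part of the published standard; NumPy implements it from version 2.0.

```python
import numpy as np  # NumPy >= 2.0 implements the array API standard

def softmax(x):
    # Ask the array for the namespace of the library that produced it,
    # so the same function runs on any backend implementing the protocol.
    xp = x.__array_namespace__()
    e = xp.exp(x - xp.max(x, axis=-1, keepdims=True))
    return e / xp.sum(e, axis=-1, keepdims=True)

print(softmax(np.asarray([1.0, 2.0, 3.0])))
```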

ML Platform for Fraud Detection

Amazon TRMS, 2013–2015

Designed a horizontally scalable machine-learning platform and graph-based ML solutions for fraud and abuse detection. Built high-availability key-value stores with an expressive transformation DSL for real-time feature engineering.


Selected Publications


Talks & Interviews