A reinforcement learning framework for intelligently switching between CVODE and Quasi-Steady State (QSS) solvers during combustion chemistry simulations to optimize computational efficiency while maintaining accuracy constraints.
This project implements a Proximal Policy Optimization (PPO) agent that learns to dynamically switch between two numerical integrators:
- CVODE: Robust but computationally expensive implicit solver
- QSS: Fast quasi-steady state solver for stiff systems
The RL agent learns switching strategies that minimize computational cost while preserving solution accuracy, which is particularly valuable for large-scale CFD combustion simulations.
This work addresses a critical challenge in computational fluid dynamics (CFD) combustion simulations:
- Problem: Traditional fixed-solver approaches are either too slow (CVODE everywhere) or inaccurate (QSS everywhere)
- Solution: Adaptive solver switching based on local solution characteristics
- Innovation: RL learns optimal switching policies from experience rather than hand-crafted rules
```
CombustionRL/
├── environment.py     # RL environment for solver switching
├── ppo_training.py    # PPO training pipeline
├── utils.py           # Solver utilities and integration
├── reward_model.py    # Custom reward functions
├── simple_test.py     # Benchmarking and testing
├── notebooks/         # Jupyter notebooks for analysis
├── logs/              # Training logs and checkpoints
└── test_results/      # Performance evaluation results
```
The IntegratorSwitchingEnv provides a Gymnasium-compatible environment:
- State Space: Temperature, species concentrations, solver history
- Action Space: Discrete choice between CVODE (0) and QSS (1)
- Reward Function: Balances computational cost vs. accuracy
- Termination: Based on simulation completion or steady-state detection
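The full IntegratorSwitchingEnv wraps real chemistry integrators, so its internals are not reproduced here. As a rough illustration of the interface described above, here is a dependency-free toy that mimics the Gymnasium reset/step signatures (the class name, observation layout, and all numeric values are illustrative, not the repository's):

```python
class ToySwitchingEnv:
    """Toy stand-in for IntegratorSwitchingEnv (illustrative only)."""
    CVODE, QSS = 0, 1  # discrete action space

    def __init__(self, max_steps=10):
        self.max_steps = max_steps

    def reset(self, seed=None, options=None):
        self.step_count = 0
        self.temp = 1000.0          # K; start of the temperature range
        self.last_action = self.CVODE
        return self._obs(), {}

    def _obs(self):
        # Temperature plus solver history; the real env also includes
        # species concentrations and stiffness indicators.
        return [self.temp, float(self.last_action)]

    def step(self, action):
        # Stand-in for an actual CVODE/QSS sub-step: QSS is cheaper.
        cpu_cost = 1.0 if action == self.CVODE else 0.2
        self.temp += 10.0
        self.last_action = action
        self.step_count += 1
        reward = -cpu_cost          # cost term only, for brevity
        terminated = self.step_count >= self.max_steps
        return self._obs(), reward, terminated, False, {}
```

The five-tuple return `(obs, reward, terminated, truncated, info)` follows the Gymnasium convention, so a trainer written against the real environment should also accept this toy.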
Complete PPO implementation with:
- Policy and value networks
- Experience collection and training
- Comprehensive logging and monitoring
- Checkpoint management
- Performance evaluation
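The core of any PPO implementation is the clipped surrogate objective. As a sketch of what the trainer optimizes (shown here for a single probability-ratio/advantage pair in pure Python; the repository's trainer presumably works on batched tensors):

```python
def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A).

    ratio     -- pi_new(a|s) / pi_old(a|s)
    advantage -- estimated advantage of the taken action
    eps       -- clipping range (0.2 is the common default)
    """
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)
```

The `min` makes the objective pessimistic: a policy update can never gain more than the clipped term allows, which is what keeps PPO updates stable.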
```bash
# Core dependencies
pip install -r requirements.txt

# QSS integrator (from our published package)
pip install qss-integrator

# CVODE solver (if available)
pip install SundialsPy  # Optional
```

```python
from environment import IntegratorSwitchingEnv
from ppo_training import PPOTrainer

# Create environment
env = IntegratorSwitchingEnv(
    mechanism_file='gri30.yaml',
    temp_range=(1000, 1400),
    phi_range=(0.5, 2.0),
    pressure_range=(1, 60)
)

# Train PPO agent
trainer = PPOTrainer(env)
trainer.train(total_timesteps=100000)

# Evaluate trained agent
results = trainer.evaluate(num_episodes=100)
```

```python
from simple_test import benchmark_solver

# Compare CVODE vs QSS performance
cvode_results = benchmark_solver('cvode', config, temperature, pressure)
qss_results = benchmark_solver('qss', config, temperature, pressure)

# Analyze computational efficiency
print(f"CVODE: {cvode_results['cpu_time']:.3f}s")
print(f"QSS: {qss_results['cpu_time']:.3f}s")
```

Objective: Minimize computational cost while maintaining solution accuracy
```
minimize:    Σ_t t_cost(solver_t)
subject to:  ||y_true - y_pred|| < ε_accuracy
```

Where t_cost(solver_t) is the computational cost of the solver chosen at time t, and ε_accuracy is the maximum allowed error tolerance.
The reward function balances multiple objectives:
```
reward = -α * computational_cost + β * accuracy_bonus - γ * switching_penalty
```

- Computational Cost: Direct CPU time measurement
- Accuracy Bonus: Reward for maintaining solution quality
- Switching Penalty: Discourages excessive solver switching
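The reward formula above can be sketched as a small function (the coefficients and the binary accuracy bonus used here are illustrative, not the repository's defaults):

```python
def switching_reward(cpu_time, error, switched,
                     alpha=1.0, beta=0.5, gamma=0.1, eps_accuracy=0.01):
    """reward = -alpha * cost + beta * accuracy_bonus - gamma * switch_penalty.

    cpu_time -- measured CPU time of the step (seconds)
    error    -- estimated local error vs. the reference solution
    switched -- True if the solver changed relative to the previous step
    """
    accuracy_bonus = 1.0 if error < eps_accuracy else 0.0
    switch_penalty = 1.0 if switched else 0.0
    return -alpha * cpu_time + beta * accuracy_bonus - gamma * switch_penalty
```

With these weights, a cheap, accurate step without a solver change scores highest, while a step that is both expensive and triggers a switch is penalized twice.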
The RL agent observes:
- Current temperature and species concentrations
- Recent solver performance history
- Local stiffness indicators
- Simulation progress metrics
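Flattening those observed quantities into a single feature vector might look like the following (the exact layout inside IntegratorSwitchingEnv may differ; this function is a sketch):

```python
def build_observation(temp, species, recent_cpu_times, stiffness, progress):
    """Concatenate observed quantities into one flat feature vector.

    temp             -- current temperature (K)
    species          -- iterable of species concentrations
    recent_cpu_times -- recent per-step solver costs (performance history)
    stiffness        -- local stiffness indicator
    progress         -- simulation progress in [0, 1]
    """
    return [temp] + list(species) + list(recent_cpu_times) + [stiffness, progress]
```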
- Mechanism: GRI-Mech 3.0 (53 species, 325 reactions)
- Fuel: Methane (CH₄)
- Oxidizer: Air (N₂:O₂ = 3.76:1)
- Conditions: T = 1000-1400 K, P = 1-60 atm, φ = 0.5-2.0
- Algorithm: PPO with clipped objective
- Network: 2-layer MLP (256 hidden units)
- Learning Rate: 3e-4
- Batch Size: 64
- Training Episodes: 10,000+
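Gathered as a plain config dict, the settings above would read as follows (a sketch; the trainer's actual argument names, and the `clip_eps` value, are assumptions):

```python
ppo_config = {
    "algorithm": "PPO",          # clipped-objective PPO
    "hidden_sizes": (256, 256),  # 2-layer MLP, 256 hidden units each
    "learning_rate": 3e-4,
    "batch_size": 64,
    "clip_eps": 0.2,             # standard PPO clipping range (assumed)
    "min_episodes": 10_000,      # training episodes
}
```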
Typical performance improvements:
- Speedup: 2-5x faster than CVODE-only
- Accuracy: Maintains <1% error vs. reference solution
- Adaptability: Learns domain-specific switching patterns
The RL agent discovers intuitive switching patterns:
- QSS: During slow chemistry phases (low temperature)
- CVODE: During ignition and fast chemistry (high temperature)
- Adaptive: Based on local stiffness and error indicators
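For intuition, the first two patterns resemble a hand-crafted temperature-threshold rule of the kind the RL policy is meant to improve on (the threshold value here is hypothetical):

```python
def heuristic_solver_choice(temp, ignition_temp=1500.0):
    """Baseline rule: QSS (1) during slow, low-temperature chemistry;
    CVODE (0) near ignition and fast chemistry.
    The 1500 K threshold is illustrative, not a fitted value."""
    return 0 if temp >= ignition_temp else 1
```

Unlike this fixed rule, the learned policy also conditions on stiffness and error indicators, so it can switch at different temperatures in different regimes.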
```bash
# Basic functionality test
python simple_test.py

# Training test (short run)
python ppo_training.py --timesteps 1000 --eval-freq 500

# Full benchmark
python simple_test.py --benchmark --save-results
```

```python
from reward_model import LagrangeReward1

# Define custom reward
class CustomReward(LagrangeReward1):
    def __init__(self, accuracy_weight=1.0, cost_weight=0.1):
        super().__init__(accuracy_weight, cost_weight)

    def compute_reward(self, state, action, next_state):
        # Your custom reward logic goes here
        reward = 0.0
        return reward
```

```python
env = IntegratorSwitchingEnv(
    mechanism_file='your_mechanism.yaml',
    temp_range=(800, 2000),      # Custom temperature range
    phi_range=(0.3, 3.0),        # Custom equivalence ratio
    pressure_range=(0.1, 100),   # Custom pressure range
    dt_range=(1e-7, 1e-3),       # Custom time step range
    reward_function=CustomReward()
)
```
- QSS Method: Mott, D., Oran, E., & van Leer, B. (2000). A Quasi-Steady-State Solver for the Stiff Ordinary Differential Equations of Reaction Kinetics. Journal of Computational Physics, 164(2), 407-428.
- PPO Algorithm: Schulman, J., et al. (2017). Proximal Policy Optimization Algorithms. arXiv preprint arXiv:1707.06347.
- Combustion Chemistry: Smith, G. P., et al. (1999). GRI-Mech 3.0. http://www.me.berkeley.edu/gri_mech/.
- Adaptive time-stepping in CFD
- Machine learning for scientific computing
- Reinforcement learning in engineering applications
This repository is part of ongoing research. For questions or collaboration:
- Issues: Report bugs or request features
- Discussions: Technical questions and research discussions
- Pull Requests: Code improvements and extensions
This project is licensed under the MIT License - see the LICENSE file for details.
- Cantera: Combustion chemistry library
- Gymnasium: RL environment framework
- PyTorch: Deep learning framework
- GRI-Mech: Combustion mechanism database
Note: This repository accompanies a journal publication on adaptive solver switching in combustion CFD. For the latest research results and detailed analysis, please refer to the associated publication.