
CombustionRL: Reinforcement Learning for Adaptive Solver Selection in Combustion Simulations


A reinforcement learning framework for intelligently switching between CVODE and Quasi-Steady State (QSS) solvers during combustion chemistry simulations to optimize computational efficiency while maintaining accuracy constraints.

🎯 Overview

This project implements a Proximal Policy Optimization (PPO) agent that learns to dynamically switch between two numerical integrators:

  • CVODE: Robust but computationally expensive implicit solver
  • QSS: Fast quasi-steady state solver for stiff systems

The RL agent learns optimal switching strategies to minimize computational cost while maintaining solution accuracy, particularly valuable for large-scale CFD combustion simulations.

🔬 Research Context

This work addresses a critical challenge in computational fluid dynamics (CFD) combustion simulations:

  • Problem: Traditional fixed-solver approaches are either too slow (CVODE everywhere) or inaccurate (QSS everywhere)
  • Solution: Adaptive solver switching based on local solution characteristics
  • Innovation: RL learns optimal switching policies from experience rather than hand-crafted rules

πŸ—οΈ Architecture

Core Components

CombustionRL/
├── environment.py         # RL environment for solver switching
├── ppo_training.py        # PPO training pipeline
├── utils.py               # Solver utilities and integration
├── reward_model.py        # Custom reward functions
├── simple_test.py         # Benchmarking and testing
├── notebooks/             # Jupyter notebooks for analysis
├── logs/                  # Training logs and checkpoints
└── test_results/          # Performance evaluation results

RL Environment (environment.py)

The IntegratorSwitchingEnv provides a Gymnasium-compatible environment (a minimal interaction sketch follows the list below):

  • State Space: Temperature, species concentrations, solver history
  • Action Space: Discrete choice between CVODE (0) and QSS (1)
  • Reward Function: Balances computational cost vs. accuracy
  • Termination: Based on simulation completion or steady-state detection
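
A minimal interaction sketch under the standard Gymnasium API (constructor arguments mirror the Basic Usage example below; the exact observation layout is defined in environment.py):

from environment import IntegratorSwitchingEnv

env = IntegratorSwitchingEnv(
    mechanism_file='gri30.yaml',
    temp_range=(1000, 1400),
    phi_range=(0.5, 2.0),
    pressure_range=(1, 60)
)

obs, info = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # 0 = CVODE, 1 = QSS; a trained policy would choose here
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated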

Training Pipeline (ppo_training.py)

Complete PPO implementation with:

  • Policy and value networks
  • Experience collection and training
  • Comprehensive logging and monitoring
  • Checkpoint management
  • Performance evaluation

🚀 Quick Start

Prerequisites

# Core dependencies
pip install -r requirements.txt

# QSS integrator (from our published package)
pip install qss-integrator

# CVODE solver (if available)
pip install SundialsPy  # Optional

Basic Usage

from environment import IntegratorSwitchingEnv
from ppo_training import PPOTrainer

# Create environment
env = IntegratorSwitchingEnv(
    mechanism_file='gri30.yaml',
    temp_range=(1000, 1400),
    phi_range=(0.5, 2.0),
    pressure_range=(1, 60)
)

# Train PPO agent
trainer = PPOTrainer(env)
trainer.train(total_timesteps=100000)

# Evaluate trained agent
results = trainer.evaluate(num_episodes=100)

Benchmarking Solvers

from simple_test import benchmark_solver

# Compare CVODE vs QSS performance
# (config, temperature, and pressure come from your own simulation setup)
cvode_results = benchmark_solver('cvode', config, temperature, pressure)
qss_results = benchmark_solver('qss', config, temperature, pressure)

# Analyze computational efficiency
print(f"CVODE: {cvode_results['cpu_time']:.3f}s")
print(f"QSS: {qss_results['cpu_time']:.3f}s")

📊 Methodology

Problem Formulation

Objective: Minimize computational cost while maintaining solution accuracy

minimize: Σ(t_cost(solver_t))
subject to: ||y_true - y_pred|| < ε_accuracy

Where:

  • t_cost(solver_t) is the computational cost of solver choice at time t
  • ε_accuracy is the maximum allowed error tolerance

Reward Function Design

The reward function balances multiple objectives (a minimal sketch follows the list):

reward = -α * computational_cost + β * accuracy_bonus - γ * switching_penalty

  • Computational Cost: Direct CPU time measurement
  • Accuracy Bonus: Reward for maintaining solution quality
  • Switching Penalty: Discourages excessive solver switching
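
As a concrete illustration, a minimal Python sketch of this formula (the function name and default weights below are hypothetical, not the actual reward_model.py API):

def shaped_reward(cpu_time, accuracy_bonus, switched, alpha=1.0, beta=1.0, gamma=0.1):
    # alpha, beta, gamma are illustrative weights; tune them to trade cost against accuracy
    switching_penalty = 1.0 if switched else 0.0
    return -alpha * cpu_time + beta * accuracy_bonus - gamma * switching_penalty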

State Representation

The RL agent observes the following features (an assembly sketch follows the list):

  • Current temperature and species concentrations
  • Recent solver performance history
  • Local stiffness indicators
  • Simulation progress metrics
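
One plausible way to assemble these features into a fixed-length observation vector (a hypothetical sketch; the actual layout lives in environment.py):

import numpy as np

def build_observation(temperature, mass_fractions, recent_cpu_times, stiffness, progress):
    # Concatenate scalar and vector features into a single float32 observation
    return np.concatenate([
        [temperature],       # current temperature
        mass_fractions,      # species concentrations
        recent_cpu_times,    # recent solver performance history
        [stiffness],         # local stiffness indicator
        [progress],          # simulation progress metric
    ]).astype(np.float32)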

🔬 Experimental Setup

Combustion Chemistry

  • Mechanism: GRI-Mech 3.0 (53 species, 325 reactions)
  • Fuel: Methane (CH₄)
  • Oxidizer: Air (N₂:O₂ = 3.76:1)
  • Conditions: T = 1000-1400 K, P = 1-60 atm, φ = 0.5-2.0

Training Configuration

  • Algorithm: PPO with clipped objective
  • Network: 2-layer MLP (256 hidden units)
  • Learning Rate: 3e-4
  • Batch Size: 64
  • Training Episodes: 10,000+ (see the sketch below)
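
A hypothetical trainer setup mirroring these values (the keyword arguments are assumptions; check the actual PPOTrainer signature in ppo_training.py):

from environment import IntegratorSwitchingEnv
from ppo_training import PPOTrainer

env = IntegratorSwitchingEnv(mechanism_file='gri30.yaml')
trainer = PPOTrainer(
    env,
    learning_rate=3e-4,       # as listed above
    batch_size=64,
    hidden_sizes=(256, 256),  # 2-layer MLP with 256 hidden units (hypothetical kwarg)
)
trainer.train(total_timesteps=100000)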

📈 Results & Performance

Computational Efficiency

Typical performance improvements:

  • Speedup: 2-5x faster than CVODE-only
  • Accuracy: Maintains <1% error vs. reference solution
  • Adaptability: Learns domain-specific switching patterns

Learned Strategies

The RL agent discovers intuitive switching patterns:

  • QSS: During slow chemistry phases (low temperature)
  • CVODE: During ignition and fast chemistry (high temperature)
  • Adaptive: Based on local stiffness and error indicators

πŸ› οΈ Development

Running Tests

# Basic functionality test
python simple_test.py

# Training test (short run)
python ppo_training.py --timesteps 1000 --eval-freq 500

# Full benchmark
python simple_test.py --benchmark --save-results

Customization

Custom Reward Functions

from reward_model import LagrangeReward1

# Define a custom reward
class CustomReward(LagrangeReward1):
    def __init__(self, accuracy_weight=1.0, cost_weight=0.1):
        super().__init__(accuracy_weight, cost_weight)

    def compute_reward(self, state, action, next_state):
        # Your custom reward logic; as a placeholder, defer to the parent
        # implementation (assuming LagrangeReward1 defines compute_reward)
        reward = super().compute_reward(state, action, next_state)
        return reward

Environment Configuration

env = IntegratorSwitchingEnv(
    mechanism_file='your_mechanism.yaml',
    temp_range=(800, 2000),       # Custom temperature range
    phi_range=(0.3, 3.0),         # Custom equivalence ratio
    pressure_range=(0.1, 100),    # Custom pressure range
    dt_range=(1e-7, 1e-3),        # Custom time step range
    reward_function=CustomReward()
)

📚 References

Key Papers

  1. QSS Method: Mott, D., Oran, E., & van Leer, B. (2000). A Quasi-Steady-State Solver for the Stiff Ordinary Differential Equations of Reaction Kinetics. Journal of Computational Physics, 164(2), 407-428.

  2. PPO Algorithm: Schulman, J., et al. (2017). Proximal Policy Optimization Algorithms. arXiv preprint arXiv:1707.06347.

  3. Combustion Chemistry: Smith, G. P., et al. (1999). GRI-Mech 3.0. http://www.me.berkeley.edu/gri_mech/.

Related Work

  • Adaptive time-stepping in CFD
  • Machine learning for scientific computing
  • Reinforcement learning in engineering applications

🤝 Contributing

This repository is part of ongoing research. For questions or collaboration:

  • Issues: Report bugs or request features
  • Discussions: Technical questions and research discussions
  • Pull Requests: Code improvements and extensions

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • Cantera: Combustion chemistry library
  • Gymnasium: RL environment framework
  • PyTorch: Deep learning framework
  • GRI-Mech: Combustion mechanism database

Note: This repository accompanies a journal publication on adaptive solver switching in combustion CFD. For the latest research results and detailed analysis, please refer to the associated publication.
