Strategy Description
This poker agent uses a Q-learning-based model to make high-level decisions (fold, call, or raise) by evaluating the game state: hole cards, community cards, betting history, and position. A neural network approximates the Q-value of each action, and the agent selects the action with the highest expected reward.
If the chosen action is raise, the raise amount is dynamically adjusted using a Monte Carlo simulation to estimate the current win rate. The raise amount is scaled between the minimum and maximum allowed values, depending on the confidence in winning the hand.
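The raise-sizing logic described above can be sketched as follows. The estimator and the linear scaling rule are assumptions for illustration (`simulate_hand` is a hypothetical callback that plays out one randomized hand and reports a win):

```python
def estimate_win_rate(simulate_hand, n_sims=1000):
    # Monte Carlo estimate: play out many randomized hands against
    # sampled opponent holdings and count the fraction won.
    wins = sum(simulate_hand() for _ in range(n_sims))
    return wins / n_sims

def scale_raise(win_rate, min_raise, max_raise):
    # Scale the bet between the table's minimum and maximum allowed
    # raise, proportional to the estimated chance of winning.
    return min_raise + win_rate * (max_raise - min_raise)

# Example: a 50% estimated win rate bets halfway between the bounds.
print(scale_raise(0.5, min_raise=10, max_raise=100))  # -> 55.0
```

Linear interpolation is the simplest confidence-to-bet mapping; a real agent might instead use a convex curve so that only very strong hands approach the maximum raise.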
This approach combines reinforcement learning for strategic decision-making with win-rate-based heuristics for precise betting control, allowing the agent to adapt its aggression level based on hand strength while staying within the learned policy.