Reinforcement learning agent for the board game Splendor, built with a custom Gymnasium environment and a PyTorch actor-critic network trained via self-play.
ai.Splendor/
├── game/ # Splendor game engine + Gymnasium environment (splendor-game)
└── ai/ # PyTorch model, agent, and training loop (splendor-ai)
Requires uv and Python ≥ 3.10.
uv sync

This installs both workspace packages (splendor-game and splendor-ai) in editable mode.
uv run python -m ai.train
# or via the entry point:
uv run splendor-train --episodes 5000 --lr 3e-4 --checkpoint-dir checkpoints

Checkpoints are saved to checkpoints/ every 500 episodes by default.
from splendor.env import SplendorEnv
env = SplendorEnv(num_players=2)
obs, info = env.reset(seed=42)
print(obs.shape) # (146,)
print(len(info["legal_actions"]))
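
The `legal_actions` entry in `info` is typically what a policy needs in order to avoid sampling illegal moves. The sketch below shows one common way to do that: a softmax over the policy logits that puts zero probability on illegal actions. It assumes `legal_actions` is a list of integer action indices, which this README does not confirm. `masked_softmax` is a hypothetical helper, not part of splendor-game or splendor-ai, and it is written in plain Python rather than PyTorch so it runs without the workspace installed.

```python
import math


def masked_softmax(logits, legal_actions):
    """Return action probabilities with zero mass on illegal actions.

    Assumption: `legal_actions` holds integer indices into `logits`.
    """
    legal = set(legal_actions)
    if not legal:
        raise ValueError("legal_actions must not be empty")
    # Subtract the largest legal logit for numerical stability.
    peak = max(logits[a] for a in legal)
    exps = [
        math.exp(x - peak) if i in legal else 0.0
        for i, x in enumerate(logits)
    ]
    total = sum(exps)
    return [e / total for e in exps]


# Four hypothetical actions, of which only 1 and 3 are legal.
probs = masked_softmax([2.0, 0.5, -1.0, 0.5], legal_actions=[1, 3])
print(probs)  # [0.0, 0.5, 0.0, 0.5]
```

Action 0 has the largest logit but receives no probability because it is not legal. The two legal actions have equal logits and so split the mass evenly.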