Congestion-aware Multi-agent Reinforcement Learning for Wildfire Evacuation Routing
Keywords: Wildfire evacuation, Multi-agent reinforcement learning, Adaptive vehicle routing, Disaster response
Abstract. Conventional navigation systems often cause severe bottlenecks during mass wildfire evacuations because they route vehicles onto the same capacity-limited corridors and ignore advancing flame fronts. This paper introduces a congestion-aware multi-agent reinforcement learning (MARL) framework that models each road intersection as an independent Q-learning agent, balancing route efficiency against strict hazard avoidance. During deployment, a batch-sequential mechanism adjusts the learned policies using real-time traffic, dispersing vehicles away from overloaded roads. Evaluated on the real-world road network and parcel data of Lytton, British Columbia, the framework reduces peak edge congestion by 74% compared with conventional fastest-path algorithms and achieves complete fire-zone avoidance, at the cost of only a 7.4% increase in mean travel distance. These results demonstrate that distributed MARL policies yield significantly safer, more balanced, and highly scalable evacuation flows.
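To make the two mechanisms named in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a toy five-node network (not the Lytton data), a reward equal to negative edge length with a large fixed penalty on fire-zone edges, a standard tabular Q-learning update, and a deployment score of the form Q minus beta times load over capacity. The names GRAPH, FIRE_EDGES, FIRE_PENALTY, CAPACITY, train, and deploy, and all numeric values, are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical toy network (not the Lytton data): node -> {neighbor: length in km}.
GRAPH = {
    "A": {"B": 1.0, "C": 1.2, "F": 0.5},
    "B": {"EXIT": 1.0},
    "C": {"EXIT": 1.0},
    "F": {"EXIT": 0.5},
    "EXIT": {},
}
FIRE_EDGES = {("A", "F")}           # edges assumed to lie inside the fire zone
FIRE_PENALTY = 100.0                # assumed hazard penalty, far above any path length
CAPACITY = defaultdict(lambda: 10)  # assumed uniform edge capacity (vehicles)


def reward(u, v):
    """Negative travel distance, plus a large penalty for entering the fire zone."""
    return -GRAPH[u][v] - (FIRE_PENALTY if (u, v) in FIRE_EDGES else 0.0)


def train(episodes=2000, alpha=0.5, gamma=1.0, seed=0):
    """Each intersection keeps its own Q-table over its outgoing edges."""
    rng = random.Random(seed)
    q = {u: {v: 0.0 for v in nbrs} for u, nbrs in GRAPH.items()}
    starts = [u for u in GRAPH if u != "EXIT"]
    for _ in range(episodes):
        u = rng.choice(starts)
        while u != "EXIT":
            # Uniform random exploration; Q-learning is off-policy, so this is allowed.
            v = rng.choice(list(GRAPH[u]))
            best_next = max(q[v].values(), default=0.0)  # 0 at the terminal node
            q[u][v] += alpha * (reward(u, v) + gamma * best_next - q[u][v])
            u = v
    return q


def deploy(q, origins, batch_size=5, beta=1.0):
    """Route vehicles batch by batch; edge loads are refreshed only between batches."""
    load = defaultdict(int)
    for i in range(0, len(origins), batch_size):
        snapshot = dict(load)  # traffic state seen by every vehicle in this batch
        for u in origins[i:i + batch_size]:
            while u != "EXIT":
                # Learned value minus an assumed congestion penalty on the next edge.
                v = max(
                    q[u],
                    key=lambda n: q[u][n] - beta * snapshot.get((u, n), 0) / CAPACITY[(u, n)],
                )
                load[(u, v)] += 1
                u = v
    return load


if __name__ == "__main__":
    q_tables = train()
    edge_load = deploy(q_tables, ["A"] * 20)
    print(dict(edge_load))
```

In this toy setting the shortest route from A passes through the fire edge, so a plain fastest-path router would choose it, while the learned Q-values steer away from it. The congestion term then alternates successive batches between the two safe routes, instead of sending every vehicle down the single best one.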
