Why do Policy Gradient Methods work so well in Cooperative MARL? Evidence from Policy Representation
5 Mins read
In cooperative multi-agent reinforcement learning (MARL), policy gradient (PG) methods are, because of their on-policy nature, typically believed to be less sample…