Combining Evolution and Deep Reinforcement Learning for Policy Search: a Survey

Contributors:

Policy Search

  1. PEPG: Parameter-exploring policy gradients, Sehnke F et al, 2010, Neural Networks.
  2. NES: Natural evolution strategies, Wierstra D et al, 2014, The Journal of Machine Learning Research.
  3. OpenAI-ES: Evolution strategies as a scalable alternative to reinforcement learning, Salimans T et al, 2017.
  4. GA: Deep neuroevolution: Genetic algorithms are a competitive alternative for training deep neural networks for reinforcement learning, Such F P et al, 2017.
  5. NS-ES/NSR-ES/NSRA-ES: Improving exploration in evolution strategies for deep reinforcement learning via a population of novelty-seeking agents, Conti E et al, 2018, NeurIPS.
  6. TRES: Trust region evolution strategies, Liu G et al, 2019, AAAI.
  7. Guided ES: Guided evolutionary strategies: Augmenting random search with surrogate gradients, Maheswaranathan N et al, 2019, ICML.
  8. PBT: Population based training of neural networks, Jaderberg M et al, 2017.
  9. PB2: Provably efficient online hyperparameter optimization with population-based bandits, Parker-Holder J et al, 2020, NeurIPS.
  10. SEARL: Sample-efficient automated deep reinforcement learning, Franke J K H et al, 2020.
  11. DERL: Embodied intelligence via learning and evolution, Gupta A et al, 2021, Nature Communications.

Experience-guided

  1. ERQL: Bootstrapping Q-learning for robotics from neuro-evolution results, Zimmer M et al, 2017, IEEE.
  2. GEP-PG: GEP-PG: Decoupling exploration and exploitation in deep reinforcement learning algorithms, Colas C et al, 2018, ICML.
  3. ERL: Evolution-guided policy gradient in reinforcement learning, Khadka S et al, 2018, NeurIPS.
  4. CEM-RL: CEM-RL: Combining evolutionary and gradient-based methods for policy search, Pourchot A et al, 2018.
  5. CERL: Collaborative evolutionary reinforcement learning, Khadka S et al, 2019, ICML.
  6. PDERL: Proximal distilled evolutionary reinforcement learning, Bodnar C et al, 2020, AAAI.
  7. RIM: Recruitment-imitation mechanism for evolutionary reinforcement learning, Lü S et al, 2021, Information Sciences.
  8. ESAC: Maximum mutation reinforcement learning for scalable control, Suri K et al, 2020.
  9. QD-RL: QD-RL: Efficient mixing of quality and diversity in reinforcement learning, Cideron G et al, 2020.
  10. SUPE-RL: Genetic soft updates for policy evolution in deep reinforcement learning, Marchesini E et al, 2020, ICLR.

Modules-embedded

  1. PPO-CMA: PPO-CMA: Proximal policy optimization with covariance matrix adaptation, Hämäläinen P et al, 2020, IEEE.
  2. EPG: Evolved policy gradients, Houthooft R et al, 2018, NeurIPS.
  3. CGP: Q-learning for continuous actions with cross-entropy guided policies, Simmons-Edler R et al, 2019.
  4. GRAC: GRAC: Self-guided and self-regularized actor-critic, Shao L et al, 2022, CoRL.