DiscoRL: When Algorithms Learn to Design Algorithms
15 min read
DeepMind's DiscoRL discovers reinforcement learning algorithms that outperform hand-designed methods like PPO and DQN. By treating algorithm design as a meta-learning problem, it finds alternatives to value functions and bootstrapping through optimization alone.