DeepMind Reinforcement Learning

Published: February 09, 2026

Course Website

Course Overview

This is my first course on reinforcement learning. It is offered by DeepMind and University College London (UCL). The course covers the fundamental concepts of reinforcement learning: Markov decision processes, dynamic programming, model-free prediction and control, value function approximation, policy gradient methods, and the exploration-exploitation trade-off. It also includes practical applications and case studies.

One slide impressed me a lot. In the last lecture, the instructor, David Silver, showed that computer Go was still at grandmaster level. That was in 2015. Just one year later, he and the DeepMind team published AlphaGo, which defeated the world champion Lee Sedol in 2016 and finally reached superhuman level.

My Learning Journey

2026.1.21 Lecture 1: Introduction to Reinforcement Learning
2026.1.29 Lecture 2: Markov Decision Processes
2026.1.30 Lecture 3: Planning by Dynamic Programming
2026.2.3 Lecture 4: Model-Free Prediction
2026.2.4 Lecture 5: Model-Free Control
2026.2.5 Lecture 6: Value Function Approximation
2026.2.6 Lecture 7: Policy Gradient Methods
2026.2.7 Lecture 8: Integrating Learning and Planning
2026.2.8 Lecture 9: Exploration and Exploitation
2026.2.9 Lecture 10: Case Study: RL in Classic Games

Kai Cheng
