DeepMind Reinforcement Learning

Published:

DeepMind and UCL: Reinforcement Learning

Course Website

Course Overview

This is my first course on reinforcement learning. It is offered by DeepMind and University College London (UCL) and covers the fundamental concepts of the field: Markov decision processes, dynamic programming, model-free prediction and control, value function approximation, policy gradient methods, and the exploration-exploitation trade-off. It also includes practical applications and case studies.

One slide impressed me a lot. In the last lecture, the instructor, David Silver, showed that computer Go was still only at grandmaster level. That was in 2015. Just one year later, he and the DeepMind team published AlphaGo, which defeated world champion Lee Sedol in 2016 and finally reached superhuman level.

My Learning Journey

  • 2026.1.21 Lecture 1: Introduction to Reinforcement Learning
  • 2026.1.29 Lecture 2: Markov Decision Processes
  • 2026.1.30 Lecture 3: Planning by Dynamic Programming
  • 2026.2.3 Lecture 4: Model-Free Prediction
  • 2026.2.4 Lecture 5: Model-Free Control
  • 2026.2.5 Lecture 6: Value Function Approximation
  • 2026.2.6 Lecture 7: Policy Gradient Methods
  • 2026.2.7 Lecture 8: Integrating Learning and Planning
  • 2026.2.8 Lecture 9: Exploration and Exploitation
  • 2026.2.9 Lecture 10: Case Study: RL in Classic Games

Tags: