Reinforcement Learning Specialization
Why There’s No Better Time to Learn RL
The recent DeepSeek hype showed the world what many of us already suspected: reinforcement learning is transforming how we build AI systems. Seeing DeepSeek use RL to achieve remarkable performance gains was what first inspired me to dive deep into this field.
Two YouTube resources particularly sparked my RL learning journey:
- Gonkee’s detailed explanation provided incredible insights into the practical applications of RL
- Mutual Information’s 6-video series offered a clear, intuitive understanding of core concepts
After absorbing these materials, I knew I needed formal training. I enrolled in this University of Alberta specialization and also purchased Sutton and Barto’s legendary “Reinforcement Learning: An Introduction”. The fact that Richard Sutton and Andrew Barto won the Turing Award for their foundational work in RL further validated my decision to pursue this path seriously.
Course Topics
Fundamentals of Reinforcement Learning:
- k-armed bandit problems and exploration-exploitation trade-offs
- Finite Markov Decision Processes (MDPs)
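To make the bandit setting concrete, here is a minimal sketch of an epsilon-greedy agent on a k-armed Gaussian bandit. The arm means, step count, and epsilon are illustrative assumptions of mine, not values from the course:

```python
import random

def epsilon_greedy_bandit(true_means, steps=1000, epsilon=0.1, seed=0):
    """Sample-average epsilon-greedy agent on a k-armed Gaussian bandit.
    true_means are hypothetical arm means; rewards get unit Gaussian noise."""
    rng = random.Random(seed)
    k = len(true_means)
    q = [0.0] * k   # estimated action values
    n = [0] * k     # pull counts per arm
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(k)                    # explore: random arm
        else:
            a = max(range(k), key=lambda i: q[i])   # exploit: greedy arm
        reward = rng.gauss(true_means[a], 1.0)
        n[a] += 1
        q[a] += (reward - q[a]) / n[a]              # incremental sample mean
        total += reward
    return q, total / steps

q, avg = epsilon_greedy_bandit([0.2, 0.5, 1.0])
```

The incremental-mean update is the same trick that later reappears in TD methods: nudge the estimate toward each new sample rather than storing all rewards.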
Sample-based Learning Methods:
- Monte Carlo methods for learning from complete episodes
- Temporal-Difference (TD) learning for online learning
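As a sketch of TD learning, here is tabular TD(0) prediction on the classic 5-state random walk (Sutton and Barto, Example 6.2), whose true state values are 1/6 through 5/6. The step size and episode count are my own illustrative choices:

```python
import random

def td0_random_walk(episodes=5000, alpha=0.05, seed=1):
    """TD(0) prediction on the 5-state random walk.
    States A..E (indices 0..4); episodes start in the middle;
    reaching the right terminal yields reward 1, the left yields 0."""
    rng = random.Random(seed)
    V = [0.5] * 5                  # initial value estimates
    for _ in range(episodes):
        s = 2                      # start in state C
        while True:
            s_next = s + (1 if rng.random() < 0.5 else -1)
            if s_next == 5:        # right terminal, reward 1
                V[s] += alpha * (1.0 - V[s])
                break
            if s_next == -1:       # left terminal, reward 0
                V[s] += alpha * (0.0 - V[s])
                break
            V[s] += alpha * (V[s_next] - V[s])  # TD(0) update, gamma = 1
            s = s_next
    return V

V = td0_random_walk()  # estimates should approach 1/6 .. 5/6
```

Unlike Monte Carlo, the update happens after every step, bootstrapping from the next state's current estimate instead of waiting for the episode's return.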
Prediction and Control with Function Approximation:
- Scaling RL to large state spaces
- Deep Q-Networks and value function approximation
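The step from tables to function approximation can be sketched with a single semi-gradient TD(0) update under a linear value function v(s) = w · x(s); the feature vectors and hyperparameters here are assumptions for illustration:

```python
def semi_gradient_td0(w, x, x_next, reward, alpha=0.01, gamma=0.99, terminal=False):
    """One semi-gradient TD(0) step with linear value approximation.
    w: weight vector, x / x_next: feature vectors of current and next state."""
    v = sum(wi * xi for wi, xi in zip(w, x))
    v_next = 0.0 if terminal else sum(wi * xi for wi, xi in zip(w, x_next))
    delta = reward + gamma * v_next - v          # TD error
    # gradient of v(s) w.r.t. w is just x(s) in the linear case
    return [wi + alpha * delta * xi for wi, xi in zip(w, x)]
```

Deep Q-Networks follow the same pattern, except the linear map is replaced by a neural network and the gradient comes from backpropagation.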
Policy Gradient Methods:
- Direct policy optimization
- Deep Reinforcement Learning with neural networks
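To illustrate direct policy optimization, here is a toy REINFORCE agent on a two-armed bandit with a single sigmoid-parameterized policy and a running-average baseline. The arm means, learning rate, and step count are illustrative assumptions:

```python
import math
import random

def reinforce_two_armed(true_means=(0.0, 1.0), steps=2000, alpha=0.1, seed=0):
    """REINFORCE on a two-armed bandit.
    theta is the preference for arm 1, so pi(arm 1) = sigmoid(theta)."""
    rng = random.Random(seed)
    theta = 0.0
    baseline = 0.0
    for t in range(1, steps + 1):
        p1 = 1.0 / (1.0 + math.exp(-theta))          # probability of arm 1
        a = 1 if rng.random() < p1 else 0
        r = rng.gauss(true_means[a], 1.0)            # noisy reward
        baseline += (r - baseline) / t               # running mean baseline
        grad_log_pi = (1.0 - p1) if a == 1 else -p1  # d log pi(a) / d theta
        theta += alpha * (r - baseline) * grad_log_pi
    return theta

theta = reinforce_two_armed()  # preference should move toward the better arm
```

Instead of estimating values and acting greedily, the policy parameter itself is pushed in the direction that makes better-than-baseline actions more probable.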
Application to My Research
This specialization is building the theoretical foundation I need for my research in federated learning and distributed systems. I’m particularly interested in:
- Formulating extreme client heterogeneity as multi-armed bandit problems
- Applying RL frameworks for adaptive client selection in federated learning
- Understanding sequential decision-making in complex, heterogeneous environments
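As a purely hypothetical sketch of the client-selection idea above, a UCB1-style rule could treat each federated client as a bandit arm whose "reward" is some observed utility (e.g., validation improvement contributed by that client). Everything here, including the utility signal, is my own assumption, not an established method:

```python
import math

def ucb_select(counts, mean_rewards, t, c=2.0):
    """UCB1-style client selection: pick the client with the highest
    upper confidence bound on its estimated utility.
    counts[i]: times client i was selected; mean_rewards[i]: its mean utility;
    t: current round; c: exploration coefficient."""
    best, best_score = 0, float("-inf")
    for i, (n, q) in enumerate(zip(counts, mean_rewards)):
        if n == 0:
            return i                                 # try each client at least once
        score = q + c * math.sqrt(math.log(t) / n)   # optimism bonus shrinks with n
        if score > best_score:
            best, best_score = i, score
    return best
```

With equal selection counts the bonus is identical for every client, so the rule reduces to picking the highest observed utility; rarely selected clients keep a larger bonus and get revisited.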
The intersection of RL and federated learning is where I see tremendous potential for innovation, and this specialization is equipping me with the tools to explore that frontier.
Status: Currently enrolled and actively learning
Expected Completion: December 2025 or January 2026
