Course Overview

Instructor: DOROZHKO Anton

An introductory Reinforcement Learning course for the Big Data Analytics program at Novosibirsk State University. This course covers the basics and some advanced topics of RL and Deep RL.

Course Pre-Requisites

  • Probability theory
  • Python
  • Tensorflow / PyTorch

Course Staff

For questions about assignments, send an e-mail to dorozhko.a@gmail.com. For sensitive issues, please e-mail DOROZHKO Anton directly.

Always include the prefix [NSU_RL101_2020] in the subject line of your e-mail.


Schedule

  • Lesson 1 (12/05/2020): Introduction to Reinforcement Learning & course overview. Slides, References.
  • Lesson 2 (16/05/2020): Markov Decision Processes, Value and Policy Iteration. Slides, References.
  • Lesson 3 (20/05/2020): Model-Free Q-learning with TD and MC. Slides, References. Lab: continue with Value and Policy Iteration (Colab).
  • Lesson 4 (23/05/2020): Model-Free Control. Slides, References.
  • Lesson 5 (27/05/2020): Value Function Approximation. Slides.
  • Lesson 6 (30/05/2020): Multi-Armed Bandits. Slides, References.
  • Lesson 7 (30/05/2020): Policy Gradient. Slides, References.

All lessons are taught by DOROZHKO Anton.
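For those starting the Value and Policy Iteration lab, here is a minimal sketch of tabular value iteration. The two-state MDP below (transition probabilities `P`, rewards `R`, discount `gamma`) is made up purely for illustration and is not part of the course materials:

```python
import numpy as np

# Toy MDP (made up for illustration): 2 states, 2 actions.
# P[s, a, s'] = transition probability, R[s, a] = expected reward.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.0, 1.0], [0.5, 0.5]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
gamma = 0.9

V = np.zeros(2)
for _ in range(1000):
    # Bellman optimality backup:
    # Q(s, a) = R(s, a) + gamma * sum_s' P(s, a, s') * V(s')
    Q = R + gamma * (P @ V)      # shape: (states, actions)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = Q.argmax(axis=1)        # greedy policy w.r.t. the final values
```

Policy iteration differs only in that it alternates full policy evaluation with greedy improvement; value iteration folds both into a single backup per sweep.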

Grading Policy

The mapping to grades is coming soon.

  • Lab sessions
  • Project: review of a proposed article (in groups of 2)

To get the credits for this course you must finish all labs and submit a project.

To submit a task, share your Colab notebook via e-mail or Piazza. Add the prefix [NSU_RL101_2020] and your name to the subject.

Requirements and the corresponding grade:

  • 5 labs + project: 5
  • 4 labs + project: 4
  • 3 labs + project: 3

DEADLINE for all labs and the project: 12.06.2020, 19:00 (GMT+7, Novosibirsk time)

Communication

We believe students often learn an enormous amount from each other as well as from us, the course staff. Therefore, to facilitate discussion and peer learning, we ask that you use Piazza for all questions related to lectures, homework, and projects.

You can earn up to 5% extra credit by answering other students' questions on Piazza in a substantial and helpful way.

Academic Collaboration & Misconduct

I care about academic collaboration and misconduct because it is important both that we are able to evaluate your own work (independent of your peers') and because not claiming others' work as your own is an important part of integrity in your future career. I understand that different institutions and locations can have different definitions of what forms of collaborative behavior are considered acceptable.

In this class, for written homework problems, you are welcome to discuss ideas with others, but you are expected to write up your own solutions independently (without referring to another's solutions). For coding, you are allowed to do projects in groups of 2, but for any other collaboration you may only share the input-output behavior of your programs. This encourages you to work separately but share ideas on how to test your implementations. Please remember that if you share your solution with another student, you are violating the honor code, even if you did not copy from them yourself.

As for the final project, you are welcome to combine it with a project from another class, assuming the project is relevant to both classes and you obtain prior permission from the class instructors. If your project is an extension of a previous class project, you are expected to make significant additional contributions to it.

References

  1. Sutton & Barto, Reinforcement Learning: An Introduction
  2. Bertsekas, Dynamic Programming and Optimal Control, Vols I and II
  3. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming

Other courses

  1. Stanford CS234
  2. Berkeley RL
  3. DeepMind lectures
  4. David Silver's course
  5. Yandex Practical RL
  6. Udacity Deep RL Nanodegree
  7. OpenAI Spinning Up RL research
  8. A (Long) Peek into Reinforcement Learning
  9. Introduction to Multi-Armed Bandits (Aleksandrs Slivkins)