Difference between revisions of "BSTE:ReinforcmentLearning"
R.sirgalina (talk | contribs) Tag: Manual revert |
Revision as of 18:09, 19 November 2021
Reinforcement Learning
- Course name: Reinforcement Learning
Course Characteristics
Key concepts of the class
- Fundamentals of Reinforcement Learning
- Sample-based Learning Methods
- Prediction and Control with Function Approximation
What is the purpose of this course?
Harnessing the full potential of artificial intelligence requires adaptive learning systems. Reinforcement learning (RL) is a powerful paradigm for building such systems, and it applies to a wide range of tasks, including robotics, game playing, consumer modeling, and healthcare.
Course objectives based on Bloom’s taxonomy
- What should a student remember at the end of the course?
By the end of the course, the students should be able to recall and define:
- Markov Decision Processes
- Exploration vs. Exploitation
- Value Functions
- Temporal-difference Learning
- Q-learning
- Expected Sarsa
- Actor-Critic
- What should a student be able to understand at the end of the course?
By the end of the course, the students should be able to explain:
- How to build an RL system for sequential decision making
- How to formalize a task as an RL problem
- The space of RL algorithms
- What should a student be able to apply at the end of the course?
By the end of the course, the students should be able to apply:
- RL for solving real-world problems
- TD-algorithms for estimating value functions
- Expected Sarsa and Q-Learning
- Actor-Critic Method
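As a rough illustration of how two of the listed methods differ, the sketch below computes the one-step TD target for Q-learning and for Expected Sarsa on a tiny hand-written action-value table. It is not taken from the course materials: the function names, the epsilon-greedy target policy, the table values, and the hyperparameters are all assumptions chosen for illustration, and terminal states are ignored.

```python
# Minimal sketch (illustrative, not course material): one-step TD targets
# for Q-learning and Expected Sarsa on a tabular action-value function.
# All names and numbers below are hypothetical.

def epsilon_greedy_probs(q_values, epsilon):
    """Action probabilities of an epsilon-greedy policy for one state."""
    n = len(q_values)
    greedy = max(range(n), key=lambda a: q_values[a])
    probs = [epsilon / n] * n          # exploration mass spread uniformly
    probs[greedy] += 1.0 - epsilon     # remaining mass on the greedy action
    return probs

def q_learning_target(q, reward, next_state, gamma):
    """Q-learning bootstraps from the maximum next-state action value."""
    return reward + gamma * max(q[next_state])

def expected_sarsa_target(q, reward, next_state, gamma, epsilon):
    """Expected Sarsa bootstraps from the policy-weighted average value."""
    probs = epsilon_greedy_probs(q[next_state], epsilon)
    expected = sum(p * v for p, v in zip(probs, q[next_state]))
    return reward + gamma * expected

def td_update(q, state, action, target, alpha):
    """Move Q(state, action) a fraction alpha toward the TD target."""
    q[state][action] += alpha * (target - q[state][action])

# Two states, two actions; q[s][a] is the current estimate.
q = [[0.0, 0.0], [1.0, 3.0]]
gamma, alpha, epsilon = 0.9, 0.5, 0.2

ql = q_learning_target(q, reward=1.0, next_state=1, gamma=gamma)
es = expected_sarsa_target(q, reward=1.0, next_state=1, gamma=gamma, epsilon=epsilon)
td_update(q, state=0, action=0, target=ql, alpha=alpha)
```

With these assumed numbers the Q-learning target is higher than the Expected Sarsa target, because the former uses only the best next action while the latter also weights the exploratory action. With epsilon set to 0 the two targets coincide.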
Course evaluation
| Type | Points |
|---|---|
| Labs/seminar classes | 20 |
| Interim performance assessment | 50 |
| Exams | 30 |
Grades range
| Grade | Points |
|---|---|
| A. Excellent | [85, 100] |
| B. Good | [70, 84] |
| C. Satisfactory | [55, 69] |
| D. Poor | [0, 54] |