OpenAI Gym (IV)

Although the Raspberry Pi 3B+ can run Denny Britz's

reinforcement-learning/TD/

Model-Free Prediction & Control with Temporal Difference (TD) and Q-Learning

Learning Goals

  • Understand TD(0) for prediction
  • Understand SARSA for on-policy control
  • Understand Q-Learning for off-policy control (see the sketch after this list)
  • Understand the benefits of TD algorithms over MC and DP approaches
  • Understand how n-step methods unify MC and TD approaches
  • Understand the backward and forward view of TD-Lambda
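
To make the first three goals concrete, here is a minimal tabular Q-Learning sketch (off-policy TD control). It assumes the classic gym API (env.reset() returning a state, env.step() returning a 4-tuple) and the FrozenLake-v0 toy environment; the environment name and hyperparameters are illustrative, and newer gym/gymnasium releases use different names and signatures.

```python
import numpy as np
import gym

env = gym.make("FrozenLake-v0")          # any small discrete env works
Q = np.zeros((env.observation_space.n, env.action_space.n))
alpha, gamma, epsilon = 0.1, 0.99, 0.1   # step size, discount, exploration

for episode in range(5000):
    state = env.reset()
    done = False
    while not done:
        # epsilon-greedy behaviour policy
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(Q[state]))
        next_state, reward, done, _ = env.step(action)
        # off-policy TD(0) target: bootstrap from the greedy action in s';
        # SARSA would instead use the action the policy actually takes next
        td_target = reward + gamma * np.max(Q[next_state]) * (not done)
        Q[state, action] += alpha * (td_target - Q[state, action])
        state = next_state
```

The same loop with the TD target built from the next action actually taken gives SARSA (on-policy), and tracking V(s) under a fixed policy instead of Q(s, a) gives TD(0) prediction.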

 

Notebooks:

 

it cannot run

reinforcement-learning/DQN/

Deep Q-Learning

Learning Goals

  • Understand the Deep Q-Learning (DQN) algorithm
  • Understand why Experience Replay and a Target Network are necessary to make Deep Q-Learning work in practice (see the sketch after this list)
  • (Optional) Understand Double Deep Q-Learning
  • (Optional) Understand Prioritized Experience Replay
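
Why those two ingredients matter can be shown without a deep network at all. Below is a minimal sketch of an experience-replay buffer plus a periodically synced target network; a tiny linear model stands in for the Q-network, and the sizes, sync period, and randomly generated transitions are illustrative assumptions, not the notebook's actual hyperparameters.

```python
import random
from collections import deque
import numpy as np

class ReplayBuffer:
    """Fixed-size store of past transitions, sampled uniformly at random."""
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)   # oldest transitions fall off

    def push(self, transition):                # transition = (s, a, r, s', done)
        self.buffer.append(transition)

    def sample(self, batch_size):
        # uniform sampling breaks the correlation between consecutive steps
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

n_features, n_actions = 4, 2
online_w = np.zeros((n_features, n_actions))   # updated every step
target_w = online_w.copy()                     # frozen copy used for TD targets

buffer = ReplayBuffer(capacity=10_000)
gamma, alpha, sync_every, batch = 0.99, 0.01, 500, 32

for step in range(1, 2001):
    # placeholder for real environment interaction: push a random transition
    s, s2 = np.random.randn(n_features), np.random.randn(n_features)
    buffer.push((s, random.randrange(n_actions), random.random(), s2, False))

    if len(buffer) >= batch:
        for s, a, r, s2, done in buffer.sample(batch):
            # the target uses the *frozen* weights, so it does not chase itself
            td_target = r + gamma * np.max(s2 @ target_w) * (not done)
            td_error = td_target - (s @ online_w)[a]
            online_w[:, a] += alpha * td_error * s
    if step % sync_every == 0:
        target_w = online_w.copy()             # periodic hard sync
```

Without the buffer, successive updates come from highly correlated samples; without the frozen copy, the TD target moves with every weight update and learning can oscillate or diverge.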

 

Oh my!

………

On reflection, DQN easily needs several GB of memory; that is probably asking too much of the little Raspberry Pi ☺
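
A quick back-of-envelope check supports that. Assuming the standard Atari preprocessing used in the DQN notebook (84×84 grayscale uint8 frames stacked 4 deep, with each stored transition holding both a state stack and a next-state stack), the replay memory alone dwarfs the Pi 3B+'s 1 GB of RAM; the buffer sizes below are illustrative.

```python
frame = 84 * 84             # bytes per 84x84 grayscale uint8 frame
state = 4 * frame           # one 4-frame state stack
per_transition = 2 * state  # state + next_state (action/reward are negligible)

for buffer_size in (50_000, 500_000):
    print(f"{buffer_size:>7,} transitions ≈ "
          f"{buffer_size * per_transition / 1024**3:.1f} GB")
# ->  50,000 transitions ≈ 2.6 GB
# -> 500,000 transitions ≈ 26.3 GB
```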