https://chenrudan.github.io/blog/2016/06/06/reinforcementlearninglesssion1.html
Strongly recommended: http://karpathy.github.io/2016/05/31/rl/
http://www.cnblogs.com/mo-wang/p/4910855.html
UCL Course on RL
Example of computing the value function and the Q-function
http://vision.stanford.edu/teaching/cs231n/syllabus.html