Q-Learning and Dynamic Treatment Regimes. S.A. Murphy, Univ. of Michigan. IMS/Bernoulli: July 2004.
Nash Q-Learning for General-Sum Stochastic Games. Hu & Wellman. March 6th, 2006. CS286r. Presented by Ilan Lobel.