Optimistic Initialization and Greediness Lead to Polynomial-Time Learning in Factored MDPs