Universal Approximation Theorem
Linguistic Essentials for NLP
Understanding deep learning requires rethinking generalization
The marginal value of adaptive gradient methods in machine learning