The document surveys first-order optimization algorithms for machine learning, covering plain gradient descent, momentum and accelerated methods such as the heavy-ball method and Nesterov's accelerated gradient, and non-monotone methods such as the Barzilai-Borwein scheme. For smooth convex objectives, gradient descent drives the suboptimality f(x_k) - f* to zero at an O(1/k) rate, while Nesterov's method improves this to O(1/k^2); on strongly convex problems, heavy-ball, conjugate gradient, and Nesterov's method achieve linear (geometric) convergence, with rates governed by the condition number. The document presents convergence analyses and rate results for these first-order methods applied to machine learning problems.
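
To make the rate gap concrete, here is a minimal sketch, not taken from the document itself: the quadratic test problem, variable names, iteration counts, and step-size choices are all illustrative assumptions. It compares plain gradient descent, Nesterov's accelerated gradient, and Barzilai-Borwein steps on a smooth convex quadratic f(x) = 0.5 x^T A x, where f* = 0 and the Lipschitz constant L of the gradient is the largest eigenvalue of A.

```python
# Minimal sketch (illustrative assumptions, not the document's code):
# f(x) = 0.5 * x^T A x, so f* = 0 and grad f(x) = A x.
import numpy as np

rng = np.random.default_rng(0)
n = 50
Q = np.linalg.qr(rng.standard_normal((n, n)))[0]
eigs = np.logspace(-3, 0, n)          # ill-conditioned spectrum in [1e-3, 1]
A = Q @ np.diag(eigs) @ Q.T
L = eigs.max()                        # Lipschitz constant of the gradient

f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x
x0 = rng.standard_normal(n)
iters = 500

# Plain gradient descent with step 1/L: O(1/k) suboptimality on smooth
# convex problems.
x = x0.copy()
for _ in range(iters):
    x = x - grad(x) / L
print(f"GD        f(x_k) - f* = {f(x):.3e}")

# Nesterov's accelerated gradient: O(1/k^2) on smooth convex problems.
x, y, t = x0.copy(), x0.copy(), 1.0
for _ in range(iters):
    x_next = y - grad(y) / L
    t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2
    y = x_next + ((t - 1) / t_next) * (x_next - x)
    x, t = x_next, t_next
print(f"Nesterov  f(x_k) - f* = {f(x):.3e}")

# Barzilai-Borwein: non-monotone step alpha_k = (s^T s) / (s^T y) with
# s = x_k - x_{k-1} and y = grad(x_k) - grad(x_{k-1}).
x_prev, x = x0.copy(), x0 - grad(x0) / L   # one GD step to initialize
for _ in range(iters):
    s, y_diff = x - x_prev, grad(x) - grad(x_prev)
    denom = s @ y_diff
    if denom < 1e-30:                      # essentially converged; avoid 0/0
        break
    x_prev, x = x, x - (s @ s) / denom * grad(x)
print(f"BB        f(x_k) - f* = {f(x):.3e}")
```

On a run like this, Nesterov's method typically reaches a far smaller suboptimality than plain gradient descent at the same iteration count, and the Barzilai-Borwein iterates, while non-monotone step to step, tend to converge much faster than the fixed 1/L step as well.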