This document discusses bandit algorithms and their applications. It begins with an overview of the multi-armed bandit problem and explores several algorithms for solving it, including the naive, epsilon-greedy, and Thompson sampling approaches. Thompson sampling is shown to achieve logarithmic regret, in contrast to the linear regret of the other algorithms. The document then turns to contextual bandits, which incorporate side information (context) so the algorithm can select the best arm for each context, and to adversarial contextual bandits, in which payoffs may change over time. Real-world applications of bandits include medical treatment selection, testing, optimization, and recommendation systems. Contextual bandits can be implemented in practice using APIs, databases, and hosted services, and bandit algorithms can solve many problems that can be modeled as games.
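To make the regret comparison concrete, here is a minimal sketch (not taken from the document) of Thompson sampling for a Bernoulli multi-armed bandit with Beta posteriors; the arm payoff probabilities, horizon, and function names are illustrative assumptions.

```python
import random

def thompson_sampling(true_probs, horizon=10_000, seed=0):
    """Thompson sampling for a Bernoulli bandit with Beta(1, 1) priors."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [1] * n_arms  # Beta alpha parameters (prior = Beta(1, 1))
    failures = [1] * n_arms   # Beta beta parameters
    total_reward = 0
    for _ in range(horizon):
        # Sample a plausible payoff rate for each arm from its posterior,
        # then play the arm whose sample is highest.
        samples = [rng.betavariate(successes[a], failures[a]) for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: samples[a])
        # Observe a Bernoulli reward and update that arm's posterior.
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        total_reward += reward
    return total_reward

if __name__ == "__main__":
    probs = [0.3, 0.5, 0.7]  # hypothetical arm payoff probabilities
    horizon = 10_000
    reward = thompson_sampling(probs, horizon)
    print(f"reward: {reward}, regret ~ {max(probs) * horizon - reward:.0f}")
```

Because the posterior for a clearly inferior arm concentrates away from the best arm's, the algorithm samples bad arms less and less often, which is the intuition behind its logarithmic (rather than linear) regret.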