This document discusses scaling out logistic regression with Apache Spark. It describes the need to classify a large number of websites using machine learning. Several approaches to logistic regression were tried, starting with a single-machine Java implementation and then moving to Spark for better scalability. Spark's L-BFGS-based implementation was chosen because it provides distributed logistic regression out of the box. The document also discusses challenges of implementing logistic regression at large scale, chiefly overfitting and the choice of regularization. The methods used to address them are L2 regularization, cross-validation to select the regularization parameter, and extensions made to Spark's L-BFGS implementation.
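To make the regularization and cross-validation steps concrete, here is a minimal single-machine sketch in plain Python. It is not the Spark implementation the document refers to. Plain gradient descent stands in for L-BFGS, the data is synthetic, and every name (`train`, `log_loss`, `cross_validate`, `make_data`), the candidate penalty values, and the fold count are assumptions made for illustration only.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train(xs, ys, lam, steps=200, lr=0.5):
    """Minimize mean log-loss + (lam / 2) * ||w||^2 by gradient descent.

    Gradient descent is a stand-in here; the document's setup uses L-BFGS.
    For simplicity the penalty also covers the intercept weight.
    """
    n, d = len(xs), len(xs[0])
    w = [0.0] * d
    for _ in range(steps):
        grad = [lam * wj for wj in w]  # gradient of the L2 penalty
        for x, y in zip(xs, ys):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x))) - y
            for j in range(d):
                grad[j] += err * x[j] / n  # gradient of the mean log-loss
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def log_loss(w, xs, ys):
    """Mean negative log-likelihood of labels ys under weights w."""
    eps = 1e-12
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(xs)


def cross_validate(xs, ys, lams, k=5):
    """Return (best_lam, scores), where scores maps each candidate
    penalty to its mean held-out log-loss over k folds."""
    scores = {}
    for lam in lams:
        fold_losses = []
        for f in range(k):
            # Fold f holds out every k-th example, starting at index f.
            train_x = [x for i, x in enumerate(xs) if i % k != f]
            train_y = [y for i, y in enumerate(ys) if i % k != f]
            test_x = [x for i, x in enumerate(xs) if i % k == f]
            test_y = [y for i, y in enumerate(ys) if i % k == f]
            w = train(train_x, train_y, lam)
            fold_losses.append(log_loss(w, test_x, test_y))
        scores[lam] = sum(fold_losses) / k
    best = min(scores, key=scores.get)
    return best, scores


def make_data(n=200, seed=0):
    """Synthetic data: intercept, two informative features, one noise feature."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x1, x2, noise = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        p = sigmoid(2.0 * x1 - 1.0 * x2)
        ys.append(1 if rng.random() < p else 0)
        xs.append([1.0, x1, x2, noise])
    return xs, ys


xs, ys = make_data()
candidates = [0.0, 0.01, 0.1, 1.0]  # hypothetical grid of L2 penalties
best_lam, scores = cross_validate(xs, ys, candidates)
print("held-out log-loss per penalty:", scores)
print("selected penalty:", best_lam)
```

The selection logic does not depend on the optimizer: in a distributed setting, `train` would presumably be replaced by a call to the cluster-side solver, while the loop over candidate penalties and folds stays the same.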