Artificial neural networks (ANNs) are among the most popular machine learning models, in particular for deep learning. The models used in practice for image classification and speech recognition contain huge numbers of weights and are trained on large datasets. Training such models is challenging in terms of both computation and data processing. We propose a scalable implementation of deep neural networks for Spark. We address the computational challenge with batch operations, using BLAS for vector and matrix computations and reusing memory to reduce garbage collector activity. Spark provides the data parallelism that lets training scale out. As a result, our implementation is on par with widely used C++ implementations such as Caffe on a single machine and scales well on a cluster. The developed API makes it easy to configure your own network and to run experiments with different hyperparameters. Our implementation is easily extensible, and we invite other developers to contribute new types of neural network functions and layers. The optimizations we applied and our experience with GPU CUDA BLAS might also be useful for other machine learning algorithms being developed for Spark.

The slides were presented at the Spark SF Friends meetup on December 2, 2015, organized by Alex Khrabrov @Nitro. The content is based on my talk at Spark Summit Europe, with a few major updates: an updated and more detailed parallelism heuristic, experiments with a larger cluster, and a new slide design.
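To make the batching and memory-reuse ideas above concrete, here is a minimal sketch, not the actual implementation: an affine layer that stacks a batch of examples into one matrix, so a single BLAS gemm call replaces many matrix-vector products, and that writes into a preallocated output buffer instead of allocating on every call. The class and parameter names are illustrative; the netlib-java bindings are the ones Spark MLlib uses for native BLAS.

```scala
import com.github.fommil.netlib.BLAS

// Illustrative sketch, not the actual implementation: a fully connected
// layer whose forward pass is one BLAS matrix-matrix multiply over a
// batch of stacked inputs.
class AffineLayer(numIn: Int, numOut: Int, batchSize: Int) {
  private val blas = BLAS.getInstance()

  // Column-major (numOut x numIn) weight matrix with small random values.
  private val weights =
    Array.fill(numOut * numIn)(scala.util.Random.nextGaussian() * 0.01)

  // Output buffer allocated once and reused on every forward pass,
  // so repeated calls create no garbage for the collector.
  private val output = new Array[Double](numOut * batchSize)

  /** input: column-major (numIn x batchSize) matrix of stacked examples. */
  def forward(input: Array[Double]): Array[Double] = {
    // output := 1.0 * weights * input + 0.0 * output
    blas.dgemm("N", "N", numOut, batchSize, numIn,
      1.0, weights, numOut, input, numIn, 0.0, output, numOut)
    output
  }
}
```

Stacking batchSize examples means the weight matrix is read once per batch rather than once per example, which is why a tuned gemm is so much faster than a loop of gemv calls.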
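As for the configuration API, the following is a rough sketch of setting up and training a network, assuming an interface along the lines of Spark ML's MultilayerPerceptronClassifier; the layer sizes, block size, iteration count, and data path are placeholder values, not the settings used in the experiments.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.ml.classification.MultilayerPerceptronClassifier
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.sql.SQLContext

object MlpExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("mlp-example"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // MNIST in libsvm format; the path is a placeholder.
    val train = MLUtils.loadLibSVMFile(sc, "data/mnist.scale").toDF()

    // 784 inputs (28x28 pixels), one hidden layer, 10 output classes.
    val trainer = new MultilayerPerceptronClassifier()
      .setLayers(Array(784, 300, 10))
      .setBlockSize(128) // examples stacked per batch for BLAS-friendly updates
      .setMaxIter(100)
      .setSeed(1234L)

    val model = trainer.fit(train)
    sc.stop()
  }
}
```

Trying a different network shape or hyperparameter setting is then just a matter of editing the setters, which is what makes running experiments with different configurations cheap.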