Apache Spark is increasingly adopted as an alternate processing framework to MapReduce, due to its ability to speed up batch, interactive and streaming analytics. Spark enables new analytics use cases like machine learning and graph analysis with its rich and easy to use programming libraries. And, it offers the flexibility to run analytics on data stored in Hadoop, across data across object stores and within traditional databases. This makes Spark an ideal platform for accelerating cross-platform analytics on-premises and in the cloud. Building on the success of Spark 1.x release, Spark 2.x delivers major improvements in the areas of API, Performance, and Structured Streaming. In this paper, we will cover a high-level view of the Apache Spark framework, and then focus on what we consider to be very important improvements made in Apache Spark 2.x. We will then share the results of a real-world benchmark effort and share details on Spark and environment configuration changes made to our lab, discuss the results of the benchmark, and provide a reference architecture example for those interested in taking Spark 2.x for their own test drive. This presentation stresses the value of refreshing the Spark 1 with Spark 2 as performance testing results show 2.3x improvement with SparkSQL workloads similar to TPC Benchmark™ DS (TPC-DS). MARK LOCHBIHLER, Principal Architect, Hortonworks and VIPLAVA MADASU, Big Data Systems Engineer, Hewlett Packard Enterprise