Apache Spark is an open-source cluster computing framework originally developed at UC Berkeley's AMPLab in 2009. It is faster than Hadoop MapReduce for interactive queries and stream processing because it can cache intermediate data in memory rather than writing it to disk between steps. Spark exposes functional-style APIs in Java, Scala, Python and R, and provides functionality for SQL processing, streaming, machine learning and graph processing. RDDs (Resilient Distributed Datasets) are Spark's primary abstraction: fault-tolerant, immutable collections of records partitioned across the nodes of a cluster and processed in parallel.
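As a minimal sketch of the RDD model, the Scala snippet below creates an RDD from a local collection, caches it in memory, and runs two actions over the cached data. It assumes a local Spark installation with the Spark libraries on the classpath; the object name `RddExample` and the sample data are illustrative, not part of any official example.

```scala
import org.apache.spark.sql.SparkSession

object RddExample {
  def main(args: Array[String]): Unit = {
    // Start a local Spark session; "local[*]" uses all available cores.
    val spark = SparkSession.builder()
      .appName("rdd-example")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Create an RDD partitioned into 4 slices across the (local) cluster.
    val numbers = sc.parallelize(1 to 1000, numSlices = 4)

    // cache() keeps the transformed RDD in memory, so repeated
    // actions avoid recomputing it from scratch.
    val squares = numbers.map(n => n.toLong * n).cache()

    // Two actions reuse the same cached data.
    val total = squares.reduce(_ + _)
    val evens = squares.filter(_ % 2 == 0).count()

    println(s"sum of squares = $total, even squares = $evens")
    spark.stop()
  }
}
```

The `cache()` call illustrates the in-memory reuse mentioned above: without it, each action would re-evaluate the `map` transformation, which is where much of the speedup over disk-based MapReduce comes from.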