Apache Kafka is a high-throughput distributed messaging system that can be used for building real-time data pipelines and streaming apps. It provides a publish-subscribe messaging model and is designed as a distributed commit log. Kafka allows for both push and pull models where producers push data and consumers pull data from topics which are divided into partitions to allow for parallelism.