This document provides an overview of parallel and distributed computing using GPUs. It discusses GPU architecture: GPUs are designed for massively parallel processing, with hundreds or thousands of small, simple cores, whereas CPUs rely on a handful of large, complex cores. The document also covers the GPU memory hierarchy, programming GPUs with OpenCL, and key concepts such as work items, work groups, and occupancy (keeping the GPU's compute units supplied with enough work to stay busy).
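The relationship between work items and work groups can be sketched as follows. This is a minimal Python illustration of OpenCL's 1-D NDRange indexing, not real OpenCL code; the helper name `enumerate_work_items` is hypothetical. In an actual kernel, each work item would obtain these values from `get_global_id(0)`, `get_group_id(0)`, and `get_local_id(0)`.

```python
def enumerate_work_items(global_size: int, local_size: int):
    """Yield (group_id, local_id, global_id) for a 1-D NDRange.

    Hypothetical helper that mimics how OpenCL assigns each work item
    a global ID, a work-group ID, and a local ID within its group.
    """
    # Classic OpenCL requires the global size to be a multiple of the
    # local (work-group) size.
    assert global_size % local_size == 0
    for global_id in range(global_size):
        group_id = global_id // local_size   # which work group
        local_id = global_id % local_size    # position within the group
        # OpenCL invariant: global_id == group_id * local_size + local_id
        yield group_id, local_id, global_id

# Example: 8 work items split into work groups of 4.
layout = list(enumerate_work_items(8, 4))
```

Enumerating the layout this way makes the invariant explicit: every work item's global ID decomposes into its group ID times the group size plus its local ID, which is exactly how an OpenCL kernel recovers its position in the problem domain.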