This document describes an FPGA implementation of a discrete cosine transform (DCT) image compression algorithm. It begins with background on the DCT and its use in image compression, then reviews previous DCT implementations and their limitations. It proposes a new DCT algorithm and architecture that uses fewer multipliers and occupies less area on the FPGA, presents the 4-stage algorithm, and describes the architecture in detail. Simulation results on a test image show that the design achieves a high operating frequency of 171.185 MHz while occupying a small area on the FPGA.