This document proposes a low-power, high-speed 4-bit multiplier design based on the Dadda algorithm and optimized building blocks. The Dadda algorithm shortens the propagation delay by reducing the partial product tree from a maximum column height of 4 to a height of 2 in two reduction stages (4 → 3 → 2), leaving two rows for the final adder. Optimized 4T XOR gates and 14T full adders are used as building blocks to reduce power consumption. The design is implemented with the DSCH 2 and Microwind tools and compared against other multiplier architectures such as array and Wallace tree multipliers.
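The reduction scheme can be illustrated with a small bit-level model. The following is a minimal sketch in Python, not the proposed transistor-level design: the function names `dadda_heights` and `dadda_multiply` are hypothetical, the adders are modeled as plain Boolean expressions rather than 4T XOR or 14T full-adder cells, and the final carry-propagate adder is replaced by an ordinary weighted sum. It assumes the standard Dadda rule (a half adder when a column is one bit above the stage target, a full adder otherwise) and is intended for the 4-bit case.

```python
def dadda_heights(n):
    """Dadda target heights 2, 3, 4, 6, ... that are strictly below n."""
    d = [2]
    while d[-1] * 3 // 2 < n:
        d.append(d[-1] * 3 // 2)
    return d


def dadda_multiply(a, b, n=4):
    """Bit-level model of an n-bit Dadda reduction (illustrative sketch).

    Returns (product, stages, half_adders, full_adders).
    """
    # Column k holds the partial-product bits a_i & b_j with i + j == k.
    cols = [[((a >> i) & 1) & ((b >> (k - i)) & 1)
             for i in range(n) if 0 <= k - i < n]
            for k in range(2 * n)]
    half_adders = full_adders = 0
    targets = dadda_heights(n)

    for d in reversed(targets):              # e.g. 3, then 2 for n = 4
        # Bits produced in this stage are kept apart so that adders only
        # consume bits from the previous stage (no chaining within a stage).
        fresh = [[] for _ in range(2 * n)]
        for k in range(2 * n - 1):
            while len(cols[k]) + len(fresh[k]) > d:
                if len(cols[k]) + len(fresh[k]) == d + 1:
                    # Half adder: two bits in, sum and carry out.
                    x, y = cols[k].pop(), cols[k].pop()
                    s, c = x ^ y, x & y
                    half_adders += 1
                else:
                    # Full adder: three bits in, sum and carry out.
                    x, y, z = cols[k].pop(), cols[k].pop(), cols[k].pop()
                    s, c = x ^ y ^ z, (x & y) | (z & (x ^ y))
                    full_adders += 1
                fresh[k].append(s)           # sum stays in column k
                fresh[k + 1].append(c)       # carry moves to column k + 1
        cols = [old + new for old, new in zip(cols, fresh)]

    # Stand-in for the final two-row carry-propagate adder.
    product = sum(sum(col) << k for k, col in enumerate(cols))
    return product, len(targets), half_adders, full_adders


print(dadda_multiply(13, 11))  # (143, 2, 3, 3)
```

For the 4-bit case the model uses two reduction stages with three half adders and three full adders, which is why the choice of full-adder cell has a direct effect on the power and delay of the reduction tree.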