The K-means clustering algorithm partitions observations into K clusters by minimizing the distance between each observation and its cluster centroid. In its random-partition form, it works by randomly assigning observations to K clusters, computing each cluster's centroid, reassigning every observation to its closest centroid, recomputing the centroids, and repeating until the cluster assignments are stable. Common distance measures include Euclidean, squared Euclidean, and Manhattan distance. By grouping similar observations together based on feature similarity, the algorithm can reduce the size of codebooks for applications such as speech processing.
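The iterative procedure above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `kmeans`, the random-partition initialization, and the empty-cluster re-seeding rule are all assumptions made for the sketch.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Minimal K-means with random-partition initialization.

    X: (n, d) array of observations; k: number of clusters.
    Returns (labels, centroids).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Randomly assign each observation to one of the K clusters.
    labels = rng.integers(0, k, size=n)
    centroids = None
    for _ in range(n_iter):
        # Update step: each centroid is the mean of its assigned points.
        centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j)
            else X[rng.integers(0, n)]  # re-seed an empty cluster (assumption)
            for j in range(k)
        ])
        # Assignment step: squared Euclidean distance to every centroid.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):  # assignments stable -> done
            break
        labels = new_labels
    return labels, centroids

# Example: two well-separated 2-D blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(5, 0.3, (50, 2))])
labels, centroids = kmeans(X, k=2)
```

Squared Euclidean distance is used here because it gives the same nearest-centroid assignment as Euclidean distance while avoiding the square root; with Manhattan distance, the medoid or coordinate-wise median would replace the mean in the update step.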