Hierarchical Clustering in Data Mining
Last Updated :
12 Dec, 2023
A hierarchical clustering method works by grouping data into a tree of clusters. Hierarchical clustering begins by treating every data point as a separate cluster. Then, it repeatedly executes the following steps:
- Identify the two clusters that are closest together, and
- Merge these two most similar clusters.
These steps are repeated until all the clusters are merged together.
In hierarchical clustering, the aim is to produce a hierarchical series of nested clusters. A diagram called a dendrogram (a tree-like diagram that records the sequence of merges or splits) graphically represents this hierarchy. It is an inverted tree that describes the order in which points are merged (bottom-up view) or clusters are split apart (top-down view).
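As a quick sketch of the merge-then-draw workflow, the snippet below builds a linkage matrix and recovers the dendrogram structure with SciPy. It assumes scipy and numpy are installed; the toy 1-D points are made up for illustration, and `no_plot=True` returns the tree layout without drawing it.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

X = np.array([[0.0], [1.0], [5.0], [6.0], [12.0]])  # toy 1-D points (assumed)
Z = linkage(X, method="single")  # each row: (cluster_i, cluster_j, distance, size)
tree = dendrogram(Z, no_plot=True)
print(tree["ivl"])  # leaf labels in their order along the bottom of the dendrogram
```

With n points, `linkage` always produces n - 1 merge rows, one per internal node of the dendrogram.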
What is Hierarchical Clustering?
Hierarchical clustering is a method of cluster analysis in data mining that creates a hierarchical representation of the clusters in a dataset. The method starts by treating each data point as a separate cluster and then iteratively combines the closest clusters until a stopping criterion is reached. The result of hierarchical clustering is a tree-like structure, called a dendrogram, which illustrates the hierarchical relationships among the clusters.
Hierarchical clustering has several advantages over other clustering methods:
- The ability to handle non-convex clusters and clusters of different sizes and densities.
- The ability to handle missing data and noisy data.
- The ability to reveal the hierarchical structure of the data, which can be useful for understanding the relationships among the clusters.
Drawbacks of Hierarchical Clustering
- The need for a criterion to stop the clustering process and determine the final number of clusters.
- The computational cost and memory requirements of the method can be high, especially for large datasets.
- The results can be sensitive to the initial conditions, linkage criterion, and distance metric used.
In summary, hierarchical clustering is a data mining method that groups similar data points by building a hierarchical structure of clusters. It can handle different types of data and reveal the relationships among the clusters. However, its computational cost can be high, and the results can be sensitive to the chosen linkage criterion and distance metric.
Types of Hierarchical Clustering
Basically, there are two types of hierarchical Clustering:
- Agglomerative Clustering
- Divisive clustering
1. Agglomerative Clustering
Initially, consider every data point as an individual cluster; at every step, merge the nearest pair of clusters (this is a bottom-up method). At each iteration, clusters are merged with other clusters until one single cluster remains.
The algorithm for agglomerative hierarchical clustering is:
1. Consider every data point as an individual cluster.
2. Calculate the similarity of each cluster with all the other clusters (build the proximity matrix).
3. Merge the two clusters that are most similar (closest) to each other.
4. Recalculate the proximity matrix for the new set of clusters.
5. Repeat steps 3 and 4 until only a single cluster remains.
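The steps above can be sketched in plain Python. This is a minimal single-linkage sketch, not a production implementation: the sample points and the choice of Euclidean distance are assumptions, and proximities are recomputed from member points on each pass rather than cached in a matrix.

```python
import math
from itertools import combinations

def agglomerative(points, target_clusters=1):
    # Step 1: every point starts as its own cluster
    clusters = [[p] for p in points]
    while len(clusters) > target_clusters:
        # Steps 2-3: find the closest pair of clusters (single linkage:
        # minimum Euclidean distance between any two member points)
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: min(
                math.dist(a, b)
                for a in clusters[ij[0]] for b in clusters[ij[1]]
            ),
        )
        # Step 4: merge the closest pair; proximities are recomputed
        # from the member points on the next iteration
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

pts = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
print(agglomerative(pts, target_clusters=3))
# → [[(0, 0), (0, 1)], [(5, 5), (5, 6)], [(10, 0)]]
```

Stopping at `target_clusters=1` would run the loop to completion and yield the single root cluster at the top of the dendrogram.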
Let's see the graphical representation of this algorithm using a dendrogram.
Note: This is just a demonstration of how the algorithm works; no calculations have been performed below, and all proximities among the clusters are assumed.
Let's say we have six data points A, B, C, D, E, and F.
- Step-1: Consider each letter as a single cluster and calculate the distance of each cluster from all the other clusters.
- Step-2: Comparable clusters are merged to form a single cluster. Say cluster (B) and cluster (C) are very similar to each other, so we merge them; similarly for clusters (D) and (E). We are left with the clusters [(A), (BC), (DE), (F)].
- Step-3: We recalculate the proximity according to the algorithm and merge the two nearest clusters, (DE) and (F), to form the new set of clusters [(A), (BC), (DEF)].
- Step-4: Repeating the same process, the clusters BC and DEF are now the closest and are merged to form a new cluster. We are left with [(A), (BCDEF)].
- Step-5: At last, the two remaining clusters are merged together to form a single cluster [(ABCDEF)].
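The five steps above can be traced in plain Python. The 1-D coordinates below are assumed purely so that single-linkage merging reproduces the merge order described in the walkthrough; they are not part of the original example.

```python
from itertools import combinations

# Assumed coordinates chosen so B-C and D-E are the closest pairs
coords = {"A": 0.0, "B": 10.0, "C": 11.0, "D": 20.0, "E": 21.0, "F": 23.0}
clusters = [{k} for k in "ABCDEF"]  # Step-1: every point is its own cluster

def gap(c1, c2):
    # single linkage: distance between the closest pair of member points
    return min(abs(coords[a] - coords[b]) for a in c1 for b in c2)

history = []
while len(clusters) > 1:
    i, j = min(combinations(range(len(clusters)), 2),
               key=lambda ij: gap(clusters[ij[0]], clusters[ij[1]]))
    clusters[i] |= clusters[j]  # merge the closest pair of clusters
    del clusters[j]
    history.append(sorted("".join(sorted(c)) for c in clusters))
    print(history[-1])
# prints, in order:
# ['A', 'BC', 'D', 'E', 'F']
# ['A', 'BC', 'DE', 'F']
# ['A', 'BC', 'DEF']
# ['A', 'BCDEF']
# ['ABCDEF']
```

Each printed line corresponds to one of Steps 2 through 5 in the walkthrough.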
2. Divisive Hierarchical clustering
Divisive hierarchical clustering is precisely the opposite of agglomerative hierarchical clustering. We start by treating all of the data points as a single cluster, and in every iteration we split off the data points that are least similar to the rest. In the end, we are left with N clusters, one per data point.
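A simplified top-down sketch is shown below: at each step it splits the cluster with the largest diameter, using that cluster's two most distant points as seeds. Real divisive algorithms such as DIANA are more refined; this version, along with its sample points, is purely illustrative.

```python
import math
from itertools import combinations

def diameter_pair(cluster):
    # the two most distant points in a cluster
    return max(combinations(cluster, 2), key=lambda pq: math.dist(*pq))

def divisive(points, k):
    clusters = [list(points)]  # start with one all-inclusive cluster
    while len(clusters) < k:
        # pick the cluster with the largest diameter to split next
        idx = max(range(len(clusters)),
                  key=lambda i: (math.dist(*diameter_pair(clusters[i]))
                                 if len(clusters[i]) > 1 else 0.0))
        target = clusters.pop(idx)
        s1, s2 = diameter_pair(target)
        # assign each member to whichever seed point is nearer
        part1 = [p for p in target if math.dist(p, s1) <= math.dist(p, s2)]
        part2 = [p for p in target if math.dist(p, s1) > math.dist(p, s2)]
        clusters += [part1, part2]
    return clusters

pts = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
print(divisive(pts, k=3))
```

Running the loop all the way to `k = len(points)` yields the N singleton clusters mentioned above.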