SlideShare a Scribd company logo
The K-means Clustering
Algorithm 1
K-means is a method of clustering observations into a specific number of disjoint clusters.
The ”K” refers to the number of clusters specified. Various distance measures exist to deter-
mine which observation is to be appended to which cluster. The algorithm aims at minimiz-
ing the measure between the centroide of the cluster and the given observation by iteratively
appending an observation to any cluster and terminate when the lowest distance measure
is achieved.
1.1 Overview Of Algorithm
1. The sample space is intially partitioned into K clusters and the observations are ran-
domly assigned to the clusters.
2. For each sample:
• Calculate the distance from the observation to the centroide of the cluster.
• IF the sample is closest to its own cluster THEN leave it ELSE select another
cluster.
3. Repeat steps 1 and 2 untill no observations are moved from one cluster to another
When step 3 terminates the clusters are stable and each sample is assigned a cluster which
results in the lowest possible distance to the centriode of the cluster.
Common distance measures include the Euclidean distance, the Euclidean squared distance
and the Manhattan or City distance.
The Euclidean measure corresponds to the shortest geometric distance between to points.
d =
N
∑
i=1
(xi −yi)2
1
(19.1)
19.2 Distance measures
http://bit.ly/2Mub6xP
19.1
A faster way of determining the distance is by use of the squared Euclidean distance which
calculates the above distance squared, i.e.
dsq =
N
∑
i=1
(xi −yi)2
The Manhattan measure calculates a distance between points based on a grid and is illus-
Euclidean measure Manhattan measure
For applications in speech processing the squared Euclidean distance is widely used.
K-means can be used to cluster the extracted features from speech signals. The extracted
features from the signal include for instance mel frequency cepstral coefficients or line spec-
trum pairs. This allows speech signals with similar spectral characteristics to be positioned
into the same position in the codebook. In this way similar narrow band signals will be
predicted likewise thereby limiting the size of the codebook.
The following figures illustrate the K-means algoritm on a 2-dimensional data set.
2
Figure 19.1:Comparision between theEuclideanandtheManhattan measure.
CHAPTER 19. THE K-MEANS CLUSTERING ALGORITHM
(19.2)
trated in Figure 19.1.
19.3 Application of K-means
19.4 Example of K-means Clustering
http://bit.ly/2Mub6xP
−20 −15 −10 −5 0 5 10 15 20
−20
−15
−10
−5
0
5
10
15
20
−20 −15 −10 −5 0 5 10 15 20
−20
−15
−10
−5
0
5
10
15
20
marked with a cross.
3
Figure 19.2:Exampleof signal datamadefrom GaussianWhiteNoise.
Figure 19.3:Thesignal dataareseperated intoseven clusters.Thecentroidsare
19.4. EXAMPLE OF K-MEANS CLUSTERING
http://bit.ly/2Mub6xP
0 0.2 0.4 0.6 0.8 1
1
2
3
4
5
6
7
Silhouette Value
Cluster
into the seven clusters. If the distance from one point to two centroids is the
same, it means the point could belong to both centroids. The result is a conflict
which gives a negative value in the Silhouette diagram. The positive part of
the Silhuoette diagram, shows that there is a clear seperation of the points
between the clusters.
4
Figure 19.4:TheSilhouettediagramshowshow well thedataareseperated
CHAPTER 19. THE K-MEANS CLUSTERING ALGORITHM
http://bit.ly/2Mub6xP
1 close a l l
2 clear a l l
3 clc
4
5 Limit = 2 0 ;
6
7 X = [ 1 0 ∗ randn (400 ,2) ; 1 0 ∗ randn (400 ,2) ] ;
8 plot (X ( : , 1 ) ,X ( : , 2 ) , ’k . ’ )
9 length (X ( : , 1 ) )
10 figure
11 %i =1;
12 k=1;
13 for i =1: length (X ( : , 1 ) )
14 i f ( sqrt (X( i , 1 ) ^2+X( i , 2 ) ^2) ) > Limit ;
15 X( i , 1 ) =0;
16 X( i , 2 ) =0;
17 else
18 Y( k , 1 ) =X( i , 1 ) ;
19 Y( k , 2 ) =X( i , 2 ) ;
20 k=k+1;
21 end
22 end
23 plot (Y ( : , 1 ) ,Y ( : , 2 ) , ’k . ’ )
24 figure
25
26 [ cidx , c t r s ] = kmeans (Y , 7 , ’ d i s t ’ , ’ sqEuclidean ’ , ’ rep ’ ,5 , ’ disp ’ , ’
f i n a l ’ , ’ EmptyAction ’ , ’ singleton ’ ) ;
27
28 plot (Y( cidx ==1 ,1) ,Y( cidx ==1 ,2) , ’ r . ’ , . . .
29 Y( cidx ==2 ,1) ,Y( cidx ==2 ,2) , ’b . ’ , c t r s ( : , 1 ) , c t r s ( : , 2 ) , ’ kx ’ ) ;
30
31 hold on
32 plot (Y( cidx ==3 ,1) ,Y( cidx ==3 ,2) , ’y . ’ ,Y( cidx ==4 ,1) ,Y( cidx ==4 ,2) , ’g . ’ )
;
33
34 hold on
35 plot (Y( cidx ==5 ,1) ,Y( cidx ==5 ,2) , ’ c . ’ ,Y( cidx ==6 ,1) ,Y( cidx ==6 ,2) , ’m. ’ )
;
36
37 hold on
38 plot (Y( cidx ==7 ,1) ,Y( cidx ==7 ,2) , ’k . ’ ) ;
39
40 figure
5
19.5. MATLAB SOURCE CODE
19.5 Matlab Source Code
http://bit.ly/2Mub6xP
41 [ silk , h]= s i l h o u e t t e (Y, cidx , ’ sqEuclidean ’ ) ;
42 mean( s i l k )
6
CHAPTER 19. THE K-MEANS CLUSTERING ALGORITHM
Checkout: http://bit.ly/2Mub6xP
Data Science Course Content:
Ad

More Related Content

What's hot (20)

Daa unit 1
Daa unit 1Daa unit 1
Daa unit 1
Abhimanyu Mishra
 
Particle swarm optimization
Particle swarm optimizationParticle swarm optimization
Particle swarm optimization
Hanya Mohammed
 
Dynamic Programming
Dynamic ProgrammingDynamic Programming
Dynamic Programming
Sahil Kumar
 
Introduction of VAE
Introduction of VAEIntroduction of VAE
Introduction of VAE
Ken'ichi Matsui
 
Turing Machine
Turing MachineTuring Machine
Turing Machine
AyAn KhAn
 
Dynamic programming
Dynamic programmingDynamic programming
Dynamic programming
Amit Kumar Rathi
 
Ant colony optimization
Ant colony optimizationAnt colony optimization
Ant colony optimization
Joy Dutta
 
8 queen problem
8 queen problem8 queen problem
8 queen problem
NagajothiN1
 
0/1 DYNAMIC PROGRAMMING KNAPSACK PROBLEM
0/1 DYNAMIC PROGRAMMING KNAPSACK PROBLEM0/1 DYNAMIC PROGRAMMING KNAPSACK PROBLEM
0/1 DYNAMIC PROGRAMMING KNAPSACK PROBLEM
Mrunal Patil
 
Circle & curve clipping algorithm
Circle & curve clipping algorithmCircle & curve clipping algorithm
Circle & curve clipping algorithm
Mohamed El-Serngawy
 
Automata theory -- NFA and DFA construction
Automata theory -- NFA and DFA  constructionAutomata theory -- NFA and DFA  construction
Automata theory -- NFA and DFA construction
Akila Krishnamoorthy
 
Fuzzy Clustering(C-means, K-means)
Fuzzy Clustering(C-means, K-means)Fuzzy Clustering(C-means, K-means)
Fuzzy Clustering(C-means, K-means)
UMBC
 
連淡水阿嬤都聽得懂的 機器學習入門 scikit-learn
連淡水阿嬤都聽得懂的機器學習入門 scikit-learn 連淡水阿嬤都聽得懂的機器學習入門 scikit-learn
連淡水阿嬤都聽得懂的 機器學習入門 scikit-learn
Cicilia Lee
 
Bron Kerbosch Algorithm - Presentation by Jun Zhai, Tianhang Qiang and Yizhen...
Bron Kerbosch Algorithm - Presentation by Jun Zhai, Tianhang Qiang and Yizhen...Bron Kerbosch Algorithm - Presentation by Jun Zhai, Tianhang Qiang and Yizhen...
Bron Kerbosch Algorithm - Presentation by Jun Zhai, Tianhang Qiang and Yizhen...
Amrinder Arora
 
Convolution Neural Network (CNN)
Convolution Neural Network (CNN)Convolution Neural Network (CNN)
Convolution Neural Network (CNN)
Suraj Aavula
 
Machine Learning - Object Detection and Classification
Machine Learning - Object Detection and ClassificationMachine Learning - Object Detection and Classification
Machine Learning - Object Detection and Classification
Vikas Jain
 
Encoding(sc)
Encoding(sc)Encoding(sc)
Encoding(sc)
GowriLatha1
 
Fuzzy Set
Fuzzy SetFuzzy Set
Fuzzy Set
Ehsan Hamzei
 
Mini Projects for Computer Science Engineering Students.pdf
Mini Projects for Computer Science Engineering Students.pdfMini Projects for Computer Science Engineering Students.pdf
Mini Projects for Computer Science Engineering Students.pdf
jagan477830
 
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Simplilearn
 
Particle swarm optimization
Particle swarm optimizationParticle swarm optimization
Particle swarm optimization
Hanya Mohammed
 
Dynamic Programming
Dynamic ProgrammingDynamic Programming
Dynamic Programming
Sahil Kumar
 
Turing Machine
Turing MachineTuring Machine
Turing Machine
AyAn KhAn
 
Ant colony optimization
Ant colony optimizationAnt colony optimization
Ant colony optimization
Joy Dutta
 
0/1 DYNAMIC PROGRAMMING KNAPSACK PROBLEM
0/1 DYNAMIC PROGRAMMING KNAPSACK PROBLEM0/1 DYNAMIC PROGRAMMING KNAPSACK PROBLEM
0/1 DYNAMIC PROGRAMMING KNAPSACK PROBLEM
Mrunal Patil
 
Circle & curve clipping algorithm
Circle & curve clipping algorithmCircle & curve clipping algorithm
Circle & curve clipping algorithm
Mohamed El-Serngawy
 
Automata theory -- NFA and DFA construction
Automata theory -- NFA and DFA  constructionAutomata theory -- NFA and DFA  construction
Automata theory -- NFA and DFA construction
Akila Krishnamoorthy
 
Fuzzy Clustering(C-means, K-means)
Fuzzy Clustering(C-means, K-means)Fuzzy Clustering(C-means, K-means)
Fuzzy Clustering(C-means, K-means)
UMBC
 
連淡水阿嬤都聽得懂的 機器學習入門 scikit-learn
連淡水阿嬤都聽得懂的機器學習入門 scikit-learn 連淡水阿嬤都聽得懂的機器學習入門 scikit-learn
連淡水阿嬤都聽得懂的 機器學習入門 scikit-learn
Cicilia Lee
 
Bron Kerbosch Algorithm - Presentation by Jun Zhai, Tianhang Qiang and Yizhen...
Bron Kerbosch Algorithm - Presentation by Jun Zhai, Tianhang Qiang and Yizhen...Bron Kerbosch Algorithm - Presentation by Jun Zhai, Tianhang Qiang and Yizhen...
Bron Kerbosch Algorithm - Presentation by Jun Zhai, Tianhang Qiang and Yizhen...
Amrinder Arora
 
Convolution Neural Network (CNN)
Convolution Neural Network (CNN)Convolution Neural Network (CNN)
Convolution Neural Network (CNN)
Suraj Aavula
 
Machine Learning - Object Detection and Classification
Machine Learning - Object Detection and ClassificationMachine Learning - Object Detection and Classification
Machine Learning - Object Detection and Classification
Vikas Jain
 
Mini Projects for Computer Science Engineering Students.pdf
Mini Projects for Computer Science Engineering Students.pdfMini Projects for Computer Science Engineering Students.pdf
Mini Projects for Computer Science Engineering Students.pdf
jagan477830
 
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Simplilearn
 

Similar to K-means Clustering Algorithm with Matlab Source code (20)

On the Odd Gracefulness of Cyclic Snakes With Pendant Edges
On the Odd Gracefulness of Cyclic Snakes With Pendant EdgesOn the Odd Gracefulness of Cyclic Snakes With Pendant Edges
On the Odd Gracefulness of Cyclic Snakes With Pendant Edges
GiselleginaGloria
 
Hierarchical matrix approximation of large covariance matrices
Hierarchical matrix approximation of large covariance matricesHierarchical matrix approximation of large covariance matrices
Hierarchical matrix approximation of large covariance matrices
Alexander Litvinenko
 
K means ALGORITHM IN MACHINE LEARNING.pptx
K means ALGORITHM IN MACHINE LEARNING.pptxK means ALGORITHM IN MACHINE LEARNING.pptx
K means ALGORITHM IN MACHINE LEARNING.pptx
angelinjeba6
 
DISTANCE TWO LABELING FOR MULTI-STOREY GRAPHS
DISTANCE TWO LABELING FOR MULTI-STOREY GRAPHSDISTANCE TWO LABELING FOR MULTI-STOREY GRAPHS
DISTANCE TWO LABELING FOR MULTI-STOREY GRAPHS
graphhoc
 
ALEXANDER FRACTIONAL INTEGRAL FILTERING OF WAVELET COEFFICIENTS FOR IMAGE DEN...
ALEXANDER FRACTIONAL INTEGRAL FILTERING OF WAVELET COEFFICIENTS FOR IMAGE DEN...ALEXANDER FRACTIONAL INTEGRAL FILTERING OF WAVELET COEFFICIENTS FOR IMAGE DEN...
ALEXANDER FRACTIONAL INTEGRAL FILTERING OF WAVELET COEFFICIENTS FOR IMAGE DEN...
sipij
 
machinelearning project
machinelearning projectmachinelearning project
machinelearning project
Lianli Liu
 
Image segmentation
Image segmentationImage segmentation
Image segmentation
MadhuriMulik1
 
Image Segmentation with Fundamentals, Point, Line, and Edge Detection, Thres...
Image Segmentation  with Fundamentals, Point, Line, and Edge Detection, Thres...Image Segmentation  with Fundamentals, Point, Line, and Edge Detection, Thres...
Image Segmentation with Fundamentals, Point, Line, and Edge Detection, Thres...
khyatizalawadia29490
 
digital image processing classification.ppt
digital image processing classification.pptdigital image processing classification.ppt
digital image processing classification.ppt
ziaullahkhan60
 
Litvinenko low-rank kriging +FFT poster
Litvinenko low-rank kriging +FFT  posterLitvinenko low-rank kriging +FFT  poster
Litvinenko low-rank kriging +FFT poster
Alexander Litvinenko
 
Sparse data formats and efficient numerical methods for uncertainties in nume...
Sparse data formats and efficient numerical methods for uncertainties in nume...Sparse data formats and efficient numerical methods for uncertainties in nume...
Sparse data formats and efficient numerical methods for uncertainties in nume...
Alexander Litvinenko
 
Time of arrival based localization in wireless sensor networks a non linear ...
Time of arrival based localization in wireless sensor networks  a non linear ...Time of arrival based localization in wireless sensor networks  a non linear ...
Time of arrival based localization in wireless sensor networks a non linear ...
sipij
 
ImageSegmentation (1).ppt
ImageSegmentation (1).pptImageSegmentation (1).ppt
ImageSegmentation (1).ppt
NoorUlHaq47
 
ImageSegmentation.ppt
ImageSegmentation.pptImageSegmentation.ppt
ImageSegmentation.ppt
AVUDAI1
 
ImageSegmentation.ppt
ImageSegmentation.pptImageSegmentation.ppt
ImageSegmentation.ppt
DEEPUKUMARR
 
Odd Harmonious Labelings of Cyclic Snakes
Odd Harmonious Labelings of Cyclic SnakesOdd Harmonious Labelings of Cyclic Snakes
Odd Harmonious Labelings of Cyclic Snakes
GiselleginaGloria
 
LADDER AND SUBDIVISION OF LADDER GRAPHS WITH PENDANT EDGES ARE ODD GRACEFUL
LADDER AND SUBDIVISION OF LADDER GRAPHS WITH PENDANT EDGES ARE ODD GRACEFULLADDER AND SUBDIVISION OF LADDER GRAPHS WITH PENDANT EDGES ARE ODD GRACEFUL
LADDER AND SUBDIVISION OF LADDER GRAPHS WITH PENDANT EDGES ARE ODD GRACEFUL
Fransiskeran
 
Drobics, m. 2001: datamining using synergiesbetween self-organising maps and...
Drobics, m. 2001:  datamining using synergiesbetween self-organising maps and...Drobics, m. 2001:  datamining using synergiesbetween self-organising maps and...
Drobics, m. 2001: datamining using synergiesbetween self-organising maps and...
ArchiLab 7
 
Enhance The K Means Algorithm On Spatial Dataset
Enhance The K Means Algorithm On Spatial DatasetEnhance The K Means Algorithm On Spatial Dataset
Enhance The K Means Algorithm On Spatial Dataset
AlaaZ
 
Algorithmic Aspects of Vertex Geo-dominating Sets and Geonumber in Graphs
Algorithmic Aspects of Vertex Geo-dominating Sets and Geonumber in GraphsAlgorithmic Aspects of Vertex Geo-dominating Sets and Geonumber in Graphs
Algorithmic Aspects of Vertex Geo-dominating Sets and Geonumber in Graphs
IJERA Editor
 
On the Odd Gracefulness of Cyclic Snakes With Pendant Edges
On the Odd Gracefulness of Cyclic Snakes With Pendant EdgesOn the Odd Gracefulness of Cyclic Snakes With Pendant Edges
On the Odd Gracefulness of Cyclic Snakes With Pendant Edges
GiselleginaGloria
 
Hierarchical matrix approximation of large covariance matrices
Hierarchical matrix approximation of large covariance matricesHierarchical matrix approximation of large covariance matrices
Hierarchical matrix approximation of large covariance matrices
Alexander Litvinenko
 
K means ALGORITHM IN MACHINE LEARNING.pptx
K means ALGORITHM IN MACHINE LEARNING.pptxK means ALGORITHM IN MACHINE LEARNING.pptx
K means ALGORITHM IN MACHINE LEARNING.pptx
angelinjeba6
 
DISTANCE TWO LABELING FOR MULTI-STOREY GRAPHS
DISTANCE TWO LABELING FOR MULTI-STOREY GRAPHSDISTANCE TWO LABELING FOR MULTI-STOREY GRAPHS
DISTANCE TWO LABELING FOR MULTI-STOREY GRAPHS
graphhoc
 
ALEXANDER FRACTIONAL INTEGRAL FILTERING OF WAVELET COEFFICIENTS FOR IMAGE DEN...
ALEXANDER FRACTIONAL INTEGRAL FILTERING OF WAVELET COEFFICIENTS FOR IMAGE DEN...ALEXANDER FRACTIONAL INTEGRAL FILTERING OF WAVELET COEFFICIENTS FOR IMAGE DEN...
ALEXANDER FRACTIONAL INTEGRAL FILTERING OF WAVELET COEFFICIENTS FOR IMAGE DEN...
sipij
 
machinelearning project
machinelearning projectmachinelearning project
machinelearning project
Lianli Liu
 
Image Segmentation with Fundamentals, Point, Line, and Edge Detection, Thres...
Image Segmentation  with Fundamentals, Point, Line, and Edge Detection, Thres...Image Segmentation  with Fundamentals, Point, Line, and Edge Detection, Thres...
Image Segmentation with Fundamentals, Point, Line, and Edge Detection, Thres...
khyatizalawadia29490
 
digital image processing classification.ppt
digital image processing classification.pptdigital image processing classification.ppt
digital image processing classification.ppt
ziaullahkhan60
 
Litvinenko low-rank kriging +FFT poster
Litvinenko low-rank kriging +FFT  posterLitvinenko low-rank kriging +FFT  poster
Litvinenko low-rank kriging +FFT poster
Alexander Litvinenko
 
Sparse data formats and efficient numerical methods for uncertainties in nume...
Sparse data formats and efficient numerical methods for uncertainties in nume...Sparse data formats and efficient numerical methods for uncertainties in nume...
Sparse data formats and efficient numerical methods for uncertainties in nume...
Alexander Litvinenko
 
Time of arrival based localization in wireless sensor networks a non linear ...
Time of arrival based localization in wireless sensor networks  a non linear ...Time of arrival based localization in wireless sensor networks  a non linear ...
Time of arrival based localization in wireless sensor networks a non linear ...
sipij
 
ImageSegmentation (1).ppt
ImageSegmentation (1).pptImageSegmentation (1).ppt
ImageSegmentation (1).ppt
NoorUlHaq47
 
ImageSegmentation.ppt
ImageSegmentation.pptImageSegmentation.ppt
ImageSegmentation.ppt
AVUDAI1
 
ImageSegmentation.ppt
ImageSegmentation.pptImageSegmentation.ppt
ImageSegmentation.ppt
DEEPUKUMARR
 
Odd Harmonious Labelings of Cyclic Snakes
Odd Harmonious Labelings of Cyclic SnakesOdd Harmonious Labelings of Cyclic Snakes
Odd Harmonious Labelings of Cyclic Snakes
GiselleginaGloria
 
LADDER AND SUBDIVISION OF LADDER GRAPHS WITH PENDANT EDGES ARE ODD GRACEFUL
LADDER AND SUBDIVISION OF LADDER GRAPHS WITH PENDANT EDGES ARE ODD GRACEFULLADDER AND SUBDIVISION OF LADDER GRAPHS WITH PENDANT EDGES ARE ODD GRACEFUL
LADDER AND SUBDIVISION OF LADDER GRAPHS WITH PENDANT EDGES ARE ODD GRACEFUL
Fransiskeran
 
Drobics, m. 2001: datamining using synergiesbetween self-organising maps and...
Drobics, m. 2001:  datamining using synergiesbetween self-organising maps and...Drobics, m. 2001:  datamining using synergiesbetween self-organising maps and...
Drobics, m. 2001: datamining using synergiesbetween self-organising maps and...
ArchiLab 7
 
Enhance The K Means Algorithm On Spatial Dataset
Enhance The K Means Algorithm On Spatial DatasetEnhance The K Means Algorithm On Spatial Dataset
Enhance The K Means Algorithm On Spatial Dataset
AlaaZ
 
Algorithmic Aspects of Vertex Geo-dominating Sets and Geonumber in Graphs
Algorithmic Aspects of Vertex Geo-dominating Sets and Geonumber in GraphsAlgorithmic Aspects of Vertex Geo-dominating Sets and Geonumber in Graphs
Algorithmic Aspects of Vertex Geo-dominating Sets and Geonumber in Graphs
IJERA Editor
 
Ad

More from gokulprasath06 (7)

Exploratory data analysis
Exploratory data analysisExploratory data analysis
Exploratory data analysis
gokulprasath06
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processing
gokulprasath06
 
Introduction to Data Mining - A Beginner's Guide
Introduction to Data Mining - A Beginner's GuideIntroduction to Data Mining - A Beginner's Guide
Introduction to Data Mining - A Beginner's Guide
gokulprasath06
 
Reinforcement Learning Guide For Beginners
Reinforcement Learning Guide For BeginnersReinforcement Learning Guide For Beginners
Reinforcement Learning Guide For Beginners
gokulprasath06
 
Artificial Neural Networks (ANN)
Artificial Neural Networks (ANN)Artificial Neural Networks (ANN)
Artificial Neural Networks (ANN)
gokulprasath06
 
OLAP
OLAPOLAP
OLAP
gokulprasath06
 
Data science guide
Data science guideData science guide
Data science guide
gokulprasath06
 
Exploratory data analysis
Exploratory data analysisExploratory data analysis
Exploratory data analysis
gokulprasath06
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processing
gokulprasath06
 
Introduction to Data Mining - A Beginner's Guide
Introduction to Data Mining - A Beginner's GuideIntroduction to Data Mining - A Beginner's Guide
Introduction to Data Mining - A Beginner's Guide
gokulprasath06
 
Reinforcement Learning Guide For Beginners
Reinforcement Learning Guide For BeginnersReinforcement Learning Guide For Beginners
Reinforcement Learning Guide For Beginners
gokulprasath06
 
Artificial Neural Networks (ANN)
Artificial Neural Networks (ANN)Artificial Neural Networks (ANN)
Artificial Neural Networks (ANN)
gokulprasath06
 
Ad

Recently uploaded (20)

DATA ANALYST and Techniques in Kochi Explore cutting-edge analytical skills ...
DATA ANALYST  and Techniques in Kochi Explore cutting-edge analytical skills ...DATA ANALYST  and Techniques in Kochi Explore cutting-edge analytical skills ...
DATA ANALYST and Techniques in Kochi Explore cutting-edge analytical skills ...
aacj102006
 
Unit 2 - Unified Modeling Language (UML).pdf
Unit 2 - Unified Modeling Language (UML).pdfUnit 2 - Unified Modeling Language (UML).pdf
Unit 2 - Unified Modeling Language (UML).pdf
sixokak391
 
Publication-launch-How-is-Life-for-Children-in-the-Digital-Age-15-May-2025.pdf
Publication-launch-How-is-Life-for-Children-in-the-Digital-Age-15-May-2025.pdfPublication-launch-How-is-Life-for-Children-in-the-Digital-Age-15-May-2025.pdf
Publication-launch-How-is-Life-for-Children-in-the-Digital-Age-15-May-2025.pdf
StatsCommunications
 
Important JavaScript Concepts Every Developer Must Know
Important JavaScript Concepts Every Developer Must KnowImportant JavaScript Concepts Every Developer Must Know
Important JavaScript Concepts Every Developer Must Know
yashikanigam1
 
TYPES OF SOFTWARE_ A Visual Guide.pdf CA SUVIDHA CHAPLOT
TYPES OF SOFTWARE_ A Visual Guide.pdf CA SUVIDHA CHAPLOTTYPES OF SOFTWARE_ A Visual Guide.pdf CA SUVIDHA CHAPLOT
TYPES OF SOFTWARE_ A Visual Guide.pdf CA SUVIDHA CHAPLOT
CA Suvidha Chaplot
 
Lesson-2.pptxjsjahajauahahagqiqhwjwjahaiq
Lesson-2.pptxjsjahajauahahagqiqhwjwjahaiqLesson-2.pptxjsjahajauahahagqiqhwjwjahaiq
Lesson-2.pptxjsjahajauahahagqiqhwjwjahaiq
AngelPinedaTaguinod
 
lecture_13 tree in mmmmmmmm mmmmmfftro.pptx
lecture_13 tree in mmmmmmmm     mmmmmfftro.pptxlecture_13 tree in mmmmmmmm     mmmmmfftro.pptx
lecture_13 tree in mmmmmmmm mmmmmfftro.pptx
sarajafffri058
 
Dynamics 365 Business Rules Dynamics Dynamics
Dynamics 365 Business Rules Dynamics DynamicsDynamics 365 Business Rules Dynamics Dynamics
Dynamics 365 Business Rules Dynamics Dynamics
heyoubro69
 
The-Future-is-Now-Information-Technology-Trends.pptx.pdf
The-Future-is-Now-Information-Technology-Trends.pptx.pdfThe-Future-is-Now-Information-Technology-Trends.pptx.pdf
The-Future-is-Now-Information-Technology-Trends.pptx.pdf
winnt04
 
report (maam dona subject).pptxhsgwiswhs
report (maam dona subject).pptxhsgwiswhsreport (maam dona subject).pptxhsgwiswhs
report (maam dona subject).pptxhsgwiswhs
AngelPinedaTaguinod
 
presentacion.slideshare.informáticaJuridica..pptx
presentacion.slideshare.informáticaJuridica..pptxpresentacion.slideshare.informáticaJuridica..pptx
presentacion.slideshare.informáticaJuridica..pptx
GersonVillatoro4
 
Language Learning App Data Research by Globibo [2025]
Language Learning App Data Research by Globibo [2025]Language Learning App Data Research by Globibo [2025]
Language Learning App Data Research by Globibo [2025]
globibo
 
Dr. Robert Krug - Expert In Artificial Intelligence
Dr. Robert Krug - Expert In Artificial IntelligenceDr. Robert Krug - Expert In Artificial Intelligence
Dr. Robert Krug - Expert In Artificial Intelligence
Dr. Robert Krug
 
MLOps_with_SageMaker_Template_EN idioma inglés
MLOps_with_SageMaker_Template_EN idioma inglésMLOps_with_SageMaker_Template_EN idioma inglés
MLOps_with_SageMaker_Template_EN idioma inglés
FabianPierrePeaJacob
 
How to Set Up Process Mining in a Decentralized Organization?
How to Set Up Process Mining in a Decentralized Organization?How to Set Up Process Mining in a Decentralized Organization?
How to Set Up Process Mining in a Decentralized Organization?
Process mining Evangelist
 
web-roadmap developer file information..
web-roadmap developer file information..web-roadmap developer file information..
web-roadmap developer file information..
pandeyarush01
 
Time series analysis & forecasting-Day1.pptx
Time series analysis & forecasting-Day1.pptxTime series analysis & forecasting-Day1.pptx
Time series analysis & forecasting-Day1.pptx
AsmaaMahmoud89
 
Lesson 6-Interviewing in SHRM_updated.pdf
Lesson 6-Interviewing in SHRM_updated.pdfLesson 6-Interviewing in SHRM_updated.pdf
Lesson 6-Interviewing in SHRM_updated.pdf
hemelali11
 
Feature Engineering for Electronic Health Record Systems
Feature Engineering for Electronic Health Record SystemsFeature Engineering for Electronic Health Record Systems
Feature Engineering for Electronic Health Record Systems
Process mining Evangelist
 
national income & related aggregates (1)(1).pptx
national income & related aggregates (1)(1).pptxnational income & related aggregates (1)(1).pptx
national income & related aggregates (1)(1).pptx
j2492618
 
DATA ANALYST and Techniques in Kochi Explore cutting-edge analytical skills ...
DATA ANALYST  and Techniques in Kochi Explore cutting-edge analytical skills ...DATA ANALYST  and Techniques in Kochi Explore cutting-edge analytical skills ...
DATA ANALYST and Techniques in Kochi Explore cutting-edge analytical skills ...
aacj102006
 
Unit 2 - Unified Modeling Language (UML).pdf
Unit 2 - Unified Modeling Language (UML).pdfUnit 2 - Unified Modeling Language (UML).pdf
Unit 2 - Unified Modeling Language (UML).pdf
sixokak391
 
Publication-launch-How-is-Life-for-Children-in-the-Digital-Age-15-May-2025.pdf
Publication-launch-How-is-Life-for-Children-in-the-Digital-Age-15-May-2025.pdfPublication-launch-How-is-Life-for-Children-in-the-Digital-Age-15-May-2025.pdf
Publication-launch-How-is-Life-for-Children-in-the-Digital-Age-15-May-2025.pdf
StatsCommunications
 
Important JavaScript Concepts Every Developer Must Know
Important JavaScript Concepts Every Developer Must KnowImportant JavaScript Concepts Every Developer Must Know
Important JavaScript Concepts Every Developer Must Know
yashikanigam1
 
TYPES OF SOFTWARE_ A Visual Guide.pdf CA SUVIDHA CHAPLOT
TYPES OF SOFTWARE_ A Visual Guide.pdf CA SUVIDHA CHAPLOTTYPES OF SOFTWARE_ A Visual Guide.pdf CA SUVIDHA CHAPLOT
TYPES OF SOFTWARE_ A Visual Guide.pdf CA SUVIDHA CHAPLOT
CA Suvidha Chaplot
 
Lesson-2.pptxjsjahajauahahagqiqhwjwjahaiq
Lesson-2.pptxjsjahajauahahagqiqhwjwjahaiqLesson-2.pptxjsjahajauahahagqiqhwjwjahaiq
Lesson-2.pptxjsjahajauahahagqiqhwjwjahaiq
AngelPinedaTaguinod
 
lecture_13 tree in mmmmmmmm mmmmmfftro.pptx
lecture_13 tree in mmmmmmmm     mmmmmfftro.pptxlecture_13 tree in mmmmmmmm     mmmmmfftro.pptx
lecture_13 tree in mmmmmmmm mmmmmfftro.pptx
sarajafffri058
 
Dynamics 365 Business Rules Dynamics Dynamics
Dynamics 365 Business Rules Dynamics DynamicsDynamics 365 Business Rules Dynamics Dynamics
Dynamics 365 Business Rules Dynamics Dynamics
heyoubro69
 
The-Future-is-Now-Information-Technology-Trends.pptx.pdf
The-Future-is-Now-Information-Technology-Trends.pptx.pdfThe-Future-is-Now-Information-Technology-Trends.pptx.pdf
The-Future-is-Now-Information-Technology-Trends.pptx.pdf
winnt04
 
report (maam dona subject).pptxhsgwiswhs
report (maam dona subject).pptxhsgwiswhsreport (maam dona subject).pptxhsgwiswhs
report (maam dona subject).pptxhsgwiswhs
AngelPinedaTaguinod
 
presentacion.slideshare.informáticaJuridica..pptx
presentacion.slideshare.informáticaJuridica..pptxpresentacion.slideshare.informáticaJuridica..pptx
presentacion.slideshare.informáticaJuridica..pptx
GersonVillatoro4
 
Language Learning App Data Research by Globibo [2025]
Language Learning App Data Research by Globibo [2025]Language Learning App Data Research by Globibo [2025]
Language Learning App Data Research by Globibo [2025]
globibo
 
Dr. Robert Krug - Expert In Artificial Intelligence
Dr. Robert Krug - Expert In Artificial IntelligenceDr. Robert Krug - Expert In Artificial Intelligence
Dr. Robert Krug - Expert In Artificial Intelligence
Dr. Robert Krug
 
MLOps_with_SageMaker_Template_EN idioma inglés
MLOps_with_SageMaker_Template_EN idioma inglésMLOps_with_SageMaker_Template_EN idioma inglés
MLOps_with_SageMaker_Template_EN idioma inglés
FabianPierrePeaJacob
 
How to Set Up Process Mining in a Decentralized Organization?
How to Set Up Process Mining in a Decentralized Organization?How to Set Up Process Mining in a Decentralized Organization?
How to Set Up Process Mining in a Decentralized Organization?
Process mining Evangelist
 
web-roadmap developer file information..
web-roadmap developer file information..web-roadmap developer file information..
web-roadmap developer file information..
pandeyarush01
 
Time series analysis & forecasting-Day1.pptx
Time series analysis & forecasting-Day1.pptxTime series analysis & forecasting-Day1.pptx
Time series analysis & forecasting-Day1.pptx
AsmaaMahmoud89
 
Lesson 6-Interviewing in SHRM_updated.pdf
Lesson 6-Interviewing in SHRM_updated.pdfLesson 6-Interviewing in SHRM_updated.pdf
Lesson 6-Interviewing in SHRM_updated.pdf
hemelali11
 
Feature Engineering for Electronic Health Record Systems
Feature Engineering for Electronic Health Record SystemsFeature Engineering for Electronic Health Record Systems
Feature Engineering for Electronic Health Record Systems
Process mining Evangelist
 
national income & related aggregates (1)(1).pptx
national income & related aggregates (1)(1).pptxnational income & related aggregates (1)(1).pptx
national income & related aggregates (1)(1).pptx
j2492618
 

K-means Clustering Algorithm with Matlab Source code

  • 1. The K-means Clustering Algorithm 1 K-means is a method of clustering observations into a specific number of disjoint clusters. The ”K” refers to the number of clusters specified. Various distance measures exist to deter- mine which observation is to be appended to which cluster. The algorithm aims at minimiz- ing the measure between the centroide of the cluster and the given observation by iteratively appending an observation to any cluster and terminate when the lowest distance measure is achieved. 1.1 Overview Of Algorithm 1. The sample space is intially partitioned into K clusters and the observations are ran- domly assigned to the clusters. 2. For each sample: • Calculate the distance from the observation to the centroide of the cluster. • IF the sample is closest to its own cluster THEN leave it ELSE select another cluster. 3. Repeat steps 1 and 2 untill no observations are moved from one cluster to another When step 3 terminates the clusters are stable and each sample is assigned a cluster which results in the lowest possible distance to the centriode of the cluster. Common distance measures include the Euclidean distance, the Euclidean squared distance and the Manhattan or City distance. The Euclidean measure corresponds to the shortest geometric distance between to points. d = N ∑ i=1 (xi −yi)2 1 (19.1) 19.2 Distance measures http://bit.ly/2Mub6xP 19.1
  • 2. A faster way of determining the distance is by use of the squared Euclidean distance which calculates the above distance squared, i.e. dsq = N ∑ i=1 (xi −yi)2 The Manhattan measure calculates a distance between points based on a grid and is illus- Euclidean measure Manhattan measure For applications in speech processing the squared Euclidean distance is widely used. K-means can be used to cluster the extracted features from speech signals. The extracted features from the signal include for instance mel frequency cepstral coefficients or line spec- trum pairs. This allows speech signals with similar spectral characteristics to be positioned into the same position in the codebook. In this way similar narrow band signals will be predicted likewise thereby limiting the size of the codebook. The following figures illustrate the K-means algoritm on a 2-dimensional data set. 2 Figure 19.1:Comparision between theEuclideanandtheManhattan measure. CHAPTER 19. THE K-MEANS CLUSTERING ALGORITHM (19.2) trated in Figure 19.1. 19.3 Application of K-means 19.4 Example of K-means Clustering http://bit.ly/2Mub6xP
  • 3. −20 −15 −10 −5 0 5 10 15 20 −20 −15 −10 −5 0 5 10 15 20 −20 −15 −10 −5 0 5 10 15 20 −20 −15 −10 −5 0 5 10 15 20 marked with a cross. 3 Figure 19.2:Exampleof signal datamadefrom GaussianWhiteNoise. Figure 19.3:Thesignal dataareseperated intoseven clusters.Thecentroidsare 19.4. EXAMPLE OF K-MEANS CLUSTERING http://bit.ly/2Mub6xP
  • 4. 0 0.2 0.4 0.6 0.8 1 1 2 3 4 5 6 7 Silhouette Value Cluster into the seven clusters. If the distance from one point to two centroids is the same, it means the point could belong to both centroids. The result is a conflict which gives a negative value in the Silhouette diagram. The positive part of the Silhuoette diagram, shows that there is a clear seperation of the points between the clusters. 4 Figure 19.4:TheSilhouettediagramshowshow well thedataareseperated CHAPTER 19. THE K-MEANS CLUSTERING ALGORITHM http://bit.ly/2Mub6xP
  • 5. 1 close a l l 2 clear a l l 3 clc 4 5 Limit = 2 0 ; 6 7 X = [ 1 0 ∗ randn (400 ,2) ; 1 0 ∗ randn (400 ,2) ] ; 8 plot (X ( : , 1 ) ,X ( : , 2 ) , ’k . ’ ) 9 length (X ( : , 1 ) ) 10 figure 11 %i =1; 12 k=1; 13 for i =1: length (X ( : , 1 ) ) 14 i f ( sqrt (X( i , 1 ) ^2+X( i , 2 ) ^2) ) > Limit ; 15 X( i , 1 ) =0; 16 X( i , 2 ) =0; 17 else 18 Y( k , 1 ) =X( i , 1 ) ; 19 Y( k , 2 ) =X( i , 2 ) ; 20 k=k+1; 21 end 22 end 23 plot (Y ( : , 1 ) ,Y ( : , 2 ) , ’k . ’ ) 24 figure 25 26 [ cidx , c t r s ] = kmeans (Y , 7 , ’ d i s t ’ , ’ sqEuclidean ’ , ’ rep ’ ,5 , ’ disp ’ , ’ f i n a l ’ , ’ EmptyAction ’ , ’ singleton ’ ) ; 27 28 plot (Y( cidx ==1 ,1) ,Y( cidx ==1 ,2) , ’ r . ’ , . . . 29 Y( cidx ==2 ,1) ,Y( cidx ==2 ,2) , ’b . ’ , c t r s ( : , 1 ) , c t r s ( : , 2 ) , ’ kx ’ ) ; 30 31 hold on 32 plot (Y( cidx ==3 ,1) ,Y( cidx ==3 ,2) , ’y . ’ ,Y( cidx ==4 ,1) ,Y( cidx ==4 ,2) , ’g . ’ ) ; 33 34 hold on 35 plot (Y( cidx ==5 ,1) ,Y( cidx ==5 ,2) , ’ c . ’ ,Y( cidx ==6 ,1) ,Y( cidx ==6 ,2) , ’m. ’ ) ; 36 37 hold on 38 plot (Y( cidx ==7 ,1) ,Y( cidx ==7 ,2) , ’k . ’ ) ; 39 40 figure 5 19.5. MATLAB SOURCE CODE 19.5 Matlab Source Code http://bit.ly/2Mub6xP
  • 6. 41 [ silk , h]= s i l h o u e t t e (Y, cidx , ’ sqEuclidean ’ ) ; 42 mean( s i l k ) 6 CHAPTER 19. THE K-MEANS CLUSTERING ALGORITHM Checkout: http://bit.ly/2Mub6xP Data Science Course Content:
  翻译: