K-MEANS
CLUSTERING
METHOD BASED
NETWORK SHARED
RESOURCES MINING
A SHORT STORY PRESENTED BY
KANCHETI SAI PRAGNA
SJSU_ID: 016698552
WHY MINING NETWORK
SHARED RESOURCES?
• Demand for sharing data resources over the internet keeps growing, which has prompted many optimization techniques for using resources efficiently.
• At present, there are at least 15 trillion files available on the internet. This sheer volume makes it a complex task to retrieve the relevant data resources efficiently.
• The need for data mining of network shared data resources arose to deal with the large amount of redundant information and the difficulty of searching for relevant data resources.
EXISTING METHODS OF NETWORK
SHARED RESOURCES MINING
• Significant research has been done on data mining methods for finding relevant data resources, and several techniques have emerged.
• Clustering analysis based method: uses a clustering analysis algorithm to process the resource data, construct the data preprocessing set, and calculate the data feature vector.
• Multi-dimensional resource coordination and aggregation based method: uses an analysis of the data center's network resource sharing process as the basis for building a multidimensional resource aggregation data model.
• Fuzzy logic is used to build multidimensional collaborative fitness functions, and data mining is used to optimize decision-making in order to increase the execution efficiency of the data mining process.
• Although these methods have produced some excellent results, they fall short in run-time efficiency and precision, and they are usually complex to apply in practice.
• To overcome these drawbacks, a new method based on the K-means clustering algorithm has been proposed.
CLUSTERING
WHAT IS
CLUSTERING?
• Clustering assembles bulky data into clusters, or groups, which helps us visualize the internal structure of the data. Essentially, it groups items based on how similar to, and how distinct from, one another they are.
• For example, an online shopping site offers a variety of items: electronics, clothing, books, groceries, cosmetics, and accessories. Figure 2 shows how these items look after clustering.
STAGES OF
CLUSTERING
• Raw Data
• Clustering Algorithm
• Clusters
STAGES OF CLUSTERING
• Raw Data: Raw data (not yet processed) is collected from the various sources to which we want to apply a clustering algorithm.
• Clustering Algorithm: A specific algorithm is selected according to our requirements and then applied to the collected raw data.
• Clusters: After applying the selected clustering algorithm to the raw data, we obtain our clusters.
TYPES OF
CLUSTERING
• Partitioning Method
• Density-based Method
• Hierarchical Method
• Grid-based Method
• Model-based Clustering Method
• Constraint-based Method
PARTITIONING METHOD
• In the partitioning clustering method, the objects of a dataset are divided into a number of subsets.
• Examples of partitioning algorithms include K-means and PAM (Partitioning Around Medoids).
• The figure shows how clusters are formed after applying the partitioning clustering technique.
DENSITY-BASED METHOD
• The density-based clustering method identifies distinct clusters in the data, based on the idea that a cluster in a data space is a contiguous region of high point density, separated from other clusters by sparse regions.
• In this method, clusters are formed (that is, the data space is partitioned) according to the density of the data points in each region.
• The figure shows how clusters are formed after applying the density-based method of clustering.
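To make "a region of high point density" concrete, here is a minimal sketch that flags the points lying in dense regions. The function name `core_points` and the parameters `eps` (neighbourhood radius) and `min_pts` (minimum neighbour count) are illustrative assumptions borrowed from DBSCAN-style conventions; the slides do not name a specific density-based algorithm.

```python
import math


def core_points(points, eps, min_pts):
    """Return the points with at least min_pts points (itself included) within distance eps."""
    cores = []
    for p in points:
        # Count every point, including p itself, that falls inside the eps-neighbourhood.
        neighbours = sum(1 for q in points if math.dist(p, q) <= eps)
        if neighbours >= min_pts:
            cores.append(p)
    return cores


# Hypothetical sample: three points close together and one isolated point.
sample = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.0), (5.0, 5.0)]
dense = core_points(sample, eps=1.0, min_pts=3)
print(dense)
```

The isolated point sits in a sparse region and is not reported, which is the behaviour the slide describes: dense regions form clusters and sparse regions separate them.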
HIERARCHICAL METHOD
• In the hierarchical clustering method, the objects of a dataset are organized into a hierarchy of clusters or groups.
• Examples: the agglomerative hierarchical clustering algorithm (AGNES) and the divisive hierarchical clustering algorithm (DIANA).
• The figure shows how clusters are formed after applying the hierarchical method of clustering.
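The agglomerative (bottom-up) variant can be sketched in a few lines: every object starts as its own cluster, and the two closest clusters are merged repeatedly. This is only an illustration under stated assumptions: the function name `agglomerative`, the stopping rule `target_clusters`, and the choice of single linkage (distance between the closest pair of members) are not taken from the slides.

```python
import math


def agglomerative(points, target_clusters):
    """Merge the two closest clusters (single linkage) until target_clusters remain."""
    clusters = [[p] for p in points]  # every object starts as a singleton cluster
    while len(clusters) > target_clusters:
        best = None  # (distance, i, j) of the closest pair of clusters found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])  # merge cluster j into cluster i
        del clusters[j]                  # j > i, so index i is unaffected
    return clusters


# Hypothetical sample with two tight pairs and one isolated point.
sample = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
groups = agglomerative(sample, target_clusters=3)
print(groups)
```

Recording the sequence of merges instead of stopping at a fixed count would yield the full hierarchy (a dendrogram).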
GRID-BASED METHOD
• In the grid-based clustering method, the object space is divided into a fixed number of cells that form a grid-like structure. An example algorithm is STING (Statistical Information Grid).
• The figure shows how clusters are formed after applying the grid-based clustering method.
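The core idea of dividing the object space into cells can be sketched as follows. This is a simplified illustration, not STING itself (which also keeps hierarchical statistics per cell); the names `grid_cells`, `dense_cells`, `cell_size`, and `min_points` are assumptions made for the example.

```python
from collections import defaultdict


def grid_cells(points, cell_size):
    """Map each 2-D point to the grid cell (column, row) that contains it."""
    cells = defaultdict(list)
    for x, y in points:
        key = (int(x // cell_size), int(y // cell_size))
        cells[key].append((x, y))
    return dict(cells)


def dense_cells(cells, min_points):
    """Return the keys of cells holding at least min_points points."""
    return {key for key, members in cells.items() if len(members) >= min_points}


# Hypothetical sample: two populated regions and one stray point.
sample = [(0.2, 0.3), (0.8, 0.1), (0.5, 0.9), (3.1, 3.2), (3.4, 3.9), (7.5, 0.5)]
cells = grid_cells(sample, cell_size=1.0)
print(dense_cells(cells, min_points=2))
```

Because later processing works on cells rather than individual objects, its cost depends mainly on the number of cells, which is the usual motivation for grid-based methods.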
MODEL-BASED CLUSTERING METHOD
• Model-based clustering builds on the concept of a probability model, which is a mathematical representation of a random process that could have generated the dataset. Each group that forms has its own probability model.
• The figure shows how clusters are formed after applying the model-based clustering method.
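As a rough illustration of "each group has its own probability model", the sketch below assigns a one-dimensional value to whichever Gaussian component explains it best. The Gaussian form, the names `gaussian_pdf` and `most_likely_component`, and the component parameters are assumptions chosen for the example; the parameters are taken as already known here, whereas in practice they would have to be estimated from the data.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def most_likely_component(x, components):
    """Return the index of the (weight, mean, std) component with the highest weighted density at x."""
    scores = [w * gaussian_pdf(x, m, s) for w, m, s in components]
    return max(range(len(components)), key=scores.__getitem__)


# Two hypothetical groups, each described by its own probability model.
components = [(0.5, 0.0, 1.0), (0.5, 10.0, 2.0)]
labels = [most_likely_component(x, components) for x in (1.0, 9.0)]
print(labels)
```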
CONSTRAINT-BASED METHOD
• The constraint-based clustering method is a semi-supervised learning technique that combines a small proportion of labeled data with a large proportion of unlabeled data.
• Constrained K-means (COP-K-Means) is one of the common algorithms using this method.
• The figure illustrates clustering using the constraint-based method.
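In COP-K-Means, the labeled knowledge is usually expressed as pairwise must-link and cannot-link constraints, and a point may only join a cluster that breaks none of them. The following sketch shows just that check, assuming points are identified by integer indices; the helper names `partner` and `violates_constraints` are hypothetical.

```python
def partner(pair, index):
    """Return the other member of pair if index belongs to it, otherwise None."""
    a, b = pair
    if a == index:
        return b
    if b == index:
        return a
    return None


def violates_constraints(index, cluster_id, assignment, must_link, cannot_link):
    """Check whether putting point `index` into `cluster_id` breaks a constraint."""
    for pair in must_link:
        other = partner(pair, index)
        # A must-link partner that is already assigned has to share the cluster.
        if other is not None and other in assignment and assignment[other] != cluster_id:
            return True
    for pair in cannot_link:
        other = partner(pair, index)
        # A cannot-link partner must not be in the same cluster.
        if other is not None and assignment.get(other) == cluster_id:
            return True
    return False


# Points 0 and 1 are already assigned; point 2 must join 0 and must avoid 1.
assignment = {0: 0, 1: 1}
must_link = [(0, 2)]
cannot_link = [(1, 2)]
allowed = [c for c in (0, 1)
           if not violates_constraints(2, c, assignment, must_link, cannot_link)]
print(allowed)
```

In the full algorithm this check replaces the plain "nearest centroid" rule: each point goes to the nearest centroid among the clusters that pass the check.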
K-MEANS
CLUSTERING
K-MEANS CLUSTERING ALGORITHM
• The K-means algorithm is a partition-based clustering approach that belongs to the unsupervised learning techniques. It divides a large set of data into K smaller groups. Its two distinct phases are described below.
  a. First phase: K centroids (centers) are selected at random. K has a fixed value and cannot be changed during the procedure.
  b. Second phase: Each data point is assigned to its closest centroid. Euclidean distance is used to measure the separation between every data point and the cluster centroids.
• The Euclidean distance is the straight-line distance between any two points, say x and y. It is symmetric: the distance from x to y equals the distance from y to x. Equation (1) gives the Euclidean distance between any two points x and y, where m is the number of coordinates of each point:
$$ d(x, y) = \sqrt{\sum_{i=1}^{m} (x_i - y_i)^2} \tag{1} $$
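A minimal sketch of the Euclidean distance used in the second phase, assuming points are given as equal-length sequences of numbers. The function name `euclidean_distance` is illustrative and not taken from the slides.

```python
import math


def euclidean_distance(x, y):
    """Return the straight-line distance between two points of equal dimension."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of coordinates")
    # Square root of the sum of squared coordinate differences.
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


p = (1.0, 2.0)
q = (4.0, 6.0)
print(euclidean_distance(p, q))  # 5.0, since sqrt(3^2 + 4^2) = 5
```

Swapping the arguments gives the same value, which is the symmetry property the slide refers to.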
K-MEANS CLUSTERING ALGORITHM
Algorithm for K-Means
1. Input: Choose a database and select the value of K, the number of clusters we want at the end. Let the database be D with n data objects: D = {d1, d2, d3, …, dn}.
2. Output: An arrangement of K clusters.
3. Algorithm
   (i) Select the number of clusters, K.
   (ii) Choose the centres (centroids) of the K clusters. The initial values of the centres are selected arbitrarily.
K-MEANS CLUSTERING ALGORITHM
   (iii) Assign each data object to its closest cluster, as determined by the Euclidean distance.
   (iv) Recalculate the centre of each cluster by taking the mean of the data objects in that cluster. If a cluster contains n objects x1, x2, x3, …, xn, the mean is given in equation (2):
$$ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \tag{2} $$
   (v) Repeat steps (iii) and (iv) until convergence. K-means is therefore an iterative technique.
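Steps (ii) to (v) can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the presenter's implementation: the names `k_means`, `mean_point`, `max_iter`, and `seed` are hypothetical, the initial centres are drawn at random from the data, convergence is taken to mean "the centres stopped changing", and an empty cluster simply keeps its previous centre.

```python
import math
import random


def euclidean_distance(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(data, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(data, k)  # step (ii): arbitrary initial centres
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Step (iii): assign every object to its closest centre.
        clusters = [[] for _ in range(k)]
        for point in data:
            nearest = min(range(k), key=lambda j: euclidean_distance(point, centroids[j]))
            clusters[nearest].append(point)
        # Step (iv): recompute each centre as the mean of its members.
        new_centroids = [mean_point(members) if members else centroids[j]
                         for j, members in enumerate(clusters)]
        # Step (v): stop once the centres no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical data set D with two well-separated groups.
D = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = k_means(D, k=2)
for centre, members in zip(centroids, clusters):
    print(centre, members)
```

On this small data set every choice of initial centres leads to the same two groups; on harder data the result can depend on the initial centres, which is why practical implementations often run several restarts.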
APPLICATION OF K-MEANS CLUSTERING ALGORITHM IN
MINING OF NETWORK SHARED RESOURCES
K-MEANS-BASED DATA CLUSTERING OF NETWORK SHARED RESOURCES
• The K-means algorithm has emerged as the most well-known and widely used algorithm in the data collection process, thanks to its high data processing efficiency, low computational complexity, and strong scalability.
• The data of network shared resources is clustered into different classes using K-means clustering, in the manner shown in the image.
K-MEANS-BASED DATA CLUSTERING OF NETWORK SHARED RESOURCES
• Compared with the existing methods mentioned above, the K-means clustering algorithm has the following advantages:
• The K-means clustering technique is robust when managing data sets. In particular, when the classes in a data set are separated by large gaps, the classification results improve.
• When numerical data sets are classified using the K-means clustering algorithm, the input order of the data objects has almost no impact on the classification outcome.
K-MEANS-BASED DATA CLUSTERING OF NETWORK SHARED RESOURCES
• The reason is that, during clustering, the distance formula is applied to determine the distance from each data object to the center point, and the data set is classified on that basis.
• This was not the case for the methods mentioned above, whose classification outcomes are strongly affected by the order of the input objects.
• The algorithm is capable of handling big data sets. The clustering outcome is not affected if data overlaps between different data sets, so the approach has good practical use.
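The claim about input order can be checked with a small experiment: run the same batch K-means update on the original and on a shuffled copy of the data, starting from the same initial centres. The sketch below assumes the initial centres are fixed in advance (hypothetical values); if they were instead taken from, say, the first K records, the input order could still matter. The function name `run_k_means` is illustrative.

```python
import math
import random


def run_k_means(data, initial_centroids, max_iter=100):
    """Batch K-means from fixed initial centres; returns the final centres."""
    centroids = list(initial_centroids)
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in data:
            # Each assignment depends only on the distance to the centres, not on p's position in data.
            j = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[j].append(p)
        # math.fsum returns a correctly rounded sum, so the mean does not depend on summation order.
        updated = [
            tuple(math.fsum(c) / len(members) for c in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


data = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
shuffled = data[:]
random.Random(42).shuffle(shuffled)

start = [(0.0, 0.0), (10.0, 10.0)]  # hypothetical fixed initial centres
print(run_k_means(data, start) == run_k_means(shuffled, start))
```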
COMPARISONS WITH EXISTING METHODS
ACCURACY
COMPARISON
• The accuracy of the K-means based method stays close to 97%, while the other methods do not exceed 80% as the number of experiments increases.
DATA MINING TIME
COMPARISON
• The average data mining time of the K-means clustering based method is only 0.6 s, whereas the average times of the other methods are about 4.2 s and 2.9 s.
CONCLUSION
• To improve the quality of network shared resource data mining, the K-means clustering based data mining technique was applied. Its accuracy for in-depth data mining of network shared resources is always over 94%, and its average in-depth mining time is only 0.6 s, suggesting that the method can achieve fast and accurate in-depth data mining of network shared resources.
• A number of challenges remain to be resolved, including the deep mining of language and cross-cultural resource sharing, as well as the security, personalization, and intelligence of resource data mining.
THANK YOU
Ad

More Related Content

Similar to K- means clustering method based Data Mining of Network Shared Resources .pptx (20)

A fuzzy clustering algorithm for high dimensional streaming data
A fuzzy clustering algorithm for high dimensional streaming dataA fuzzy clustering algorithm for high dimensional streaming data
A fuzzy clustering algorithm for high dimensional streaming data
Alexander Decker
 
F04463437
F04463437F04463437
F04463437
IOSR-JEN
 
A Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data MiningA Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data Mining
Natasha Grant
 
47 292-298
47 292-29847 292-298
47 292-298
idescitation
 
Cancer data partitioning with data structure and difficulty independent clust...
Cancer data partitioning with data structure and difficulty independent clust...Cancer data partitioning with data structure and difficulty independent clust...
Cancer data partitioning with data structure and difficulty independent clust...
IRJET Journal
 
A survey on Efficient Enhanced K-Means Clustering Algorithm
 A survey on Efficient Enhanced K-Means Clustering Algorithm A survey on Efficient Enhanced K-Means Clustering Algorithm
A survey on Efficient Enhanced K-Means Clustering Algorithm
ijsrd.com
 
Ensemble based Distributed K-Modes Clustering
Ensemble based Distributed K-Modes ClusteringEnsemble based Distributed K-Modes Clustering
Ensemble based Distributed K-Modes Clustering
IJERD Editor
 
A Study of Efficiency Improvements Technique for K-Means Algorithm
A Study of Efficiency Improvements Technique for K-Means AlgorithmA Study of Efficiency Improvements Technique for K-Means Algorithm
A Study of Efficiency Improvements Technique for K-Means Algorithm
IRJET Journal
 
Paper id 26201478
Paper id 26201478Paper id 26201478
Paper id 26201478
IJRAT
 
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...
IOSR Journals
 
Ir3116271633
Ir3116271633Ir3116271633
Ir3116271633
IJERA Editor
 
84cc04ff77007e457df6aa2b814d2346bf1b
84cc04ff77007e457df6aa2b814d2346bf1b84cc04ff77007e457df6aa2b814d2346bf1b
84cc04ff77007e457df6aa2b814d2346bf1b
PRAWEEN KUMAR
 
Analysis of mass based and density based clustering techniques on numerical d...
Analysis of mass based and density based clustering techniques on numerical d...Analysis of mass based and density based clustering techniques on numerical d...
Analysis of mass based and density based clustering techniques on numerical d...
Alexander Decker
 
G0354451
G0354451G0354451
G0354451
iosrjournals
 
Comparison Between Clustering Algorithms for Microarray Data Analysis
Comparison Between Clustering Algorithms for Microarray Data AnalysisComparison Between Clustering Algorithms for Microarray Data Analysis
Comparison Between Clustering Algorithms for Microarray Data Analysis
IOSR Journals
 
Feature Subset Selection for High Dimensional Data Using Clustering Techniques
Feature Subset Selection for High Dimensional Data Using Clustering TechniquesFeature Subset Selection for High Dimensional Data Using Clustering Techniques
Feature Subset Selection for High Dimensional Data Using Clustering Techniques
IRJET Journal
 
A Comprehensive Study of Clustering Algorithms for Big Data Mining with MapRe...
A Comprehensive Study of Clustering Algorithms for Big Data Mining with MapRe...A Comprehensive Study of Clustering Algorithms for Big Data Mining with MapRe...
A Comprehensive Study of Clustering Algorithms for Big Data Mining with MapRe...
KamleshKumar394
 
Spectral Clustering and Vantage Point Indexing for Efficient Data Retrieval
Spectral Clustering and Vantage Point Indexing for Efficient Data Retrieval Spectral Clustering and Vantage Point Indexing for Efficient Data Retrieval
Spectral Clustering and Vantage Point Indexing for Efficient Data Retrieval
IJECEIAES
 
Assessment of Cluster Tree Analysis based on Data Linkages
Assessment of Cluster Tree Analysis based on Data LinkagesAssessment of Cluster Tree Analysis based on Data Linkages
Assessment of Cluster Tree Analysis based on Data Linkages
journal ijrtem
 
Enhanced Clustering Algorithm for Processing Online Data
Enhanced Clustering Algorithm for Processing Online DataEnhanced Clustering Algorithm for Processing Online Data
Enhanced Clustering Algorithm for Processing Online Data
IOSR Journals
 
A fuzzy clustering algorithm for high dimensional streaming data
A fuzzy clustering algorithm for high dimensional streaming dataA fuzzy clustering algorithm for high dimensional streaming data
A fuzzy clustering algorithm for high dimensional streaming data
Alexander Decker
 
A Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data MiningA Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data Mining
Natasha Grant
 
Cancer data partitioning with data structure and difficulty independent clust...
Cancer data partitioning with data structure and difficulty independent clust...Cancer data partitioning with data structure and difficulty independent clust...
Cancer data partitioning with data structure and difficulty independent clust...
IRJET Journal
 
A survey on Efficient Enhanced K-Means Clustering Algorithm
 A survey on Efficient Enhanced K-Means Clustering Algorithm A survey on Efficient Enhanced K-Means Clustering Algorithm
A survey on Efficient Enhanced K-Means Clustering Algorithm
ijsrd.com
 
Ensemble based Distributed K-Modes Clustering
Ensemble based Distributed K-Modes ClusteringEnsemble based Distributed K-Modes Clustering
Ensemble based Distributed K-Modes Clustering
IJERD Editor
 
A Study of Efficiency Improvements Technique for K-Means Algorithm
A Study of Efficiency Improvements Technique for K-Means AlgorithmA Study of Efficiency Improvements Technique for K-Means Algorithm
A Study of Efficiency Improvements Technique for K-Means Algorithm
IRJET Journal
 
Paper id 26201478
Paper id 26201478Paper id 26201478
Paper id 26201478
IJRAT
 
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...
IOSR Journals
 
84cc04ff77007e457df6aa2b814d2346bf1b
84cc04ff77007e457df6aa2b814d2346bf1b84cc04ff77007e457df6aa2b814d2346bf1b
84cc04ff77007e457df6aa2b814d2346bf1b
PRAWEEN KUMAR
 
Analysis of mass based and density based clustering techniques on numerical d...
Analysis of mass based and density based clustering techniques on numerical d...Analysis of mass based and density based clustering techniques on numerical d...
Analysis of mass based and density based clustering techniques on numerical d...
Alexander Decker
 
Comparison Between Clustering Algorithms for Microarray Data Analysis
Comparison Between Clustering Algorithms for Microarray Data AnalysisComparison Between Clustering Algorithms for Microarray Data Analysis
Comparison Between Clustering Algorithms for Microarray Data Analysis
IOSR Journals
 
Feature Subset Selection for High Dimensional Data Using Clustering Techniques
Feature Subset Selection for High Dimensional Data Using Clustering TechniquesFeature Subset Selection for High Dimensional Data Using Clustering Techniques
Feature Subset Selection for High Dimensional Data Using Clustering Techniques
IRJET Journal
 
A Comprehensive Study of Clustering Algorithms for Big Data Mining with MapRe...
A Comprehensive Study of Clustering Algorithms for Big Data Mining with MapRe...A Comprehensive Study of Clustering Algorithms for Big Data Mining with MapRe...
A Comprehensive Study of Clustering Algorithms for Big Data Mining with MapRe...
KamleshKumar394
 
Spectral Clustering and Vantage Point Indexing for Efficient Data Retrieval
Spectral Clustering and Vantage Point Indexing for Efficient Data Retrieval Spectral Clustering and Vantage Point Indexing for Efficient Data Retrieval
Spectral Clustering and Vantage Point Indexing for Efficient Data Retrieval
IJECEIAES
 
Assessment of Cluster Tree Analysis based on Data Linkages
Assessment of Cluster Tree Analysis based on Data LinkagesAssessment of Cluster Tree Analysis based on Data Linkages
Assessment of Cluster Tree Analysis based on Data Linkages
journal ijrtem
 
Enhanced Clustering Algorithm for Processing Online Data
Enhanced Clustering Algorithm for Processing Online DataEnhanced Clustering Algorithm for Processing Online Data
Enhanced Clustering Algorithm for Processing Online Data
IOSR Journals
 

Recently uploaded (20)

Oral Malodor.pptx jsjshdhushehsidjjeiejdhfj
Oral Malodor.pptx jsjshdhushehsidjjeiejdhfjOral Malodor.pptx jsjshdhushehsidjjeiejdhfj
Oral Malodor.pptx jsjshdhushehsidjjeiejdhfj
maitripatel5301
 
Analysis of Billboards hot 100 toop five hit makers on the chart.docx
Analysis of Billboards hot 100 toop five hit makers on the chart.docxAnalysis of Billboards hot 100 toop five hit makers on the chart.docx
Analysis of Billboards hot 100 toop five hit makers on the chart.docx
hershtara1
 
real illuminati Uganda agent 0782561496/0756664682
real illuminati Uganda agent 0782561496/0756664682real illuminati Uganda agent 0782561496/0756664682
real illuminati Uganda agent 0782561496/0756664682
way to join real illuminati Agent In Kampala Call/WhatsApp+256782561496/0756664682
 
Multi-tenant Data Pipeline Orchestration
Multi-tenant Data Pipeline OrchestrationMulti-tenant Data Pipeline Orchestration
Multi-tenant Data Pipeline Orchestration
Romi Kuntsman
 
Transforming health care with ai powered
Transforming health care with ai poweredTransforming health care with ai powered
Transforming health care with ai powered
gowthamarvj
 
Day 1 MS Excel Basics #.pptxDay 1 MS Excel Basics #.pptxDay 1 MS Excel Basics...
Day 1 MS Excel Basics #.pptxDay 1 MS Excel Basics #.pptxDay 1 MS Excel Basics...Day 1 MS Excel Basics #.pptxDay 1 MS Excel Basics #.pptxDay 1 MS Excel Basics...
Day 1 MS Excel Basics #.pptxDay 1 MS Excel Basics #.pptxDay 1 MS Excel Basics...
Jayantilal Bhanushali
 
文凭证书美国SDSU文凭圣地亚哥州立大学学生证学历认证查询
文凭证书美国SDSU文凭圣地亚哥州立大学学生证学历认证查询文凭证书美国SDSU文凭圣地亚哥州立大学学生证学历认证查询
文凭证书美国SDSU文凭圣地亚哥州立大学学生证学历认证查询
Taqyea
 
lecture_13 tree in mmmmmmmm mmmmmfftro.pptx
lecture_13 tree in mmmmmmmm     mmmmmfftro.pptxlecture_13 tree in mmmmmmmm     mmmmmfftro.pptx
lecture_13 tree in mmmmmmmm mmmmmfftro.pptx
sarajafffri058
 
Process Mining as Enabler for Digital Transformations
Process Mining as Enabler for Digital TransformationsProcess Mining as Enabler for Digital Transformations
Process Mining as Enabler for Digital Transformations
Process mining Evangelist
 
Publication-launch-How-is-Life-for-Children-in-the-Digital-Age-15-May-2025.pdf
Publication-launch-How-is-Life-for-Children-in-the-Digital-Age-15-May-2025.pdfPublication-launch-How-is-Life-for-Children-in-the-Digital-Age-15-May-2025.pdf
Publication-launch-How-is-Life-for-Children-in-the-Digital-Age-15-May-2025.pdf
StatsCommunications
 
Understanding Complex Development Processes
Understanding Complex Development ProcessesUnderstanding Complex Development Processes
Understanding Complex Development Processes
Process mining Evangelist
 
Dr. Robert Krug - Expert In Artificial Intelligence
Dr. Robert Krug - Expert In Artificial IntelligenceDr. Robert Krug - Expert In Artificial Intelligence
Dr. Robert Krug - Expert In Artificial Intelligence
Dr. Robert Krug
 
hersh's midterm project.pdf music retail and distribution
hersh's midterm project.pdf music retail and distributionhersh's midterm project.pdf music retail and distribution
hersh's midterm project.pdf music retail and distribution
hershtara1
 
Ann Naser Nabil- Data Scientist Portfolio.pdf
Ann Naser Nabil- Data Scientist Portfolio.pdfAnn Naser Nabil- Data Scientist Portfolio.pdf
Ann Naser Nabil- Data Scientist Portfolio.pdf
আন্ নাসের নাবিল
 
Language Learning App Data Research by Globibo [2025]
Language Learning App Data Research by Globibo [2025]Language Learning App Data Research by Globibo [2025]
Language Learning App Data Research by Globibo [2025]
globibo
 
AWS Certified Machine Learning Slides.pdf
AWS Certified Machine Learning Slides.pdfAWS Certified Machine Learning Slides.pdf
AWS Certified Machine Learning Slides.pdf
philsparkshome
 
Time series for yotube_1_data anlysis.pdf
Time series for yotube_1_data anlysis.pdfTime series for yotube_1_data anlysis.pdf
Time series for yotube_1_data anlysis.pdf
asmaamahmoudsaeed
 
CERTIFIED BUSINESS ANALYSIS PROFESSIONAL™
CERTIFIED BUSINESS ANALYSIS PROFESSIONAL™CERTIFIED BUSINESS ANALYSIS PROFESSIONAL™
CERTIFIED BUSINESS ANALYSIS PROFESSIONAL™
muhammed84essa
 
Automated Melanoma Detection via Image Processing.pptx
Automated Melanoma Detection via Image Processing.pptxAutomated Melanoma Detection via Image Processing.pptx
Automated Melanoma Detection via Image Processing.pptx
handrymaharjan23
 
national income & related aggregates (1)(1).pptx
national income & related aggregates (1)(1).pptxnational income & related aggregates (1)(1).pptx
national income & related aggregates (1)(1).pptx
j2492618
 
Oral Malodor.pptx jsjshdhushehsidjjeiejdhfj
Oral Malodor.pptx jsjshdhushehsidjjeiejdhfjOral Malodor.pptx jsjshdhushehsidjjeiejdhfj
Oral Malodor.pptx jsjshdhushehsidjjeiejdhfj
maitripatel5301
 
Analysis of Billboards hot 100 toop five hit makers on the chart.docx
Analysis of Billboards hot 100 toop five hit makers on the chart.docxAnalysis of Billboards hot 100 toop five hit makers on the chart.docx
Analysis of Billboards hot 100 toop five hit makers on the chart.docx
hershtara1
 
Multi-tenant Data Pipeline Orchestration
Multi-tenant Data Pipeline OrchestrationMulti-tenant Data Pipeline Orchestration
Multi-tenant Data Pipeline Orchestration
Romi Kuntsman
 
Transforming health care with ai powered
Transforming health care with ai poweredTransforming health care with ai powered
Transforming health care with ai powered
gowthamarvj
 
Day 1 MS Excel Basics #.pptxDay 1 MS Excel Basics #.pptxDay 1 MS Excel Basics...
Day 1 MS Excel Basics #.pptxDay 1 MS Excel Basics #.pptxDay 1 MS Excel Basics...Day 1 MS Excel Basics #.pptxDay 1 MS Excel Basics #.pptxDay 1 MS Excel Basics...
Day 1 MS Excel Basics #.pptxDay 1 MS Excel Basics #.pptxDay 1 MS Excel Basics...
Jayantilal Bhanushali
 
文凭证书美国SDSU文凭圣地亚哥州立大学学生证学历认证查询
文凭证书美国SDSU文凭圣地亚哥州立大学学生证学历认证查询文凭证书美国SDSU文凭圣地亚哥州立大学学生证学历认证查询
文凭证书美国SDSU文凭圣地亚哥州立大学学生证学历认证查询
Taqyea
 
lecture_13 tree in mmmmmmmm mmmmmfftro.pptx
lecture_13 tree in mmmmmmmm     mmmmmfftro.pptxlecture_13 tree in mmmmmmmm     mmmmmfftro.pptx
lecture_13 tree in mmmmmmmm mmmmmfftro.pptx
sarajafffri058
 
Process Mining as Enabler for Digital Transformations
Process Mining as Enabler for Digital TransformationsProcess Mining as Enabler for Digital Transformations
Process Mining as Enabler for Digital Transformations
Process mining Evangelist
 
Publication-launch-How-is-Life-for-Children-in-the-Digital-Age-15-May-2025.pdf
Publication-launch-How-is-Life-for-Children-in-the-Digital-Age-15-May-2025.pdfPublication-launch-How-is-Life-for-Children-in-the-Digital-Age-15-May-2025.pdf
Publication-launch-How-is-Life-for-Children-in-the-Digital-Age-15-May-2025.pdf
StatsCommunications
 
Dr. Robert Krug - Expert In Artificial Intelligence
Dr. Robert Krug - Expert In Artificial IntelligenceDr. Robert Krug - Expert In Artificial Intelligence
Dr. Robert Krug - Expert In Artificial Intelligence
Dr. Robert Krug
 
hersh's midterm project.pdf music retail and distribution
hersh's midterm project.pdf music retail and distributionhersh's midterm project.pdf music retail and distribution
hersh's midterm project.pdf music retail and distribution
hershtara1
 
Language Learning App Data Research by Globibo [2025]
Language Learning App Data Research by Globibo [2025]Language Learning App Data Research by Globibo [2025]
Language Learning App Data Research by Globibo [2025]
globibo
 
AWS Certified Machine Learning Slides.pdf
AWS Certified Machine Learning Slides.pdfAWS Certified Machine Learning Slides.pdf
AWS Certified Machine Learning Slides.pdf
philsparkshome
 
Time series for yotube_1_data anlysis.pdf
Time series for yotube_1_data anlysis.pdfTime series for yotube_1_data anlysis.pdf
Time series for yotube_1_data anlysis.pdf
asmaamahmoudsaeed
 
CERTIFIED BUSINESS ANALYSIS PROFESSIONAL™
CERTIFIED BUSINESS ANALYSIS PROFESSIONAL™CERTIFIED BUSINESS ANALYSIS PROFESSIONAL™
CERTIFIED BUSINESS ANALYSIS PROFESSIONAL™
muhammed84essa
 
Automated Melanoma Detection via Image Processing.pptx
Automated Melanoma Detection via Image Processing.pptxAutomated Melanoma Detection via Image Processing.pptx
Automated Melanoma Detection via Image Processing.pptx
handrymaharjan23
 
national income & related aggregates (1)(1).pptx
national income & related aggregates (1)(1).pptxnational income & related aggregates (1)(1).pptx
national income & related aggregates (1)(1).pptx
j2492618
 
Ad

K- means clustering method based Data Mining of Network Shared Resources .pptx

  • 1. K-MEANS CLUSTERING METHOD BASED NETWORK SHARED RESOURCES MINING A SHORT STORY PRESENTED BY KANCHETI SAI PRAGNA SJSU_ID: 016698552
  • 2. WHY MINING NETWORK SHARED RESOURCES?  The demand for data resource sharing in internet has been growing and this brought up many optimization techniques in utilizing efficiency of resources.  At present, there are at least 15 Trillion files available on the internet, The vast availability of resources makes a complex task in retrieving the relevant data resources efficiently  In order to solve problems of large redundant information and relevant data resources research the need for data mining in network shared data resources arose.
  • 3. Existing Methods of network shared resources mining • There has been a significant research done in data mining methods in relevant data resources research and various techniques came into picture. • clustering analysis algorithm based Method where it uses clustering analysis algorithm to process resource data, construct the data preprocessing set, and calculate the data feature vector. • Another method based on multi-dimensional resource coordination and aggregation where this technique focuses on using the data center's network resource sharing process analysis as the basis for building a multidimensional resource aggregation data model. • using fuzzy logic to build multidimensional collaborative fitness functions, and using data mining to optimize decision-making in order to increase the execution efficiency of the data mining process. • However, Although these methods produced some excellent results they lack in run time efficiency, precision and they are usually complex to apply practically. • In order to overcome above drawbacks a new method based on k means clustering algorithm has come into picture.
  • 5. WHAT IS CLUSTERING?  Clustering is used in assembling bulky data into clusters or groups that helps us to visualize the internal structure of the data. Basically, it is a grouping of items based on how similar and distinct they are to one another  For example, there is some online shopping site where we can find variety of stuffs from electronics, clothing, books, grocery items, cosmetic items, accessories. Here in figure 2 describes how it looks after clustering is done.
  • 6. STAGES OF CLUSTERING  Raw Data  Clustering Algorithm  Clusters
  • 7. STAGES OF CLUSTERING  Raw Data: Raw data (which are not being processed yet) are collected from various sources on which we want to solicit various clustering algorithm  Clustering Algorithm: A specific algorithm is selected according to our requirements and then that very algorithm is applied on the raw data that were being selected.  Clusters: After soliciting the selected clustering algorithm on the raw data, we acquire our clusters.
  • 8. TYPES OF CLUSTERING  Partitioning Method  Density-based Method  Hierarchical Method  Grid-based method  Model-based clustering method  Constraint-based method
  • 9. PARTITIONING METHOD  In the case of partitioning clustering method, the objects of the datasets are segregated into numerous subsets.  Given some examples of the partitioning algorithms are K-means, PAM (Partitioning AroundMedoids).  The figure shows how clusters are formed after applying partitioning clustering technique
  • 10. DENSITY-BASED METHOD  Density-Based Clustering method identify distinctive clusters in the data, based on the idea that a cluster/group in a data space is a contiguous region of high point density, separated from other clusters by sparse regions.  Basically, in this method clusters are formed or the data spaces are partitioned by the density of the data point in a particular region  The figure shows how clusters are formed after applying Density-Based Method of clustering
  • 11. HIERARCHICAL METHOD  In the case of hierarchical clustering method, the objects of the datasets are segregated in the hierarchical fashion of clusters or groups.  Examples: Agglomerative Hierarchical clustering algorithm (AGNES), Divisive Hierarchical clustering algorithm (DIANA) etc.,  The figure shows how clusters are formed after applying Hierarchical Method of clustering
  • 12. GRID-BASED METHOD  In grid-based clustering method, the object space is divided into fixed number of cells that forms the shape of a grid like structure. Clustering algorithm is STING (Statistical Information Grid).  The figure shows how clusters are formed after applying grid-based clustering methodrid- based method
  • 13. MODEL-BASED CLUSTERING METHOD  Model-based clustering works on the concept of Probability Model which is a mathematical representation of any random occurrence of dataset. Each of the groups that would form will have different Probability Model.  The figure shows how clusters are formed after applying Model-based clustering method
  • 14. CONSTRAINT-BASED METHOD  Constraint-based clustering is a semi-supervised learning technique that combines a small proportion of labeled data with a large proportion of unlabeled data.  The constrained K-means (COP-K-Means) algorithm is one of the common algorithms using this method.  The figure illustrates clustering using the constraint-based method.
  • 16. K-MEANS CLUSTERING ALGORITHM  The K-means algorithm is a partition-based clustering approach that belongs to the unsupervised learning techniques. It divides a large data set into K smaller groups. The two distinct phases of this method are described below.  a. First phase: K centroids (centers) are selected at random. K is fixed; it cannot be changed during the procedure.  b. Second phase: Each data point is assigned to its closest centroid. Euclidean distance is used to measure the separation between the cluster centroids and all data points.  The Euclidean distance is the straight-line distance between any two points, say x and y. It is symmetric: the distance from x to y equals the distance from y to x. Equation (1) gives the Euclidean distance between any two points x and y.
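Equation (1) did not survive in this transcript. For reference, the standard form of the Euclidean distance is sketched below. The symbol m (the number of features per data point) and the component notation x_i, y_i are assumptions introduced here, not taken from the slides.

```latex
d(x, y) = \sqrt{\sum_{i=1}^{m} \left( x_i - y_i \right)^2},
\qquad x = (x_1, \dots, x_m),\quad y = (y_1, \dots, y_m)
```

Swapping x and y leaves every squared term unchanged, which is why d(x, y) = d(y, x).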
  • 17. K-MEANS CLUSTERING ALGORITHM  Algorithm for K-Means  1. Input: Choose a database and select the value of K, the number of clusters we want at the end. Let the database be D with n data objects: D = {d1, d2, d3, …, dn}.  2. Output: An arrangement of the data into K clusters.  3. Algorithm  (i) Select the number of clusters, K.  (ii) Choose the centres (centroids) for the K clusters. The initial values of the centres are selected arbitrarily.
  • 18. K-MEANS CLUSTERING ALGORITHM  (iii) Assign every data object to the closest cluster; closeness is determined using the Euclidean distance.  (iv) Recalculate the centre of each cluster by taking the mean of the data objects in that cluster. If a cluster contains n objects x1, x2, x3, …, xn, the mean is (x1 + x2 + … + xn) / n, as given in equation (2).  (v) Repeat steps (iii) and (iv) until convergence. K-means is thus an iterative technique.
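Steps (ii) to (v) can be sketched in a few lines of plain Python. This is a minimal illustration, not the presenter's implementation: the function names, the `seed` and `max_iter` parameters, the choice to draw initial centres from the data, and the handling of empty clusters are all assumptions made for the example.

```python
import math
import random


def euclidean(x, y):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(data, k, max_iter=100, seed=0):
    """Cluster `data` (a list of numeric tuples) into k groups."""
    rng = random.Random(seed)
    # Step (ii): pick k data objects as the arbitrary initial centres
    # (assumption: centres are drawn from the data itself).
    centres = rng.sample(data, k)
    clusters = []
    for _ in range(max_iter):
        # Step (iii): assign every object to its closest centre.
        clusters = [[] for _ in range(k)]
        for point in data:
            nearest = min(range(k), key=lambda i: euclidean(point, centres[i]))
            clusters[nearest].append(point)
        # Step (iv): recompute each centre as the mean of its cluster.
        # Assumption: an empty cluster keeps its previous centre.
        new_centres = [
            mean_point(c) if c else centres[i] for i, c in enumerate(clusters)
        ]
        # Step (v): stop once the centres no longer move.
        if new_centres == centres:
            break
        centres = new_centres
    return centres, clusters
```

A call such as `centres, clusters = k_means(points, 2)` returns the final centres together with the objects assigned to each one. Production code would more likely use a tested library implementation with a more careful initialisation scheme.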
  • 19. APPLICATION OF K-MEANS CLUSTERING ALGORITHM IN MINING OF NETWORK SHARED RESOURCES
  • 20. K-MEANS-BASED DATA CLUSTERING OF NETWORK SHARED RESOURCES  The K-means algorithm has become the most well-known and widely used algorithm for data clustering, owing to its high data-processing efficiency, low computational complexity, and strong scalability.  The data of network shared resources is clustered into different classes using K-means clustering in the manner shown in the image.
  • 21. K-MEANS-BASED DATA CLUSTERING OF NETWORK SHARED RESOURCES  Compared with the existing methods mentioned above, the K-means clustering algorithm has the following advantages:  The K-means clustering technique is notably robust when handling data sets. In particular, when the classes in the data set are well separated, the classification results improve.  When numerical data sets are classified using the K-means clustering algorithm, the input order of the data objects has almost no impact on the classification outcome.
  • 22. K-MEANS-BASED DATA CLUSTERING OF NETWORK SHARED RESOURCES  The reason is that, during clustering, the distance formula is applied to determine the distance from each data object to the center point, and this distance does not depend on the order in which the objects are read.  This was not the case for the methods mentioned above, where the classification outcome is strongly affected by the order of the input objects.  The algorithm is capable of handling big data sets. The outcome of data clustering is not affected if data overlaps between different data sets, so this approach is of good practical use.
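The input-order claim can be checked with a small experiment. The sketch below is illustrative only and rests on two assumptions: the initial centres are fixed in advance rather than drawn from the data, and the mean is computed with `math.fsum` so that floating-point rounding does not depend on summation order. The function names and sample points are hypothetical. If the initial centres were instead picked by position in the input, the input order could still change the result.

```python
import math
import random


def assign_and_update(data, centres):
    """One K-means pass: assign points to the nearest centre, then re-average."""
    clusters = [[] for _ in centres]
    for p in data:
        i = min(range(len(centres)), key=lambda j: math.dist(p, centres[j]))
        clusters[i].append(p)
    # math.fsum returns a correctly rounded sum, independent of item order.
    return [
        tuple(math.fsum(c) / len(pts) for c in zip(*pts)) if pts else centres[i]
        for i, pts in enumerate(clusters)
    ]


def run(data, centres, max_iter=100):
    """Repeat the pass until the centres stop moving."""
    for _ in range(max_iter):
        new = assign_and_update(data, centres)
        if new == centres:
            break
        centres = new
    return centres


data = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
start = [(0.0, 0.0), (10.0, 10.0)]  # fixed initial centres (assumption)

shuffled = data[:]
random.Random(1).shuffle(shuffled)

# Same centres whether the objects arrive in the original or shuffled order.
print(run(data, start) == run(shuffled, start))
```

Each object is compared only against the current centres, never against objects seen earlier, which is why reordering the input leaves the assignment unchanged.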
  • 24. ACCURACY COMPARISON  The accuracy of the K-means-based method stays close to 97% as the number of experiments increases, while the other methods do not exceed 80%.
  • 25. DATA MINING TIME COMPARISON  The average data mining time of the K-means clustering-based method is only 0.6 s, whereas the average times of the other methods are about 4.2 s and 2.9 s.
  • 26. CONCLUSION  To improve the quality of network shared resource data mining, the K-means clustering-based mining method is applied. The accuracy of in-depth data mining of network shared resources with this method is always over 94%, and the average time of in-depth data mining is only 0.6 s, suggesting that this method can achieve fast and accurate in-depth data mining of network shared resources.  Yet a number of challenges remain to be resolved, including the deep mining of language and cross-cultural resource sharing, as well as the security, personalization, and intelligence of resource data mining.