Data Mining Classification: Alternative Techniques

Rule-Based Classifier
  • Classifies records by using a collection of “if…then…” rules
  • Rule: (Condition) → y, where
      - Condition is a conjunction of attribute tests
      - y is the class label
  • LHS: rule antecedent or condition
  • RHS: rule consequent
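
To make the rule notation concrete, here is a minimal sketch of how a rule set could be stored and applied. The attribute names, the example rules and the `default` label are all hypothetical and chosen only for illustration; the slides do not prescribe a representation.

```python
# Hypothetical rule representation: (conditions, class_label), where
# conditions maps an attribute name to the value it must take.
rules = [
    ({"gives_birth": "no", "can_fly": "yes"}, "bird"),
    ({"gives_birth": "yes", "blood_type": "warm"}, "mammal"),
]

def covers(conditions, record):
    """A rule covers a record when every conjunct in its antecedent holds."""
    return all(record.get(attr) == value for attr, value in conditions.items())

def classify(record, rules, default="unknown"):
    """Return the consequent of the first rule that covers the record."""
    for conditions, label in rules:
        if covers(conditions, record):
            return label
    return default  # no rule fired: the rule set is not exhaustive

hawk = {"gives_birth": "no", "can_fly": "yes", "blood_type": "warm"}
print(classify(hawk, rules))  # bird
```

The fallback to `default` is one possible way to handle records that no rule covers; an ordered rule list like this also resolves overlaps by taking the first match.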

Characteristics of Rule-Based Classifier
  • Mutually exclusive rules
      - A classifier contains mutually exclusive rules if the rules are independent of each other
      - Every record is covered by at most one rule
  • Exhaustive rules
      - A classifier has exhaustive coverage if it accounts for every possible combination of attribute values
      - Every record is covered by at least one rule

Building Classification Rules
  • Direct method: extract rules directly from data
      - e.g. RIPPER, CN2, Holte’s 1R
  • Indirect method: extract rules from other classification models (e.g. decision trees, neural networks)
      - e.g. C4.5rules

Direct Method: Sequential Covering
  1. Start from an empty rule
  2. Grow a rule using the Learn-One-Rule function
  3. Remove training records covered by the rule
  4. Repeat steps (2) and (3) until the stopping criterion is met

Aspects of Sequential Covering
  • Rule growing
  • Instance elimination
  • Rule evaluation
  • Stopping criterion
  • Rule pruning

Aspects of Sequential Covering (contd.)
  • Grow a single rule
  • Remove the instances covered by the rule
  • Prune the rule (if necessary)
  • Add the rule to the current rule set
  • Repeat
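
The sequential covering loop can be sketched as follows. This is a simplified illustration, not RIPPER or CN2: the `learn_one_rule` helper here greedily adds the attribute test with the highest accuracy, there is no pruning step, and the stopping criterion is simply "no records of the target class remain". The record layout (dictionaries with a `"label"` key) is an assumption.

```python
def covers(rule, record):
    """A rule is a dict of attribute -> required value (a conjunction)."""
    return all(record[a] == v for a, v in rule.items())

def learn_one_rule(records, target):
    """Greedily add the conjunct that maximizes accuracy for the target class."""
    rule, covered = {}, list(records)
    while any(r["label"] != target for r in covered):
        candidates = {(a, r[a]) for r in covered for a in r
                      if a != "label" and a not in rule}
        if not candidates:
            break  # nothing left to test; accept an impure rule

        def accuracy(cand):
            a, v = cand
            subset = [r for r in covered if r[a] == v]
            return sum(r["label"] == target for r in subset) / len(subset)

        a, v = max(sorted(candidates), key=accuracy)
        rule[a] = v
        covered = [r for r in covered if r[a] == v]
    return rule

def sequential_covering(records, target):
    rules, remaining = [], list(records)
    while any(r["label"] == target for r in remaining):
        rule = learn_one_rule(remaining, target)                   # grow a rule
        rules.append(rule)                                         # add to rule set
        remaining = [r for r in remaining if not covers(rule, r)]  # eliminate instances
    return rules

records = [
    {"outlook": "sunny", "windy": "no", "label": "play"},
    {"outlook": "sunny", "windy": "yes", "label": "stay"},
    {"outlook": "rainy", "windy": "no", "label": "play"},
    {"outlook": "rainy", "windy": "yes", "label": "stay"},
]
print(sequential_covering(records, "play"))  # [{'windy': 'no'}]
```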

Indirect Method: C4.5rules
  • Extract rules from an unpruned decision tree
  • For each rule r: A → y, consider an alternative rule r’: A’ → y, where A’ is obtained by removing one of the conjuncts in A
  • Compare the pessimistic error rate of r against that of every r’
  • Prune if one of the r’ has a lower pessimistic error rate
  • Repeat until the generalization error can no longer be improved

Indirect Method: C4.5rules (contd.)
  • Instead of ordering the rules, order subsets of rules (class ordering)
  • Each subset is a collection of rules with the same rule consequent (class)
  • Compute the description length of each subset
      - Description length = L(error) + g · L(model)
      - g is a parameter that takes into account the presence of redundant attributes in a rule set (default value = 0.5)

Advantages of Rule-Based Classifiers
  • As highly expressive as decision trees
  • Easy to interpret
  • Easy to generate
  • Can classify new instances rapidly
  • Performance comparable to decision trees

Nearest Neighbor Classifiers
  • Requires three things:
      - The set of stored records
      - A distance metric to compute the distance between records
      - The value of k, the number of nearest neighbors to retrieve
  • To classify an unknown record:
      - Compute its distance to the training records
      - Identify the k nearest neighbors
      - Use the class labels of the nearest neighbors to determine the class label of the unknown record (e.g., by taking a majority vote)

Definition of Nearest Neighbor
  • The k-nearest neighbors of a record x are the data points that have the k smallest distances to x
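
The three classification steps translate almost directly into code. The sketch below assumes numeric feature vectors, Euclidean distance and an unweighted majority vote; the toy training set is made up for illustration.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Distance metric between two numeric records."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(training, query, k=3):
    """training is a list of (feature_vector, label) pairs."""
    # Steps 1 and 2: compute distances and keep the k closest records.
    neighbors = sorted(training, key=lambda item: euclidean(item[0], query))[:k]
    # Step 3: majority vote over the neighbors' class labels.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

training = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
            ((5.0, 5.0), "B"), ((5.2, 4.9), "B")]
print(knn_classify(training, (1.1, 1.0), k=3))  # A
```

Sorting the whole training set on every query is the simplest option and also shows why classification is the expensive part of a lazy learner.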

Nearest Neighbor Classification…
  • Choosing the value of k:
      - If k is too small, the classifier is sensitive to noise points
      - If k is too large, the neighborhood may include points from other classes
  • Scaling issues
      - Attributes may have to be scaled to prevent distance measures from being dominated by one of the attributes
      - Example: the height of a person may vary from 1.5 m to 1.8 m, while the weight of a person may vary from 90 lb to 300 lb
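
One common way to address the scaling issue is min-max normalization, which maps each attribute to the range [0, 1] before distances are computed. This is a sketch of that one option (the slides do not name a specific scaling method), and it assumes the column is not constant.

```python
def min_max_scale(column):
    """Rescale a list of numbers to [0, 1]; assumes max(column) > min(column)."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]

heights_m = [1.5, 1.6, 1.8]
weights_lb = [90, 200, 300]

# Unscaled, a 0.3 m height difference is negligible next to a 210 lb weight
# difference; after scaling both attributes span the same range.
print(min_max_scale(heights_m))
print(min_max_scale(weights_lb))
```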

Nearest Neighbor Classification…
  • k-NN classifiers are lazy learners
      - They do not build models explicitly
      - Unlike eager learners such as decision tree induction and rule-based systems
  • Classifying unknown records is relatively expensive

Bayes Classifier
  • A probabilistic framework for solving classification problems
  • Conditional probability:
      - P(C | A) = P(A, C) / P(A)
      - P(A | C) = P(A, C) / P(C)
  • Bayes theorem:
      - P(C | A) = P(A | C) P(C) / P(A)

Example of Bayes Theorem
  • Given:
      - A doctor knows that meningitis causes stiff neck 50% of the time
      - Prior probability of any patient having meningitis is 1/50,000
      - Prior probability of any patient having stiff neck is 1/20
  • If a patient has stiff neck, what’s the probability he/she has meningitis?
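
The question can be answered by plugging the three given numbers into Bayes theorem, P(M | S) = P(S | M) P(M) / P(S). The short sketch below does the arithmetic with exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

p_s_given_m = Fraction(1, 2)   # P(stiff neck | meningitis) = 50%
p_m = Fraction(1, 50000)       # prior P(meningitis)
p_s = Fraction(1, 20)          # prior P(stiff neck)

# Bayes theorem: P(M | S) = P(S | M) * P(M) / P(S)
p_m_given_s = p_s_given_m * p_m / p_s
print(p_m_given_s, float(p_m_given_s))  # 1/5000 0.0002
```

So even with a stiff neck, the probability of meningitis is only 0.0002, because the prior is so small.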

Naïve Bayes Classifier
  • Assume independence among the attributes Ai when the class is given:
      - P(A1, A2, …, An | Cj) = P(A1 | Cj) P(A2 | Cj) … P(An | Cj)
  • Can estimate P(Ai | Cj) for all Ai and Cj
  • A new point is classified to Cj if P(Cj) ∏i P(Ai | Cj) is maximal
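
A minimal sketch of this decision rule for categorical attributes is shown below. It estimates P(Cj) and P(Ai | Cj) by simple counting with no smoothing; the function names and the toy weather data are assumptions made for illustration.

```python
from collections import Counter

def train_naive_bayes(records, labels):
    """Count class frequencies and (attribute index, value, class) frequencies."""
    class_counts = Counter(labels)
    cond_counts = Counter()
    for record, label in zip(records, labels):
        for i, value in enumerate(record):
            cond_counts[(i, value, label)] += 1
    return class_counts, cond_counts

def predict(record, class_counts, cond_counts):
    """Pick the class Cj that maximizes P(Cj) * prod_i P(Ai | Cj)."""
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, n_c in class_counts.items():
        score = n_c / total                                   # P(Cj)
        for i, value in enumerate(record):
            score *= cond_counts[(i, value, label)] / n_c     # P(Ai | Cj)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

records = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]
model = train_naive_bayes(records, labels)
print(predict(("rainy", "mild"), *model))  # yes
```

Note that the "no" class scores exactly zero here because "rainy" never occurs with it, which is the zero-probability problem that smoothed estimates are meant to avoid.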

Naïve Bayes Classifier
  • If one of the conditional probabilities is zero, then the entire expression becomes zero
  • Probability estimation:
      - c: number of classes
      - p: prior probability
      - m: parameter
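
The estimation formulas themselves are not present in the text of this slide. The legend (c, p, m) matches the commonly used Laplace and m-estimate corrections, so the sketch below assumes those two forms; `n_ic` (number of class-C records with attribute value Ai) and `n_c` (number of class-C records) are names introduced here for illustration.

```python
def laplace_estimate(n_ic, n_c, c):
    """Assumed Laplace form: (n_ic + 1) / (n_c + c), with c as in the slide's legend."""
    return (n_ic + 1) / (n_c + c)

def m_estimate(n_ic, n_c, m, p):
    """Assumed m-estimate form: (n_ic + m * p) / (n_c + m)."""
    return (n_ic + m * p) / (n_c + m)

# An attribute value never seen with the class no longer gets probability 0.
print(laplace_estimate(0, 3, c=2))     # 0.2
print(m_estimate(0, 3, m=2, p=0.5))    # 0.2
```

With either correction, a single unseen attribute value reduces the class score without wiping it out entirely.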

Naïve Bayes (Summary)
  • Robust to isolated noise points
  • Handles missing values by ignoring the instance during probability estimate calculations
  • Robust to irrelevant attributes
  • Independence assumption may not hold for some attributes
      - Use other techniques such as Bayesian Belief Networks (BBN)

Artificial Neural Networks (ANN)
  • Model is an assembly of inter-connected nodes and weighted links
  • Output node sums up each of its input values according to the weights of its links
  • Compare the output node against some threshold t
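
A single output node of this kind can be sketched in a few lines. The weights, the threshold and the choice of +1 / -1 as output values are illustrative assumptions; the slide only states that the weighted sum is compared against t.

```python
def perceptron_output(inputs, weights, t):
    """Weighted sum of the inputs compared against threshold t."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum > t else -1

weights = [0.3, 0.3, 0.3]
print(perceptron_output((1, 0, 1), weights, t=0.4))  # 1   (0.6 > 0.4)
print(perceptron_output((1, 0, 0), weights, t=0.4))  # -1  (0.3 <= 0.4)
```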

General Structure of ANN
  • Training an ANN means learning the weights of the neurons

Algorithm for Learning ANN
  • Initialize the weights (w0, w1, …, wk)
  • Adjust the weights in such a way that the output of the ANN is consistent with the class labels of the training examples
      - Objective function:
      - Find the weights wi that minimize the above objective function
      - e.g., backpropagation algorithm
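
The objective function is not present in the text of this slide. A common choice is the sum of squared errors between the targets and the network outputs, and the sketch below assumes it. It trains a single linear unit by gradient descent (the delta rule), which is a much simpler stand-in for backpropagation in a multi-layer network; the learning rate, epoch count and data are illustrative.

```python
def predict(weights, inputs):
    """Linear unit: w0 is the bias, w1..wk weight the inputs."""
    x = [1.0] + list(inputs)
    return sum(w * xi for w, xi in zip(weights, x))

def squared_error(samples, weights):
    """Assumed objective: sum over samples of (target - output)^2."""
    return sum((t - predict(weights, x)) ** 2 for x, t in samples)

def train_linear_unit(samples, learning_rate=0.1, epochs=500):
    k = len(samples[0][0])
    weights = [0.0] * (k + 1)              # initialize w0, w1, ..., wk
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, inputs)
            x = [1.0] + list(inputs)
            # Move each weight a small step down the squared-error gradient.
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
    return weights

samples = [((0.0,), -1.0), ((1.0,), 1.0)]
trained = train_linear_unit(samples)
print(squared_error(samples, [0.0, 0.0]), squared_error(samples, trained))
```

The error starts at 2.0 with all-zero weights and shrinks toward zero as the weights approach w0 = -1, w1 = 2.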

Ensemble Methods
  • Construct a set of classifiers from the training data
  • Predict the class label of previously unseen records by aggregating the predictions made by multiple classifiers

Classification Continued

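Why does it work?
  • Suppose there are 25 base classifiers
  • Each classifier has error rate ε = 0.35
  • Assume the classifiers are independent
  • Probability that the ensemble classifier makes a wrong prediction (a majority, i.e. at least 13 of the 25, must be wrong):
      - sum over i = 13, …, 25 of C(25, i) ε^i (1 − ε)^(25 − i) ≈ 0.06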
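
The ensemble error can be checked numerically. The sketch below assumes a majority vote over independent base classifiers, so the ensemble is wrong exactly when more than half of them are wrong; the function name is illustrative.

```python
from math import comb

def ensemble_error(n_classifiers, eps):
    """P(majority of n independent classifiers, each with error eps, are wrong)."""
    majority = n_classifiers // 2 + 1   # 13 of 25
    return sum(comb(n_classifiers, i) * eps**i * (1 - eps)**(n_classifiers - i)
               for i in range(majority, n_classifiers + 1))

print(round(ensemble_error(25, 0.35), 2))  # 0.06
```

The ensemble error of about 0.06 is far below the 0.35 of each base classifier, but only because the classifiers are assumed independent and better than random guessing.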

Examples of Ensemble Methods
  • How to generate an ensemble of classifiers?
      - Bagging
      - Boosting

Bagging
  • Sampling with replacement
  • Build a classifier on each bootstrap sample
  • Each record has probability 1 − (1 − 1/n)^n of being selected for a given bootstrap sample of size n (about 0.632 for large n)
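
A bootstrap sample and the selection probability can be sketched as follows. The sample size of 1000 and the fixed random seed are arbitrary choices for illustration.

```python
import random

def bootstrap_sample(records, rng):
    """Draw len(records) records uniformly at random, with replacement."""
    return [rng.choice(records) for _ in records]

def inclusion_probability(n):
    """Probability that a given record appears at least once in a sample of size n."""
    return 1 - (1 - 1 / n) ** n

rng = random.Random(0)
records = list(range(1000))
sample = bootstrap_sample(records, rng)

# The observed fraction of distinct records should be close to the theoretical value.
print(len(set(sample)) / len(records))
print(inclusion_probability(1000))
```

Roughly 63% of the records end up in each bootstrap sample, so every base classifier sees a somewhat different training set.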

Boosting
  • An iterative procedure that adaptively changes the distribution of the training data by focusing more on previously misclassified records
  • Initially, all N records are assigned equal weights
  • Unlike bagging, the weights may change at the end of each boosting round
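
The slide does not say how the weights change. As one concrete possibility, the sketch below uses the AdaBoost-style update, which is an assumption brought in for illustration: misclassified records are scaled up, correctly classified records are scaled down, and the weights are renormalized. It assumes the weighted error is strictly between 0 and 1.

```python
import math

def update_weights(weights, misclassified):
    """One AdaBoost-style round; misclassified[i] is True if record i was wrong."""
    error = sum(w for w, wrong in zip(weights, misclassified) if wrong)
    alpha = 0.5 * math.log((1 - error) / error)   # importance of this round's classifier
    updated = [w * math.exp(alpha if wrong else -alpha)
               for w, wrong in zip(weights, misclassified)]
    total = sum(updated)
    return [w / total for w in updated]           # renormalize to sum to 1

n = 4
weights = [1 / n] * n                             # all N records start with equal weight
new_weights = update_weights(weights, [True, False, False, False])
print(new_weights)
```

After the update the single misclassified record carries half of the total weight, so the next round concentrates on it.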

Visit more self-help tutorials
  • Pick a tutorial of your choice and browse through it at your own pace.
  • The tutorials section is free, self-guiding and will not involve any additional support.
  • Visit us at www.dataminingtools.net