SlideShare a Scribd company logo
Chapter 8 Covering (Rules-based) Algorithm Data Mining Technology
Chapter 8 Covering (Rules-based) Algorithm Written by Shakhina Pulatova  Presented by Zhao Xinyou [email_address] 2007.11.13 Data Mining Technology Some materials (Examples) are taken from Website.
Contents What is the Covering (Rule-based) algorithm? Classification Rules- Straightforward 1. If-Then rule 2. Generating rules from Decision Tree Rule-based Algorithm 1. The 1R Algorithm / Learn One Rule 2. The PRISM Algorithm 3. Other Algorithm Application of Covering algorithm Discussion on e/m-learning application
Introduction-App-1 PP87-88 Training Data Attributes Record Rules Rules given by people Rules generated by computer Setting 1.(1.75, 0)  short 2. [1.75, 1.95) Medium 3. [1.95, ..) tall
Introduction-App-2 PP87-88 How to get all tall people from B based on A A B + Training Data
What is Rule-based Algorithm? Definition : Each classification method uses an algorithm to generate rules from the sample data. These rules are then applied to new data. Rule-based algorithm  provide mechanisms that generate rules by  1. concentrating on a specific class at a time 2. maximizing the probability of the desired classification. PP87-88 Should be compact, easy-to-interpret, and accurate.
Classification Rules- Straightforward If-Then rule Generating rules from Decision Tree PP88-89
formal Specification of Rule-based Algorithm The classification  r ules, r=<a, c>, consists of : a  ( a ntecedent/precondition): a series of tests that be valuated as  true  or  false ; c  ( c onsequent/conclusion): the class or classes that apply to instances covered by rule r. PP88 a=0,b=0 a=0,b=1 a=1,b=0 a=1,b=1 a = x y c = a=0 b=0 b=0 yes no X X Y Y no no yes yes
Remarks of Straightforward classification The  a ntecedent contains a predicate that can be valuated as true or false against each tuple in database. These rules relate directly to corresponding decision tree (DT) that could be created. A DT can always be used to generate rules, but they are not equivalent. Differences: -the tree has a implied order in which the splitting is performed; rules have no order. -a tree is created based on looking at all classes; only one class must be examined at a time. PP88-89
If-Then rule Straightforward way to perform classification is to generate if-then rules that cover all cases. 1 PP88
Generating rules from Decision Tree -1-Con’ Decision Tree 2
Generating rules from Decision Tree -2-Con’ y n a b c d x y y
Generating rules from Decision Tree -3-Con’
Remarks Rules may be more complex and incomprehensible from DT. A new test or rules need reshaping the whole tree Rules obtained without decision trees are more compact and accurate. So many other covering algorithms have been proposed. PP89-90 a b x y y c d x y y n n n n c d x y y n n c d x y y n n c d x y y n n duplicate subtrees a=0 b=0 b=0 yes no X X Y Y no no yes yes a=1 and c=0  Y
Rule-based Classification Generate rules The 1R Algorithm / Learn One Rule The PRISM Algorithm Other Algorithm PP90
Generating rules without Decision Trees-1-con’ Goal: find rules that identify the instances of a specific class Generate the “best” rule possible by optimizing the desired classification probability Usually, the “best” attribute-pair is chosen Remark -these technologies are also called covering algorithms because they attempt to generate rules which exactly  cover  a specific class.
Generate Rules-Example-2-Con' Example 3 Question: We want to generate a rule to classify persons as tall. Basic format of the rule: if ? then class = tall Goal: replace “?” with predicates that can be used to obtain the “best” probability of being tall PP90
Generate Rules-Algorithms-3-Con' 1.Generate rule R on training data S; 2.Remove the training data covered by rule R; 3. Repeat the process. PP90
Generate Rules-Example-4-Con' Sequential Covering (I) Original data (ii) Step 1 r = NULL (iii) Step 2 R1 r = R1 (iii) Step 3 R1 R2 r = R1  U R2 (iii) Step 4 R1 R2 R3 r = R1  U R2  U R3 Wrong Class
1R Algorithm/ Learn One Rule-Con’  Simple and cheap method it only generates a one level decision tree. Classify an object on the basis of a single attribute. Idea: Rules will be constructed to test a single attribute and branch for every value of that attribute. For each branch, the class with the test classification is the one occurring  PP91
1R Algorithm/ Learn One Rule-Con’  Idea : 1. Rules will be constructed to test a single attribute and branch for every value of that attribute.  Step   2. For each branch, the class with the test classification is the one occurring. 3. Find one biggest number as rules 4. Error rate will be evaluated. 5. The minimum error rate will be chosen.  PP91 M->T  Error=5 F->M  Error=3 Total  Error=8 Total  Error=3 Total  Error=.. A2 An Gender F 2 5 1 S M T M 1 4 10 S M T
1R Algorithm Input: D   //Training Data T   //Attributes to consider for rules   C   //Classes Output : R   //Rules ALgorithm : R=Φ; for all A in T do R A =Φ; for all possbile value, v, of A do for all C j ∈C do find count(C j ) end for let C m  be the class with the largest count; R A =R A ((A=v) ->(class= C m )); end for ERR A =number of tuples incorrectly classified by R A ; e nd for R=R A  where ERR A  is minimum T={Gender, Height} D C={{F, M},  {0, ∞}} C1 C2 Training Data Gender F M Short Medium Tall 3 6 0 Short Medium Tall 1 2 3 R1=F->medium R2=M->tall Height
Example 5 – 1R-3-Con’ Rules  based on  height … ... … 0/2 0/2 0/3 0/4 1/2 0/2 3/9 3/6 Error 1/15 (0  , 1.6]-> short (1.6, 1.7]->short (1.7, 1.8]-> medium (1.8, 1.9]-> medium (1.9, 2.0]-> medium (2.0,  ∞ ]-> tall Height (Step=0.1) 2 6/15 F->medium M->tall Gender 1 Total Error Rules Attribute Option
Example 6 -1R PP92-93 5/14 2/8 3/6 False->yes True->no windy 4 4/14 3/7 1/7 High->no Normal->yes humidity 3 2/4 2/6 1/4 2/5 0/4 2/5 Error 5/14 Hot->no Mild->yes Cool->yes temperature 2 4/14 Sunny->no Overcast->yes Rainy->yes outlook 1 Total Error Rules Attribute Rules  based on humidity  OR High->no Normal->yes Rules  based on outlook Sunny->no Overcast->yes Rainy->yes
PRISM Algorithm-Con’ PRISM generate rules for each class by looking at the training data and adding rules that completely describe all tuples in that class. Generates only correct or perfect rules: the accuracy of so-constructed PRISM is 100%. Measures the success of a rule by a p/t, where  -p is number of positive instance,  -T is total number of instance covered by the rule. Gender=Male  P=10, T=10 Gender=Female  P=1 T=8  R=Gender = Male …… A2 An Gender F 2 5 1 S M T M 0 0 10 S M T
PRISM Algorithm Step Input  D  and  C  (Attribute -> Value) 1.Compute all class P/T  (Attribute->Value) 2. Find one or more pair of  (Attribute->Value)   P/T = 100% 3. Select  (Attribute->Value)  as  Rule 4. Repeat 1-3 until no data in  D Input: D   //Training Data C   //Classes Output: R //Rules
Example 8-Con’-which class may be tall? Compute the value  p / t Which one is 100% PP94-95 0/9 Gender = F 1 2/2 2.0< Height 8 ½ 1.9< Height  ≤ 2.0 7 0/4 1.8< Height  ≤ 1.9 6 0/3 1.7< Height  ≤ 1.8 5 0/2 1.6< Height  ≤ 1.7 4 0/2 Height  ≤ 1.6 3 3/6 Gender = M 2 p / t (Attribute, value) Num R1  = 2.0< Height
R2  = 1.95< Height ≤ 2.0 R = R1 U R2 PP94-96 … … … 1/1 1.95< Height  ≤ 2.0 0/1 1.9< Height  ≤ 1.95 p / t (Attribute, value) Num
Example 9-Con’-which days may play? The predicate  outlook=overcast   correctly implies  play=yes  on all four rows R1 =if outlook=overcast, then play=yes Compute the value  p / t
Example 8-Con’ R2= if humidity=normal and windy=false, then play=yes
Example 8-Con’ R3 =….. R = R1 U R2 U R3 U…
Application of Covering Algorithm To derive classification rules applied for diagnosing illness, business planning, banking, government. Machine learning Text classification. But to photos, it is difficult… And so on.
Application on E-learning/M-learning Adaptive and personalized learning materials Virtual Group Classification Initial Learner’s information Classification of learning styles or some Provide adaptive and personalized materials Collect learning styles feedback Chapter 2 or 3 Similarity, Bayesian… Rule-based algorithm
Discussion
Ad

More Related Content

What's hot (20)

States, state graphs and transition testing
States, state graphs and transition testingStates, state graphs and transition testing
States, state graphs and transition testing
ABHISHEK KUMAR
 
Peephole optimization techniques in compiler design
Peephole optimization techniques in compiler designPeephole optimization techniques in compiler design
Peephole optimization techniques in compiler design
Anul Chaudhary
 
Bayesian learning
Bayesian learningBayesian learning
Bayesian learning
Vignesh Saravanan
 
5.2 mining time series data
5.2 mining time series data5.2 mining time series data
5.2 mining time series data
Krish_ver2
 
Rule based system
Rule based systemRule based system
Rule based system
Dr. C.V. Suresh Babu
 
Machine learning Lecture 2
Machine learning Lecture 2Machine learning Lecture 2
Machine learning Lecture 2
Srinivasan R
 
I. AO* SEARCH ALGORITHM
I. AO* SEARCH ALGORITHMI. AO* SEARCH ALGORITHM
I. AO* SEARCH ALGORITHM
vikas dhakane
 
Concurrency control
Concurrency controlConcurrency control
Concurrency control
Soumyajit Dutta
 
Random forest
Random forestRandom forest
Random forest
Ujjawal
 
Concurrency Control in Database Management System
Concurrency Control in Database Management SystemConcurrency Control in Database Management System
Concurrency Control in Database Management System
Janki Shah
 
1.8 discretization
1.8 discretization1.8 discretization
1.8 discretization
Krish_ver2
 
Association rule mining.pptx
Association rule mining.pptxAssociation rule mining.pptx
Association rule mining.pptx
maha797959
 
Syntax Directed Definition and its applications
Syntax Directed Definition and its applicationsSyntax Directed Definition and its applications
Syntax Directed Definition and its applications
ShivanandManjaragi2
 
States, state graphs and transition testing
States, state graphs and transition testingStates, state graphs and transition testing
States, state graphs and transition testing
geethawilliam
 
Ensemble learning
Ensemble learningEnsemble learning
Ensemble learning
Haris Jamil
 
Text Classification
Text ClassificationText Classification
Text Classification
RAX Automation Suite
 
Problems, Problem spaces and Search
Problems, Problem spaces and SearchProblems, Problem spaces and Search
Problems, Problem spaces and Search
BMS Institute of Technology and Management
 
Coda file system
Coda file systemCoda file system
Coda file system
Sneh Pahilwani
 
Mycin
MycinMycin
Mycin
vini89
 
Machine Learning project presentation
Machine Learning project presentationMachine Learning project presentation
Machine Learning project presentation
Ramandeep Kaur Bagri
 
States, state graphs and transition testing
States, state graphs and transition testingStates, state graphs and transition testing
States, state graphs and transition testing
ABHISHEK KUMAR
 
Peephole optimization techniques in compiler design
Peephole optimization techniques in compiler designPeephole optimization techniques in compiler design
Peephole optimization techniques in compiler design
Anul Chaudhary
 
5.2 mining time series data
5.2 mining time series data5.2 mining time series data
5.2 mining time series data
Krish_ver2
 
Machine learning Lecture 2
Machine learning Lecture 2Machine learning Lecture 2
Machine learning Lecture 2
Srinivasan R
 
I. AO* SEARCH ALGORITHM
I. AO* SEARCH ALGORITHMI. AO* SEARCH ALGORITHM
I. AO* SEARCH ALGORITHM
vikas dhakane
 
Random forest
Random forestRandom forest
Random forest
Ujjawal
 
Concurrency Control in Database Management System
Concurrency Control in Database Management SystemConcurrency Control in Database Management System
Concurrency Control in Database Management System
Janki Shah
 
1.8 discretization
1.8 discretization1.8 discretization
1.8 discretization
Krish_ver2
 
Association rule mining.pptx
Association rule mining.pptxAssociation rule mining.pptx
Association rule mining.pptx
maha797959
 
Syntax Directed Definition and its applications
Syntax Directed Definition and its applicationsSyntax Directed Definition and its applications
Syntax Directed Definition and its applications
ShivanandManjaragi2
 
States, state graphs and transition testing
States, state graphs and transition testingStates, state graphs and transition testing
States, state graphs and transition testing
geethawilliam
 
Ensemble learning
Ensemble learningEnsemble learning
Ensemble learning
Haris Jamil
 
Machine Learning project presentation
Machine Learning project presentationMachine Learning project presentation
Machine Learning project presentation
Ramandeep Kaur Bagri
 

Viewers also liked (20)

2.4 rule based classification
2.4 rule based classification2.4 rule based classification
2.4 rule based classification
Krish_ver2
 
Machine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification RulesMachine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification Rules
Pier Luca Lanzi
 
rule-based classifier
rule-based classifierrule-based classifier
rule-based classifier
Sean Chiu
 
DATA MINING on WEKA
DATA MINING on WEKADATA MINING on WEKA
DATA MINING on WEKA
satyamkhatri
 
Data mining
Data miningData mining
Data mining
Monsur Ahmed Shafiq
 
Functional Leap of Faith (Keynote at JDay Lviv 2014)
Functional Leap of Faith (Keynote at JDay Lviv 2014)Functional Leap of Faith (Keynote at JDay Lviv 2014)
Functional Leap of Faith (Keynote at JDay Lviv 2014)
Tomer Gabel
 
C++ TUTORIAL 8
C++ TUTORIAL 8C++ TUTORIAL 8
C++ TUTORIAL 8
Farhan Ab Rahman
 
Ch5 alternative classification
Ch5 alternative classificationCh5 alternative classification
Ch5 alternative classification
dadaoxing
 
Randomized Algorithms in Linear Algebra & the Column Subset Selection Problem
Randomized Algorithms in Linear Algebra & the Column Subset Selection ProblemRandomized Algorithms in Linear Algebra & the Column Subset Selection Problem
Randomized Algorithms in Linear Algebra & the Column Subset Selection Problem
Wei Xue
 
Dynamic programming
Dynamic programmingDynamic programming
Dynamic programming
Yıldırım Tam
 
Chap08alg
Chap08algChap08alg
Chap08alg
Munkhchimeg
 
Solving The Shortest Path Tour Problem
Solving The Shortest Path Tour ProblemSolving The Shortest Path Tour Problem
Solving The Shortest Path Tour Problem
Nozir Shokirov
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsData Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlations
Datamining Tools
 
Data mining notes
Data mining notesData mining notes
Data mining notes
AVC College of Engineering
 
21 backtracking
21 backtracking21 backtracking
21 backtracking
Aparup Behera
 
DP
DPDP
DP
Subba Oota
 
5.5 back track
5.5 back track5.5 back track
5.5 back track
Krish_ver2
 
Subset sum problem Dynamic and Brute Force Approch
Subset sum problem Dynamic and Brute Force ApprochSubset sum problem Dynamic and Brute Force Approch
Subset sum problem Dynamic and Brute Force Approch
Ijlal Ijlal
 
Dynamic programming in Algorithm Analysis
Dynamic programming in Algorithm AnalysisDynamic programming in Algorithm Analysis
Dynamic programming in Algorithm Analysis
Rajendran
 
2.4 rule based classification
2.4 rule based classification2.4 rule based classification
2.4 rule based classification
Krish_ver2
 
Machine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification RulesMachine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification Rules
Pier Luca Lanzi
 
rule-based classifier
rule-based classifierrule-based classifier
rule-based classifier
Sean Chiu
 
DATA MINING on WEKA
DATA MINING on WEKADATA MINING on WEKA
DATA MINING on WEKA
satyamkhatri
 
Functional Leap of Faith (Keynote at JDay Lviv 2014)
Functional Leap of Faith (Keynote at JDay Lviv 2014)Functional Leap of Faith (Keynote at JDay Lviv 2014)
Functional Leap of Faith (Keynote at JDay Lviv 2014)
Tomer Gabel
 
Ch5 alternative classification
Ch5 alternative classificationCh5 alternative classification
Ch5 alternative classification
dadaoxing
 
Randomized Algorithms in Linear Algebra & the Column Subset Selection Problem
Randomized Algorithms in Linear Algebra & the Column Subset Selection ProblemRandomized Algorithms in Linear Algebra & the Column Subset Selection Problem
Randomized Algorithms in Linear Algebra & the Column Subset Selection Problem
Wei Xue
 
Solving The Shortest Path Tour Problem
Solving The Shortest Path Tour ProblemSolving The Shortest Path Tour Problem
Solving The Shortest Path Tour Problem
Nozir Shokirov
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsData Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlations
Datamining Tools
 
5.5 back track
5.5 back track5.5 back track
5.5 back track
Krish_ver2
 
Subset sum problem Dynamic and Brute Force Approch
Subset sum problem Dynamic and Brute Force ApprochSubset sum problem Dynamic and Brute Force Approch
Subset sum problem Dynamic and Brute Force Approch
Ijlal Ijlal
 
Dynamic programming in Algorithm Analysis
Dynamic programming in Algorithm AnalysisDynamic programming in Algorithm Analysis
Dynamic programming in Algorithm Analysis
Rajendran
 
Ad

Similar to Covering (Rules-based) Algorithm (20)

DATA STRUCTURE.pdf
DATA STRUCTURE.pdfDATA STRUCTURE.pdf
DATA STRUCTURE.pdf
ibrahim386946
 
DATA STRUCTURE
DATA STRUCTUREDATA STRUCTURE
DATA STRUCTURE
RobinRohit2
 
Design and analysis of algorithm in Computer Science
Design and analysis of algorithm in Computer ScienceDesign and analysis of algorithm in Computer Science
Design and analysis of algorithm in Computer Science
secularistpartyofind
 
Predictive analytics using 'R' Programming
Predictive analytics using 'R' ProgrammingPredictive analytics using 'R' Programming
Predictive analytics using 'R' Programming
ssusere796b3
 
White boxvsblackbox
White boxvsblackboxWhite boxvsblackbox
White boxvsblackbox
sanerjjd
 
Daa unit 1
Daa unit 1Daa unit 1
Daa unit 1
jinalgoti
 
Droolsand Rule Based Systems 2008 Srping
Droolsand Rule Based Systems 2008 SrpingDroolsand Rule Based Systems 2008 Srping
Droolsand Rule Based Systems 2008 Srping
Srinath Perera
 
decison tree and rules in data mining techniques
decison tree and rules in data mining techniquesdecison tree and rules in data mining techniques
decison tree and rules in data mining techniques
ALIZAIB KHAN
 
Module-1.pptxbdjdhcdbejdjhdbchchchchchjcjcjc
Module-1.pptxbdjdhcdbejdjhdbchchchchchjcjcjcModule-1.pptxbdjdhcdbejdjhdbchchchchchjcjcjc
Module-1.pptxbdjdhcdbejdjhdbchchchchchjcjcjc
shashashashashank
 
Ch02 primitive-data-definite-loops
Ch02 primitive-data-definite-loopsCh02 primitive-data-definite-loops
Ch02 primitive-data-definite-loops
James Brotsos
 
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Universitat Politècnica de Catalunya
 
UNIT-1-PPTS-DAA.ppt
UNIT-1-PPTS-DAA.pptUNIT-1-PPTS-DAA.ppt
UNIT-1-PPTS-DAA.ppt
racha49
 
UNIT-1-PPTS-DAA.ppt
UNIT-1-PPTS-DAA.pptUNIT-1-PPTS-DAA.ppt
UNIT-1-PPTS-DAA.ppt
SamridhiGulati4
 
Introduction to Design Algorithm And Analysis.ppt
Introduction to Design Algorithm And Analysis.pptIntroduction to Design Algorithm And Analysis.ppt
Introduction to Design Algorithm And Analysis.ppt
BhargaviDalal4
 
Learning
LearningLearning
Learning
butest
 
A Fast Decision Rule Engine for Anomaly Detection
A Fast Decision Rule Engine for Anomaly DetectionA Fast Decision Rule Engine for Anomaly Detection
A Fast Decision Rule Engine for Anomaly Detection
Databricks
 
machine _learning_introductionand python.pptx
machine _learning_introductionand python.pptxmachine _learning_introductionand python.pptx
machine _learning_introductionand python.pptx
ChandrakalaV15
 
Lec 1 Ds
Lec 1 DsLec 1 Ds
Lec 1 Ds
Qundeel
 
Data Structure
Data StructureData Structure
Data Structure
sheraz1
 
Lec 1 Ds
Lec 1 DsLec 1 Ds
Lec 1 Ds
Qundeel
 
Design and analysis of algorithm in Computer Science
Design and analysis of algorithm in Computer ScienceDesign and analysis of algorithm in Computer Science
Design and analysis of algorithm in Computer Science
secularistpartyofind
 
Predictive analytics using 'R' Programming
Predictive analytics using 'R' ProgrammingPredictive analytics using 'R' Programming
Predictive analytics using 'R' Programming
ssusere796b3
 
White boxvsblackbox
White boxvsblackboxWhite boxvsblackbox
White boxvsblackbox
sanerjjd
 
Droolsand Rule Based Systems 2008 Srping
Droolsand Rule Based Systems 2008 SrpingDroolsand Rule Based Systems 2008 Srping
Droolsand Rule Based Systems 2008 Srping
Srinath Perera
 
decison tree and rules in data mining techniques
decison tree and rules in data mining techniquesdecison tree and rules in data mining techniques
decison tree and rules in data mining techniques
ALIZAIB KHAN
 
Module-1.pptxbdjdhcdbejdjhdbchchchchchjcjcjc
Module-1.pptxbdjdhcdbejdjhdbchchchchchjcjcjcModule-1.pptxbdjdhcdbejdjhdbchchchchchjcjcjc
Module-1.pptxbdjdhcdbejdjhdbchchchchchjcjcjc
shashashashashank
 
Ch02 primitive-data-definite-loops
Ch02 primitive-data-definite-loopsCh02 primitive-data-definite-loops
Ch02 primitive-data-definite-loops
James Brotsos
 
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Universitat Politècnica de Catalunya
 
UNIT-1-PPTS-DAA.ppt
UNIT-1-PPTS-DAA.pptUNIT-1-PPTS-DAA.ppt
UNIT-1-PPTS-DAA.ppt
racha49
 
Introduction to Design Algorithm And Analysis.ppt
Introduction to Design Algorithm And Analysis.pptIntroduction to Design Algorithm And Analysis.ppt
Introduction to Design Algorithm And Analysis.ppt
BhargaviDalal4
 
Learning
LearningLearning
Learning
butest
 
A Fast Decision Rule Engine for Anomaly Detection
A Fast Decision Rule Engine for Anomaly DetectionA Fast Decision Rule Engine for Anomaly Detection
A Fast Decision Rule Engine for Anomaly Detection
Databricks
 
machine _learning_introductionand python.pptx
machine _learning_introductionand python.pptxmachine _learning_introductionand python.pptx
machine _learning_introductionand python.pptx
ChandrakalaV15
 
Lec 1 Ds
Lec 1 DsLec 1 Ds
Lec 1 Ds
Qundeel
 
Data Structure
Data StructureData Structure
Data Structure
sheraz1
 
Lec 1 Ds
Lec 1 DsLec 1 Ds
Lec 1 Ds
Qundeel
 
Ad

More from ZHAO Sam (8)

Solr installation
Solr installationSolr installation
Solr installation
ZHAO Sam
 
Special issue on Technology Enhanced Learning
Special issue on Technology Enhanced LearningSpecial issue on Technology Enhanced Learning
Special issue on Technology Enhanced Learning
ZHAO Sam
 
国際会議推薦システムAcademic Conference Publishing System
国際会議推薦システムAcademic Conference Publishing System国際会議推薦システムAcademic Conference Publishing System
国際会議推薦システムAcademic Conference Publishing System
ZHAO Sam
 
祝大家新年快樂
祝大家新年快樂祝大家新年快樂
祝大家新年快樂
ZHAO Sam
 
Ubiquitous
UbiquitousUbiquitous
Ubiquitous
ZHAO Sam
 
Clustering: Large Databases in data mining
Clustering: Large Databases in data miningClustering: Large Databases in data mining
Clustering: Large Databases in data mining
ZHAO Sam
 
similarity measure
similarity measure similarity measure
similarity measure
ZHAO Sam
 
A Real-Time Interactive Shared System for Distance Learning
A Real-Time Interactive Shared System for Distance LearningA Real-Time Interactive Shared System for Distance Learning
A Real-Time Interactive Shared System for Distance Learning
ZHAO Sam
 
Solr installation
Solr installationSolr installation
Solr installation
ZHAO Sam
 
Special issue on Technology Enhanced Learning
Special issue on Technology Enhanced LearningSpecial issue on Technology Enhanced Learning
Special issue on Technology Enhanced Learning
ZHAO Sam
 
国際会議推薦システムAcademic Conference Publishing System
国際会議推薦システムAcademic Conference Publishing System国際会議推薦システムAcademic Conference Publishing System
国際会議推薦システムAcademic Conference Publishing System
ZHAO Sam
 
祝大家新年快樂
祝大家新年快樂祝大家新年快樂
祝大家新年快樂
ZHAO Sam
 
Ubiquitous
UbiquitousUbiquitous
Ubiquitous
ZHAO Sam
 
Clustering: Large Databases in data mining
Clustering: Large Databases in data miningClustering: Large Databases in data mining
Clustering: Large Databases in data mining
ZHAO Sam
 
similarity measure
similarity measure similarity measure
similarity measure
ZHAO Sam
 
A Real-Time Interactive Shared System for Distance Learning
A Real-Time Interactive Shared System for Distance LearningA Real-Time Interactive Shared System for Distance Learning
A Real-Time Interactive Shared System for Distance Learning
ZHAO Sam
 

Recently uploaded (20)

Com fer un pla de gestió de dades amb l'eiNa DMP (en anglès)
Com fer un pla de gestió de dades amb l'eiNa DMP (en anglès)Com fer un pla de gestió de dades amb l'eiNa DMP (en anglès)
Com fer un pla de gestió de dades amb l'eiNa DMP (en anglès)
CSUC - Consorci de Serveis Universitaris de Catalunya
 
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Christian Folini
 
Cybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and MitigationCybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and Mitigation
VICTOR MAESTRE RAMIREZ
 
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Raffi Khatchadourian
 
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz
 
Smart Investments Leveraging Agentic AI for Real Estate Success.pptx
Smart Investments Leveraging Agentic AI for Real Estate Success.pptxSmart Investments Leveraging Agentic AI for Real Estate Success.pptx
Smart Investments Leveraging Agentic AI for Real Estate Success.pptx
Seasia Infotech
 
Mastering Testing in the Modern F&B Landscape
Mastering Testing in the Modern F&B LandscapeMastering Testing in the Modern F&B Landscape
Mastering Testing in the Modern F&B Landscape
marketing943205
 
AI Agents at Work: UiPath, Maestro & the Future of Documents
AI Agents at Work: UiPath, Maestro & the Future of DocumentsAI Agents at Work: UiPath, Maestro & the Future of Documents
AI Agents at Work: UiPath, Maestro & the Future of Documents
UiPathCommunity
 
IT484 Cyber Forensics_Information Technology
IT484 Cyber Forensics_Information TechnologyIT484 Cyber Forensics_Information Technology
IT484 Cyber Forensics_Information Technology
SHEHABALYAMANI
 
fennec fox optimization algorithm for optimal solution
fennec fox optimization algorithm for optimal solutionfennec fox optimization algorithm for optimal solution
fennec fox optimization algorithm for optimal solution
shallal2
 
Developing System Infrastructure Design Plan.pptx
Developing System Infrastructure Design Plan.pptxDeveloping System Infrastructure Design Plan.pptx
Developing System Infrastructure Design Plan.pptx
wondimagegndesta
 
An Overview of Salesforce Health Cloud & How is it Transforming Patient Care
An Overview of Salesforce Health Cloud & How is it Transforming Patient CareAn Overview of Salesforce Health Cloud & How is it Transforming Patient Care
An Overview of Salesforce Health Cloud & How is it Transforming Patient Care
Cyntexa
 
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptxTop 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
mkubeusa
 
Design pattern talk by Kaya Weers - 2025 (v2)
Design pattern talk by Kaya Weers - 2025 (v2)Design pattern talk by Kaya Weers - 2025 (v2)
Design pattern talk by Kaya Weers - 2025 (v2)
Kaya Weers
 
On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...
On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...
On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...
Ivano Malavolta
 
Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Everything You Need to Know About Agentforce? (Put AI Agents to Work)Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Cyntexa
 
UiPath Automation Suite – Cas d'usage d'une NGO internationale basée à Genève
UiPath Automation Suite – Cas d'usage d'une NGO internationale basée à GenèveUiPath Automation Suite – Cas d'usage d'une NGO internationale basée à Genève
UiPath Automation Suite – Cas d'usage d'une NGO internationale basée à Genève
UiPathCommunity
 
AsyncAPI v3 : Streamlining Event-Driven API Design
AsyncAPI v3 : Streamlining Event-Driven API DesignAsyncAPI v3 : Streamlining Event-Driven API Design
AsyncAPI v3 : Streamlining Event-Driven API Design
leonid54
 
The No-Code Way to Build a Marketing Team with One AI Agent (Download the n8n...
The No-Code Way to Build a Marketing Team with One AI Agent (Download the n8n...The No-Code Way to Build a Marketing Team with One AI Agent (Download the n8n...
The No-Code Way to Build a Marketing Team with One AI Agent (Download the n8n...
SOFTTECHHUB
 
AI x Accessibility UXPA by Stew Smith and Olivier Vroom
AI x Accessibility UXPA by Stew Smith and Olivier VroomAI x Accessibility UXPA by Stew Smith and Olivier Vroom
AI x Accessibility UXPA by Stew Smith and Olivier Vroom
UXPA Boston
 
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Christian Folini
 
Cybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and MitigationCybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and Mitigation
VICTOR MAESTRE RAMIREZ
 
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Raffi Khatchadourian
 
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz
 
Smart Investments Leveraging Agentic AI for Real Estate Success.pptx
Smart Investments Leveraging Agentic AI for Real Estate Success.pptxSmart Investments Leveraging Agentic AI for Real Estate Success.pptx
Smart Investments Leveraging Agentic AI for Real Estate Success.pptx
Seasia Infotech
 
Mastering Testing in the Modern F&B Landscape
Mastering Testing in the Modern F&B LandscapeMastering Testing in the Modern F&B Landscape
Mastering Testing in the Modern F&B Landscape
marketing943205
 
AI Agents at Work: UiPath, Maestro & the Future of Documents
AI Agents at Work: UiPath, Maestro & the Future of DocumentsAI Agents at Work: UiPath, Maestro & the Future of Documents
AI Agents at Work: UiPath, Maestro & the Future of Documents
UiPathCommunity
 
IT484 Cyber Forensics_Information Technology
IT484 Cyber Forensics_Information TechnologyIT484 Cyber Forensics_Information Technology
IT484 Cyber Forensics_Information Technology
SHEHABALYAMANI
 
fennec fox optimization algorithm for optimal solution
fennec fox optimization algorithm for optimal solutionfennec fox optimization algorithm for optimal solution
fennec fox optimization algorithm for optimal solution
shallal2
 
Developing System Infrastructure Design Plan.pptx
Developing System Infrastructure Design Plan.pptxDeveloping System Infrastructure Design Plan.pptx
Developing System Infrastructure Design Plan.pptx
wondimagegndesta
 
An Overview of Salesforce Health Cloud & How is it Transforming Patient Care
An Overview of Salesforce Health Cloud & How is it Transforming Patient CareAn Overview of Salesforce Health Cloud & How is it Transforming Patient Care
An Overview of Salesforce Health Cloud & How is it Transforming Patient Care
Cyntexa
 
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptxTop 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
mkubeusa
 
Design pattern talk by Kaya Weers - 2025 (v2)
Design pattern talk by Kaya Weers - 2025 (v2)Design pattern talk by Kaya Weers - 2025 (v2)
Design pattern talk by Kaya Weers - 2025 (v2)
Kaya Weers
 
On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...
On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...
On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...
Ivano Malavolta
 
Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Everything You Need to Know About Agentforce? (Put AI Agents to Work)Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Cyntexa
 
UiPath Automation Suite – Cas d'usage d'une NGO internationale basée à Genève
UiPath Automation Suite – Cas d'usage d'une NGO internationale basée à GenèveUiPath Automation Suite – Cas d'usage d'une NGO internationale basée à Genève
UiPath Automation Suite – Cas d'usage d'une NGO internationale basée à Genève
UiPathCommunity
 
AsyncAPI v3 : Streamlining Event-Driven API Design
AsyncAPI v3 : Streamlining Event-Driven API DesignAsyncAPI v3 : Streamlining Event-Driven API Design
AsyncAPI v3 : Streamlining Event-Driven API Design
leonid54
 
The No-Code Way to Build a Marketing Team with One AI Agent (Download the n8n...
The No-Code Way to Build a Marketing Team with One AI Agent (Download the n8n...The No-Code Way to Build a Marketing Team with One AI Agent (Download the n8n...
The No-Code Way to Build a Marketing Team with One AI Agent (Download the n8n...
SOFTTECHHUB
 
AI x Accessibility UXPA by Stew Smith and Olivier Vroom
AI x Accessibility UXPA by Stew Smith and Olivier VroomAI x Accessibility UXPA by Stew Smith and Olivier Vroom
AI x Accessibility UXPA by Stew Smith and Olivier Vroom
UXPA Boston
 

Covering (Rules-based) Algorithm

  • 1. Chapter 8 Covering (Rules-based) Algorithm Data Mining Technology
  • 2. Chapter 8 Covering (Rules-based) Algorithm Written by Shakhina Pulatova Presented by Zhao Xinyou [email_address] 2007.11.13 Data Mining Technology Some materials (Examples) are taken from Website.
  • 3. Contents What is the Covering (Rule-based) algorithm? Classification Rules- Straightforward 1. If-Then rule 2. Generating rules from Decision Tree Rule-based Algorithm 1. The 1R Algorithm / Learn One Rule 2. The PRISM Algorithm 3. Other Algorithm Application of Covering algorithm Discussion on e/m-learning application
  • 4. Introduction-App-1 PP87-88 Training Data Attributes Record Rules Rules given by people Rules generated by computer Setting 1.(1.75, 0) short 2. [1.75, 1.95) Medium 3. [1.95, ..) tall
  • 5. Introduction-App-2 PP87-88 How to get all tall people from B based on A A B + Training Data
  • 6. What is Rule-based Algorithm? Definition : Each classification method uses an algorithm to generate rules from the sample data. These rules are then applied to new data. Rule-based algorithm provide mechanisms that generate rules by 1. concentrating on a specific class at a time 2. maximizing the probability of the desired classification. PP87-88 Should be compact, easy-to-interpret, and accurate.
  • 7. Classification Rules- Straightforward If-Then rule Generating rules from Decision Tree PP88-89
  • 8. formal Specification of Rule-based Algorithm The classification r ules, r=<a, c>, consists of : a ( a ntecedent/precondition): a series of tests that be valuated as true or false ; c ( c onsequent/conclusion): the class or classes that apply to instances covered by rule r. PP88 a=0,b=0 a=0,b=1 a=1,b=0 a=1,b=1 a = x y c = a=0 b=0 b=0 yes no X X Y Y no no yes yes
  • 9. Remarks of Straightforward classification The a ntecedent contains a predicate that can be valuated as true or false against each tuple in database. These rules relate directly to corresponding decision tree (DT) that could be created. A DT can always be used to generate rules, but they are not equivalent. Differences: -the tree has a implied order in which the splitting is performed; rules have no order. -a tree is created based on looking at all classes; only one class must be examined at a time. PP88-89
  • 10. If-Then rule Straightforward way to perform classification is to generate if-then rules that cover all cases. 1 PP88
  • 11. Generating rules from Decision Tree -1-Con’ Decision Tree 2
  • 12. Generating rules from Decision Tree -2-Con’ y n a b c d x y y
  • 13. Generating rules from Decision Tree -3-Con’
  • 14. Remarks Rules may be more complex and incomprehensible from DT. A new test or rules need reshaping the whole tree Rules obtained without decision trees are more compact and accurate. So many other covering algorithms have been proposed. PP89-90 a b x y y c d x y y n n n n c d x y y n n c d x y y n n c d x y y n n duplicate subtrees a=0 b=0 b=0 yes no X X Y Y no no yes yes a=1 and c=0 Y
  • 15. Rule-based Classification Generate rules The 1R Algorithm / Learn One Rule The PRISM Algorithm Other Algorithm PP90
  • 16. Generating rules without Decision Trees-1-con’ Goal: find rules that identify the instances of a specific class Generate the “best” rule possible by optimizing the desired classification probability Usually, the “best” attribute-pair is chosen Remark -these technologies are also called covering algorithms because they attempt to generate rules which exactly cover a specific class.
  • 17. Generate Rules-Example-2-Con' Example 3 Question: We want to generate a rule to classify persons as tall. Basic format of the rule: if ? then class = tall Goal: replace “?” with predicates that can be used to obtain the “best” probability of being tall PP90
  • 18. Generate Rules-Algorithms-3-Con' 1.Generate rule R on training data S; 2.Remove the training data covered by rule R; 3. Repeat the process. PP90
  • 19. Generate Rules-Example-4-Con' Sequential Covering (I) Original data (ii) Step 1 r = NULL (iii) Step 2 R1 r = R1 (iii) Step 3 R1 R2 r = R1 U R2 (iii) Step 4 R1 R2 R3 r = R1 U R2 U R3 Wrong Class
  • 20. 1R Algorithm/ Learn One Rule-Con’ Simple and cheap method it only generates a one level decision tree. Classify an object on the basis of a single attribute. Idea: Rules will be constructed to test a single attribute and branch for every value of that attribute. For each branch, the class with the test classification is the one occurring PP91
  • 21. 1R Algorithm/ Learn One Rule-Con’ Idea : 1. Rules will be constructed to test a single attribute and branch for every value of that attribute. Step 2. For each branch, the class with the test classification is the one occurring. 3. Find one biggest number as rules 4. Error rate will be evaluated. 5. The minimum error rate will be chosen. PP91 M->T Error=5 F->M Error=3 Total Error=8 Total Error=3 Total Error=.. A2 An Gender F 2 5 1 S M T M 1 4 10 S M T
  • 22. 1R Algorithm Input: D //Training Data T //Attributes to consider for rules C //Classes Output : R //Rules ALgorithm : R=Φ; for all A in T do R A =Φ; for all possbile value, v, of A do for all C j ∈C do find count(C j ) end for let C m be the class with the largest count; R A =R A ((A=v) ->(class= C m )); end for ERR A =number of tuples incorrectly classified by R A ; e nd for R=R A where ERR A is minimum T={Gender, Height} D C={{F, M}, {0, ∞}} C1 C2 Training Data Gender F M Short Medium Tall 3 6 0 Short Medium Tall 1 2 3 R1=F->medium R2=M->tall Height
  • 23. Example 5 – 1R-3-Con’ Rules based on height … ... … 0/2 0/2 0/3 0/4 1/2 0/2 3/9 3/6 Error 1/15 (0 , 1.6]-> short (1.6, 1.7]->short (1.7, 1.8]-> medium (1.8, 1.9]-> medium (1.9, 2.0]-> medium (2.0, ∞ ]-> tall Height (Step=0.1) 2 6/15 F->medium M->tall Gender 1 Total Error Rules Attribute Option
  • 24. Example 6 -1R PP92-93 5/14 2/8 3/6 False->yes True->no windy 4 4/14 3/7 1/7 High->no Normal->yes humidity 3 2/4 2/6 1/4 2/5 0/4 2/5 Error 5/14 Hot->no Mild->yes Cool->yes temperature 2 4/14 Sunny->no Overcast->yes Rainy->yes outlook 1 Total Error Rules Attribute Rules based on humidity OR High->no Normal->yes Rules based on outlook Sunny->no Overcast->yes Rainy->yes
  • 25. PRISM Algorithm-Con’ PRISM generate rules for each class by looking at the training data and adding rules that completely describe all tuples in that class. Generates only correct or perfect rules: the accuracy of so-constructed PRISM is 100%. Measures the success of a rule by a p/t, where -p is number of positive instance, -T is total number of instance covered by the rule. Gender=Male P=10, T=10 Gender=Female P=1 T=8 R=Gender = Male …… A2 An Gender F 2 5 1 S M T M 0 0 10 S M T
  • 26. PRISM Algorithm Step Input D and C (Attribute -> Value) 1.Compute all class P/T (Attribute->Value) 2. Find one or more pair of (Attribute->Value) P/T = 100% 3. Select (Attribute->Value) as Rule 4. Repeat 1-3 until no data in D Input: D //Training Data C //Classes Output: R //Rules
  • 27. Example 8-Con’-which class may be tall? Compute the value p / t Which one is 100% PP94-95 0/9 Gender = F 1 2/2 2.0< Height 8 ½ 1.9< Height ≤ 2.0 7 0/4 1.8< Height ≤ 1.9 6 0/3 1.7< Height ≤ 1.8 5 0/2 1.6< Height ≤ 1.7 4 0/2 Height ≤ 1.6 3 3/6 Gender = M 2 p / t (Attribute, value) Num R1 = 2.0< Height
  • 28. R2 = 1.95< Height ≤ 2.0 R = R1 U R2 PP94-96 … … … 1/1 1.95< Height ≤ 2.0 0/1 1.9< Height ≤ 1.95 p / t (Attribute, value) Num
  • 29. Example 9-Con’-which days may play? The predicate outlook=overcast correctly implies play=yes on all four rows R1 =if outlook=overcast, then play=yes Compute the value p / t
  • 30. Example 8-Con’ R2= if humidity=normal and windy=false, then play=yes
  • 31. Example 8-Con’ R3 =….. R = R1 U R2 U R3 U…
  • 32. Application of Covering Algorithm To derive classification rules applied for diagnosing illness, business planning, banking, government. Machine learning Text classification. But to photos, it is difficult… And so on.
  • 33. Application on E-learning/M-learning Adaptive and personalized learning materials Virtual Group Classification Initial Learner’s information Classification of learning styles or some Provide adaptive and personalized materials Collect learning styles feedback Chapter 2 or 3 Similarity, Bayesian… Rule-based algorithm
  翻译: