A Decision Tree Based Classifier for Classification & Prediction of Diseases
Bhupendra Patidar, Prof. Gajendra Singh
SSSIST, Sehore (MP)
IJSRD - International Journal for Scientific Research & Development | Vol. 1, Issue 5, 2013 | ISSN (online): 2321-0613
Abstract-- In this paper, we propose a modified classification algorithm based on the concept of decision trees. The proposed algorithm improves on earlier decision tree algorithms and provides more accurate results, and we have tested it on a patient data set. The method uses a greedy approach to select the best attribute at each node: the information gain of every candidate attribute is computed and the attribute with the highest information gain is selected. If the information gain is not good enough, the attribute values are divided into groups again, and these steps are repeated until a good classification/misclassification ratio is obtained. The proposed algorithm classifies data sets more accurately and efficiently.
I. INTRODUCTION
Decision trees are well-known machine learning techniques. A decision tree is composed of three basic elements:
1. A decision node, which specifies a test attribute.
2. An edge or branch, which corresponds to one of the possible values of the test attribute, i.e., one of the test outcomes.
3. A leaf (also called an answer node), which contains the class to which the object belongs.
In decision trees, two major phases should be ensured:
1. Building the tree: based on a given training set, a decision tree is built. This consists of selecting, for each decision node, the appropriate test attribute and of defining the class that labels each leaf.
2. Classification: to classify a new instance, we start at the root of the decision tree and test the attribute specified by that node. The result of this test determines which branch to move down, according to the attribute value of the given instance. This process is repeated until a leaf is encountered, and the instance is then assigned the class of the reached leaf (see the sketch below).
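To make this traversal concrete, the following minimal Python sketch walks a tree from the root to a leaf. The nested-dictionary layout ('attribute', 'branches', 'label') and the toy patient attributes are illustrative assumptions, not the paper's own data structure.

# Minimal sketch: classifying one instance by walking a decision tree.
# The nested-dict tree layout is a hypothetical representation chosen only
# for illustration.

def classify(node, instance):
    """Follow branches matching the instance's attribute values until a leaf."""
    while 'label' not in node:                      # internal node: test an attribute
        value = instance[node['attribute']]         # value of the test attribute
        node = node['branches'][value]              # move down the matching branch
    return node['label']                            # leaf: return its class

# Example: a toy tree that tests 'fever', then 'cough' (hypothetical patient data).
tree = {'attribute': 'fever',
        'branches': {'yes': {'attribute': 'cough',
                             'branches': {'yes': {'label': 'flu'},
                                          'no':  {'label': 'cold'}}},
                     'no':  {'label': 'healthy'}}}

print(classify(tree, {'fever': 'yes', 'cough': 'no'}))   # -> 'cold'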
Decision trees have also been used for intrusion detection [3]. During construction of the tree, the best feature for each decision node is selected according to some well-defined criterion; one such criterion is the information gain ratio [2].
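As an illustration of this criterion, the short Python sketch below computes the information gain ratio of a candidate attribute over a table of rows; the row and column names are hypothetical.

# Self-contained sketch of the information gain ratio criterion.
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gain_ratio(rows, attr, target):
    """Information gain of splitting `rows` on `attr`, divided by the split info."""
    n = len(rows)
    base = entropy([r[target] for r in rows])
    groups = Counter(r[attr] for r in rows)
    remainder = sum((cnt / n) * entropy([r[target] for r in rows if r[attr] == v])
                    for v, cnt in groups.items())
    split_info = entropy([r[attr] for r in rows])   # penalizes many-valued attributes
    gain = base - remainder
    return gain / split_info if split_info > 0 else 0.0

# Toy patient rows (hypothetical):
rows = [{'fever': 'yes', 'disease': 'flu'},
        {'fever': 'yes', 'disease': 'flu'},
        {'fever': 'no',  'disease': 'healthy'}]
print(gain_ratio(rows, 'fever', 'disease'))         # -> 1.0 on this toy table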
II. RELATED WORK
The Naïve Bayes classifier is also a very good and accurate method for data classification. The Naive Bayes classifier [17] is a probabilistic classifier based on Bayes' theorem under a strong (naive) independence assumption: it assumes that all attributes (features) contribute independently to the probability of a certain decision. Because of the characteristics of the underlying probability model, the Naive Bayes classifier can be trained very efficiently in a supervised learning setting, and it can yield very good results in many complex real-world situations, especially in the field of computer-aided diagnosis [16][17]. Since all variables are assumed to be independent, only the variances of the variables for each class need to be determined, not the entire covariance matrix.
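A minimal sketch of such a classifier, assuming Gaussian per-feature likelihoods and storing only per-class means and variances as described above (the feature values and class labels below are toy examples):

# Hedged sketch: a Gaussian Naive Bayes classifier keeping only per-class
# means and variances (no covariance matrix).
import numpy as np

class GaussianNB:
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.priors_, self.means_, self.vars_ = {}, {}, {}
        for c in self.classes_:
            Xc = X[y == c]
            self.priors_[c] = len(Xc) / len(X)
            self.means_[c] = Xc.mean(axis=0)
            self.vars_[c] = Xc.var(axis=0) + 1e-9    # small term avoids division by zero
        return self

    def predict(self, X):
        def log_post(x, c):
            m, v = self.means_[c], self.vars_[c]
            # sum of independent per-feature log likelihoods plus the log prior
            ll = -0.5 * np.sum(np.log(2 * np.pi * v) + (x - m) ** 2 / v)
            return ll + np.log(self.priors_[c])
        return np.array([max(self.classes_, key=lambda c: log_post(x, c)) for x in X])

X = np.array([[98.6, 70], [101.2, 95], [102.0, 100], [98.4, 72]])  # temp, pulse (toy)
y = np.array(['healthy', 'sick', 'sick', 'healthy'])
print(GaussianNB().fit(X, y).predict(np.array([[100.9, 92]])))     # -> ['sick']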
The RnD tree is a more recent method for data classification, and the accurate results it provides are attracting many researchers. The RnD tree [18] algorithm can be applied to both classification and regression problems. Random trees are a collection (ensemble) of tree predictors called a forest [18]. Classification works as follows: the random trees classifier takes the input feature vector, classifies it with every tree in the forest, and outputs the class label that receives the majority of "votes". In the case of regression, the classifier response is the average of the responses over all the trees in the forest.
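A sketch of this voting and averaging scheme, assuming only that each fitted tree exposes a predict() method (the stub trees below are placeholders, not a real forest implementation):

# Illustrative sketch of forest aggregation: majority vote for classification,
# mean response for regression.
from collections import Counter

def forest_classify(trees, x):
    votes = Counter(tree.predict(x) for tree in trees)
    return votes.most_common(1)[0][0]                            # label with most votes

def forest_regress(trees, x):
    return sum(tree.predict(x) for tree in trees) / len(trees)   # average response

# Toy stand-ins for fitted trees (hypothetical):
class Stub:
    def __init__(self, out): self.out = out
    def predict(self, x): return self.out

print(forest_classify([Stub('flu'), Stub('flu'), Stub('cold')], None))   # -> 'flu'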
A recursive Bayesian classifier is introduced in [7]. Much work has already been done on improving decision tree induction toward 100% accuracy, and many of these efforts have achieved that goal, but the main problem with the improved methods is that they require a lot of time and produce complex extracted rules. The main idea of the recursive Bayesian classifier is to split the data recursively into partitions where the conditional independence assumption holds. A decision tree is a mapping from observations about an item to conclusions about its target value [9, 10, 11, 12, 13]. Decision trees are commonly used in operations research, specifically in decision analysis, to help identify the strategy most likely to reach a goal. Another use of decision trees is as a descriptive means for calculating conditional probabilities. A decision tree (or tree diagram) is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility [14]. Decision tree induction has been used successfully in expert systems for capturing knowledge, and it is well suited to data sets with multiple attributes.
III. PROPOSED SOLUTION
Classification is a form of data analysis that extracts models describing important data classes [18]. These models, called classifiers, are used to predict categorical (discrete, unordered) class labels, and this analysis can help us better understand large data sets. Classification has numerous applications, including fraud detection, target marketing,
performance prediction, manufacturing, credit risk, and medical diagnosis. Data classification is a two-step process consisting of a learning step and a classification step.
A. Learning Step:
In this step the classification model is constructed: a classifier is built that describes a predetermined set of data classes or concepts. In the learning step (or training phase), the classification algorithm builds the classifier by analyzing, or "learning from", a training set made up of database tuples and their associated class labels.
This step is also known as supervised learning because the class label of each training tuple is provided; the learning of the classifier is "supervised" by telling it to which class each training tuple belongs. In unsupervised learning (clustering), by contrast, the class label of each training tuple is not known, and the number or set of classes to be learned may not be known in advance.
B. Classification Step:
In this step, the model is used to predict class labels for given data. First, the predictive accuracy of the classifier is estimated. If we measured accuracy on the training set, the estimate would be optimistic, because the classifier tends to overfit the data; that is, during learning it may incorporate particular anomalies of the training data that are not present in the general data set. Therefore, a test set is used, made up of test tuples and their associated class labels, that is independent of the training tuples from which the classifier was constructed. The accuracy of a classifier on a given test set is the percentage of test tuples that are correctly classified by the classifier: the associated class label of each test tuple is compared with the learned classifier's class prediction for that tuple. If the accuracy of the classifier is considered acceptable, the model can be used to classify future data tuples or objects for which the class label is not known.
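A minimal sketch of this two-step process, using scikit-learn and a synthetic data set purely for illustration (the library and the generated data are assumptions, not the paper's own setup):

# Learning step: fit on a training set. Classification step: estimate accuracy
# on an independent test set.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=300, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = DecisionTreeClassifier().fit(X_train, y_train)     # learning step
print(accuracy_score(y_test, clf.predict(X_test)))       # test-set accuracy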
C. Decision Tree Induction:
A decision tree is a flow-chart-like tree structure in which each internal node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node represents a class or class distribution. The topmost node in a tree is the root node.
Given a tuple K whose associated class label is unknown, the attribute values of the tuple are tested against the decision tree. A path is traced from the root to a leaf node, which holds the class prediction for that tuple. Decision trees are easily converted to classification rules. Constructing a decision tree requires no domain knowledge or parameter setting, it can handle high-dimensional data, the learning and classification steps are simple and fast, and it has good accuracy. Decision tree induction algorithms can therefore be used in many applications, such as medicine, manufacturing, and production.
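A small sketch of the rule-extraction idea mentioned above, reusing the hypothetical nested-dictionary tree layout from the introduction: each root-to-leaf path becomes one IF-THEN rule.

# Convert every root-to-leaf path of a tree into a classification rule.
def tree_to_rules(node, conditions=()):
    if 'label' in node:                                    # leaf: emit one rule
        cond = ' AND '.join(f"{a} = {v}" for a, v in conditions) or 'TRUE'
        return [f"IF {cond} THEN class = {node['label']}"]
    rules = []
    for value, child in node['branches'].items():          # one branch per attribute value
        rules += tree_to_rules(child, conditions + ((node['attribute'], value),))
    return rules

tree = {'attribute': 'fever',
        'branches': {'yes': {'label': 'flu'}, 'no': {'label': 'healthy'}}}
for rule in tree_to_rules(tree):
    print(rule)          # e.g. IF fever = yes THEN class = flu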
IV. PROPOSED METHOD
DTC (in T : table; C : classification attribute) return decision tree
{
   if (T is empty) then return null;                                    /* Base case 0 */
   N := a new node;
   if (there are no predictive attributes in T)                         /* Base case 1 */
      then label N with the most common value of C in T (deterministic tree)
           or with the frequencies of C in T (probabilistic tree)
   else if (all instances in T have the same value V of C)              /* Base case 2 */
      then label N "X.C = V with probability 1"
   else {
      for each attribute A in T compute AVG ENTROPY(A, C, T);
      AS := the attribute for which AVG ENTROPY(AS, C, T) is minimal;
      if (AVG ENTROPY(AS, C, T) is not substantially smaller
          than ENTROPY(C, T))                                           /* Base case 3 */
         then label N with the most common value of C in T (deterministic tree)
              or with the frequencies of C in T (probabilistic tree)
      else {
         label N with AS;
         for each value V of AS do {
            N1 := DTC(SUBTABLE(T, AS, V), C);                           /* Recursive call */
            if (N1 != null) then make an arc from N to N1 labeled V;
         }
      }
   }
   return N;
}

SUBTABLE (in T : table; A : predictive attribute; V : value) return table
{
   T1 := the set of instances X in T such that X.A = V;
   T1 := delete column A from T1;
   return T1;
}

/* Note: in the textbook this is called I(p(v1), ..., p(vk)) */
ENTROPY (in C : classification attribute; T : table) return real number
{
   for each value V of C, let p(V) := FREQUENCY(C, V, T);
   return - SUM over all values V of C of p(V) * log2(p(V));
   /* By convention, 0 * log2(0) is taken to be 0. */
}

/* Note: in the textbook this is called "Remainder(A)" */
AVG ENTROPY (in A : predictive attribute; C : classification attribute; T : table) return real number
{
   return SUM over all values V of A of FREQUENCY(A, V, T) * ENTROPY(C, SUBTABLE(T, A, V));
}

FREQUENCY (in B : attribute; V : value; T : table) return real number
{
   return #{ X in T | X.B = V } / size(T);
}
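For reference, the following is a compact Python transcription of the pseudocode above, restricted to the deterministic-tree variant; the fixed EPS threshold standing in for the informal "not substantially smaller" test is an assumption, and the nested-dictionary output matches the traversal sketch given in the introduction.

# Sketch of DTC and its helpers; rows are dicts mapping attribute names to values.
from collections import Counter
from math import log2

EPS = 1e-6   # assumed stand-in for "not substantially smaller"

def frequency(rows, attr, value):
    return sum(1 for r in rows if r[attr] == value) / len(rows)

def subtable(rows, A, value):
    # keep rows with A == value, then drop column A
    return [{k: x for k, x in r.items() if k != A} for r in rows if r[A] == value]

def entropy(rows, C):
    ps = [frequency(rows, C, v) for v in {r[C] for r in rows}]
    return -sum(p * log2(p) for p in ps if p > 0)          # 0 * log2(0) treated as 0

def avg_entropy(rows, A, C):
    return sum(frequency(rows, A, v) * entropy(subtable(rows, A, v), C)
               for v in {r[A] for r in rows})

def dtc(rows, C):
    if not rows:                                            # Base case 0
        return None
    attrs = [a for a in rows[0] if a != C]
    labels = [r[C] for r in rows]
    if not attrs or len(set(labels)) == 1:                  # Base cases 1 and 2
        return {'label': Counter(labels).most_common(1)[0][0]}
    best = min(attrs, key=lambda a: avg_entropy(rows, a, C))
    if avg_entropy(rows, best, C) > entropy(rows, C) - EPS: # Base case 3
        return {'label': Counter(labels).most_common(1)[0][0]}
    node = {'attribute': best, 'branches': {}}
    for v in {r[best] for r in rows}:
        child = dtc(subtable(rows, best, v), C)             # recursive call
        if child is not None:
            node['branches'][v] = child
    return node

# Toy usage on hypothetical patient rows:
rows = [{'fever': 'yes', 'cough': 'yes', 'disease': 'flu'},
        {'fever': 'yes', 'cough': 'no',  'disease': 'cold'},
        {'fever': 'no',  'cough': 'no',  'disease': 'healthy'}]
print(dtc(rows, 'disease'))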
V. CONCLUSION
In this paper, we have proposed a modified classification algorithm based on the concept of decision trees. The proposed algorithm improves on earlier algorithms and provides more accurate results, and we have tested it on a patient data set.
REFERENCES
[1] Singh Vijendra, "Efficient Clustering for High Dimensional Data: Subspace Based Clustering and Density Based Clustering", Information Technology Journal; 2011, 10(6), pp. 1092-1105.
[2] Breiman, L., Friedman, J. H., Olshen, R. A., and Stone, C. J., "Classification and Regression Trees", Wadsworth International Group, Belmont, CA: The Wadsworth Statistics/Probability Series, 1984.
[3] Quinlan, J. R., "Induction of Decision Trees", Machine Learning; 1986, pp. 81-106.
[4] Quinlan, J. R., "Simplifying Decision Trees", International Journal of Man-Machine Studies; 1987, 27: pp. 221-234.
[5] Gama, J. and Brazdil, P., "Linear Tree", Intelligent Data Analysis; 1999, 3(1): pp. 1-22.
[6] Langley, P., "Induction of Recursive Bayesian Classifiers", in Brazdil, P. B. (ed.), Machine Learning: ECML-93; 1993, pp. 153-164, Springer, Berlin/Heidelberg/New York/Tokyo.
[7] Witten, I. and Frank, E., "Data Mining: Practical Machine Learning Tools and Techniques", 2nd Edition, Morgan Kaufmann, San Francisco, 2005, ch. 3-4, pp. 45-100.
[8] Yang, Y. and Webb, G., "On Why Discretization Works for Naive-Bayes Classifiers", Lecture Notes in Computer Science, vol. 2003, pp. 440-452.
[9] H. Zantema and H. L. Bodlaender, "Finding Small Equivalent Decision Trees is Hard", International Journal of Foundations of Computer Science; 2000, 11(2): pp. 343-354.
[10] Huang Ming, Niu Wenying and Liang Xu, "An Improved Decision Tree Classification Algorithm Based on ID3 and the Application in Score Analysis", Software Technol. Inst., Dalian Jiao Tong Univ., Dalian, China, June 2009.
[11] Chai Rui-min and Wang Miao, "A More Efficient Classification Scheme for ID3", Sch. of Electron. & Inf. Eng., Liaoning Tech. Univ., Huludao, China; 2010, Version 1, pp. 329-345.
[12] Liu Yuxun and Xie Niuniu, "Improved ID3 Algorithm", Coll. of Inf. Sci. & Eng., Henan Univ. of Technol., Zhengzhou, China; 2010, pp. 465-573.
[13] Chen Jin, Luo De-lin and Mu Fen-xiang, "An Improved ID3 Decision Tree Algorithm", Sch. of Inf. Sci. & Technol., Xiamen Univ., Xiamen, China; 2009, pp. 127-134.
[14] Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", 2nd edition, Morgan Kaufmann, 2006, ch. 3, pp. 102-130.