About Decision Tree Algorithms...
What is a Decision Tree?
A decision tree is a supervised machine learning algorithm that predicts a target by asking a sequence of if/else questions about the features: the data is split at each node based on an attribute, and each leaf of the tree carries the final prediction. It can be used for both classification and regression.
Why Decision trees?
There are plenty of other algorithms out there, so why do we choose decision trees?
Well, there might be many reasons, but a few I believe in are their interpretability, their ability to capture non-linear relationships, and the fact that they need very little data preparation.
For example, if we are classifying a bank loan application for a customer, the decision tree may look something like the sketch below: a chain of questions about the applicant (say, credit score, then income, then employment history) that ends in an approve or reject decision.
Here we can see the logic of how it makes the decision.
It's simple and clear.
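To make that concrete, here is a minimal sketch in Python of the kind of if/else rule chain such a tree encodes. The attribute names and thresholds below are made up for illustration; a real tree learns them from data.

def approve_loan(applicant):
    # Hypothetical rules and thresholds, chosen only to illustrate the idea.
    if applicant["credit_score"] < 600:
        return "Reject"                      # root split: poor credit -> reject
    if applicant["income"] >= 50000:
        return "Approve"                     # good credit + high income -> approve
    if applicant["years_employed"] >= 3:
        return "Approve"                     # lower income but stable job -> approve
    return "Reject"

print(approve_loan({"credit_score": 720, "income": 40000, "years_employed": 5}))  # Approve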
Types of Decision Trees:
Regression tree:
A regression tree is used when the target variable is continuous (for example, a price or an amount). Each node stops splitting when it reaches a limit, meaning any further split would leave fewer than n observations in a node; the leaf then predicts the average of the observations it contains.
Classification tree:
A classification tree is used when the target variable is categorical (for example, approve or reject); each leaf predicts the majority class of the observations that reach it.
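As a quick sketch (using scikit-learn, with a toy dataset invented for this example), the two types correspond to two different estimators, and min_samples_leaf enforces the "fewer than n observations" stopping limit mentioned above:

from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Toy data: [credit_score, income in thousands] -- made up for illustration.
X = [[720, 60], [580, 35], [690, 48], [610, 52], [750, 80], [560, 30]]
y_class = ["approve", "reject", "approve", "approve", "approve", "reject"]  # categorical target
y_reg = [25.0, 5.0, 18.0, 15.0, 40.0, 3.0]                                  # continuous target (loan amount)

# Classification tree: each leaf predicts a class label.
clf = DecisionTreeClassifier(max_depth=3).fit(X, y_class)

# Regression tree: each leaf predicts a number (the mean of its observations);
# min_samples_leaf=2 stops splitting once a split would leave fewer than 2 rows in a leaf.
reg = DecisionTreeRegressor(min_samples_leaf=2).fit(X, y_reg)

print(clf.predict([[700, 55]]), reg.predict([[700, 55]]))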
Decision Tree Terminologies:
Root Node: the topmost node, representing the entire dataset before any split.
Splitting: dividing a node into two or more sub-nodes based on an attribute.
Branch / Sub-tree: a section of the tree formed by a split.
Parent and Child Nodes: a node that is split is the parent of the sub-nodes (children) it produces.
Leaf / Terminal Node: a node that is not split any further and carries the final prediction.
Pruning: removing branches to reduce the size of the tree and control overfitting.
How Does a Decision Tree Work?
Decision Tree Working Procedure:
To predict the class of a given record, the algorithm starts at the root node and compares the value of the root attribute with the corresponding attribute of the record. Based on that comparison, it follows the matching branch and jumps to the next node. For the next node, the algorithm again compares the attribute value and moves further, continuing until it reaches a leaf node of the tree. The complete process can be better understood using the steps below:
Step-1: Begin the tree with the root node, say S, which contains the complete dataset.
Step-2: Find the best attribute in the dataset using an Attribute Selection Measure (ASM).
Step-3: Divide S into subsets that contain the possible values of the best attribute.
Step-4: Generate the decision tree node, which contains the best attribute.
Step-5: Recursively make new decision tree nodes using the subsets of the dataset created in Step-3. Continue this process until a stage is reached where the nodes cannot be classified further; these final nodes are called leaf nodes. A small sketch of this procedure follows these steps.
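These steps translate almost directly into a recursive procedure. The following is a small sketch in plain Python (an ID3-style builder for categorical attributes, not a production implementation); it uses the entropy and information-gain measures explained in the next section:

from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    # Entropy before the split minus the weighted entropy of the subsets.
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attr], []).append(label)
    weighted = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - weighted

def build_tree(rows, labels, attributes):
    # Step-5 stopping rule: pure node, or no attributes left -> leaf with majority class.
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    attr = max(attributes, key=lambda a: information_gain(rows, labels, a))  # Step-2
    tree = {attr: {}}                                                        # Step-4
    remaining = [a for a in attributes if a != attr]
    for value in set(row[attr] for row in rows):                             # Step-3
        subset = [(r, l) for r, l in zip(rows, labels) if r[attr] == value]
        sub_rows, sub_labels = zip(*subset)
        tree[attr][value] = build_tree(list(sub_rows), list(sub_labels), remaining)
    return tree

# Tiny made-up dataset: should we approve a loan?
rows = [{"income": "high", "history": "good"},
        {"income": "high", "history": "bad"},
        {"income": "low", "history": "good"},
        {"income": "low", "history": "bad"}]
labels = ["yes", "yes", "yes", "no"]
print(build_tree(rows, labels, ["income", "history"]))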
Attribute Selection Measure (ASM) / choosing the attribute to split the data:
Entropy:
Entropy measures the impurity (randomness) of a set S. For a two-class problem:

Entropy(S) = −P(+)·log2(P(+)) − P(−)·log2(P(−))

Here, P(+) and P(−) are the proportions (%) of the positive and negative class in S.
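A quick sanity check in Python (a toy snippet, not part of any library) shows the two extremes: a perfectly mixed node has entropy 1 bit, and a pure node has entropy 0.

from math import log2

def entropy(p_pos):
    # Two-class entropy; skip terms where the proportion is 0 (0 * log 0 is taken as 0).
    return -sum(p * log2(p) for p in (p_pos, 1 - p_pos) if p > 0)

print(entropy(0.5))  # 1.0 -> 50/50 split, maximum impurity
print(entropy(1.0))  # 0.0 -> pure node, no impurity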
NOTE: A limitation of entropy is that the log computation makes it relatively expensive, so the computation time can be high; in practice we therefore often use the Gini index (Gini impurity) instead.
Gini Index / Gini Impurity:
The Gini index measures the impurity of a set S as:

Gini(S) = 1 − Σᵢ pᵢ²

Where:
pᵢ is the proportion of instances in S that belong to class i.
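A matching sketch for the Gini index (again a toy snippet): note that it needs only squares, no logarithms, which is why it is cheaper to compute than entropy.

def gini(proportions):
    # proportions: class proportions p_i in the node, summing to 1.
    return 1 - sum(p ** 2 for p in proportions)

print(gini([0.5, 0.5]))  # 0.5 -> maximum impurity for two classes
print(gini([1.0, 0.0]))  # 0.0 -> pure node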
Information Gain:
Information gain measures how much splitting the set S on an attribute A reduces the entropy:

Information Gain(S, A) = Entropy(S) − Σᵥ ( |Sᵥ| / |S| ) · Entropy(Sᵥ)

Where:
Information Gain(S, A) is the information gain of attribute A on set S.
Entropy(S) is the entropy of the set S before the split.
|Sᵥ| is the number of instances in subset Sᵥ (the instances of S for which A takes the value v) after the split.
|S| is the total number of instances in set S.
Entropy(Sᵥ) is the entropy of the subset Sᵥ after the split.
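As a worked example (the split and labels below are made up), suppose a parent node with two "yes" and two "no" labels is split by a hypothetical attribute A into one pure subset and one mixed subset; the gain works out to 0.5 bits:

from collections import Counter
from math import log2

def entropy_of(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

parent = ["yes", "yes", "no", "no"]        # Entropy(S) = 1.0
s1, s2 = ["yes", "yes"], ["yes", "no"]     # subsets produced by splitting on attribute A
gain = entropy_of(parent) - (len(s1) / len(parent)) * entropy_of(s1) \
                          - (len(s2) / len(parent)) * entropy_of(s2)
print(gain)  # 0.5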
Note: Feature scaling and outlier handling are not required for the decision tree algorithm, because splits depend only on the order of feature values, not on their scale.
Advantages of Decision Trees:
a. Interpretability: Decision trees provide a transparent and intuitive representation of the decision-making process, enabling easy interpretation and understanding.
b. Handling Nonlinear Relationships: Decision trees can capture complex nonlinear relationships between features, allowing them to handle datasets with intricate structures.
c. Handling Mixed Data Types: Decision trees can handle both categorical and numerical features, making them versatile for various types of datasets.
d. Feature Importance: Decision trees can assess the importance of features based on their contribution to the decision-making process, allowing for feature selection and interpretation.
Limitations of Decision Trees:
a. Overfitting: Decision trees have a tendency to overfit the training data, capturing noise and irrelevant patterns. Techniques like pruning and setting stopping criteria can mitigate this issue (see the sketch after this list).
b. Lack of Robustness: Small changes in the training data can lead to significantly different tree structures, making decision trees less robust compared to other algorithms.
c. Difficulty with Continuous Data: Decision trees perform better with categorical or binary features compared to continuous variables. Preprocessing techniques, such as discretization or binning, can be used to handle continuous data.
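To illustrate the overfitting point above, here is a small scikit-learn sketch (dataset and parameter values chosen only for demonstration) that compares an unconstrained tree with one that uses stopping criteria and cost-complexity pruning; the constrained tree usually generalises better to unseen data.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Unconstrained tree: grows until every leaf is pure, so it can memorise noise.
full = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Constrained tree: stopping criteria (max_depth, min_samples_leaf)
# plus cost-complexity pruning (ccp_alpha) limit how far it grows.
pruned = DecisionTreeClassifier(max_depth=4, min_samples_leaf=5, ccp_alpha=0.01,
                                random_state=0).fit(X_train, y_train)

print("full   train/test accuracy:", full.score(X_train, y_train), full.score(X_test, y_test))
print("pruned train/test accuracy:", pruned.score(X_train, y_train), pruned.score(X_test, y_test))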
Practical Applications:
a. Medical Diagnosis: Decision trees can help doctors make accurate diagnoses based on patient symptoms, medical history, and test results.
b. Credit Scoring: Banks and financial institutions use decision trees to assess creditworthiness by analyzing factors such as income, credit history, and employment status.
c. Customer Segmentation: Decision trees can segment customers based on their demographic, behavioural, or purchase data, aiding targeted marketing campaigns.
d. Fraud Detection: Decision trees are valuable in identifying patterns of fraudulent activities by analyzing transactional data and user behaviour.
If you learned something from this blog, make sure you give it a 👏🏼
Will meet you in some other blog, till then Peace ✌🏼.
Thank You!