How to select a machine learning model for your particular task

Rao Babar Ali

E-commerce Expansion Strategist | Marketplace Optimization Expert | Data-Driven Growth

Published Aug 1, 2019

Today I want to talk about how to select a machine learning model for your particular task, and why simple models come in handy much more often than complex ones. Machine learning is usually applied to two types of problems: classification and regression (I am not going to dive into segmentation or extraction problems today).

My personal rule of thumb is to start with the simplest models: linear regression for regression problems, and naive Bayes or logistic regression for classification problems. Why? Model complexity does not correlate strongly with better performance. Simple models often outperform even deep learning models, especially with good fine-tuning and boosting techniques. Starting with a simple model also saves a lot of time: simple models usually don't need large datasets to train and validate, don't take long to fine-tune, and are much easier to implement.

Take linear regression as an example. Let's say you have some features and you need to predict a continuous dependent variable (price, income, etc.). The model is very simple: you need to find coefficients, let's call them Q, such that the dot product of your features X with Q gives a straight line that fits the real values of your dependent variable Y as closely as possible. All that training a linear regression does is find these coefficients Q. You can find them with gradient descent or, if you don't have many features, with the normal equation. This is its formula:

Multiply X transposed by X, invert the result, and then multiply it by the product of X transposed and Y (the real values of the dependent variable):

$$Q = (X^T X)^{-1} X^T Y$$
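To make the two ways of finding Q concrete, here is a minimal sketch in Python with NumPy. It is only an illustration under some assumptions of mine: the function names `fit_normal_equation` and `fit_gradient_descent`, the synthetic data, the learning rate and the iteration count are all made up for this example and are not part of the article. The normal equation is solved as a linear system instead of computing the inverse explicitly, which gives the same Q but is usually more stable numerically.

```python
import numpy as np


def fit_normal_equation(X, y):
    """Find Q from the normal equation Q = (X^T X)^-1 X^T Y.

    Solves the linear system (X^T X) Q = X^T y instead of forming
    the inverse explicitly; the resulting Q is the same.
    """
    return np.linalg.solve(X.T @ X, X.T @ y)


def fit_gradient_descent(X, y, lr=0.1, n_iters=5000):
    """Find Q by gradient descent on the mean squared error.

    lr and n_iters are illustrative values chosen for the synthetic
    data below, not general recommendations.
    """
    n_samples, n_features = X.shape
    Q = np.zeros(n_features)
    for _ in range(n_iters):
        residual = X @ Q - y                        # prediction error
        gradient = (2.0 / n_samples) * (X.T @ residual)
        Q -= lr * gradient
    return Q


# Synthetic data (an assumption for illustration): two features plus
# a column of ones so that the first coefficient acts as the intercept.
rng = np.random.default_rng(0)
features = rng.uniform(0.0, 1.0, size=(200, 2))
X = np.column_stack([np.ones(len(features)), features])
true_Q = np.array([1.0, 2.0, -3.0])
y = X @ true_Q + rng.normal(0.0, 0.01, size=200)

Q_normal = fit_normal_equation(X, y)
Q_gd = fit_gradient_descent(X, y)

print("normal equation: ", Q_normal)
print("gradient descent:", Q_gd)
```

On a small, well-conditioned problem like this one, both approaches should land on nearly the same coefficients. The normal equation needs no learning rate, but it has to solve a system whose size grows with the number of features, which is why gradient descent is the usual choice when there are many of them.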

Hamza Nasir 🚀

🔍 Sr. Data Engineer (6+ Years of Experience) ❯ Data Lakes & Warehousing ❯ Big Data & ETL ❯ PySpark ❯ SQL ❯ Python ❯ Kafka ❯ Databricks ❯ AWS & Azure ❯ Writes @ BigDataLad.com

A great read. I do this stuff similarly. However, I personally prefer to go 'a little' extra and check the model complexity. It often proves beneficial.

