Linear Regression in Python!
Back in August 2023, I started graduate school, and a few weeks into the program I learned the power of linear regression. Since then, I have used it in most of my assignments and tasks. As a data enthusiast, I have come to see linear regression as my best friend for getting a critical understanding of the data.
Now, I want to share my adventures through short articles on LinkedIn. The goal? To help anyone in my LinkedIn community who is curious to learn and needs a resource to pick up some knowledge from these articles. And to kick things off, let's chat about ice cream sales using Python. Don't worry if you're not a coding expert yet; we'll take it slow.
Understanding Basics:
In simple terms, linear regression explores the relationship between a dependent variable (Y, what we would like to predict) and one or more independent variables (X, the features that might help predict it). One of its biggest limitations is that it can only capture linear relationships. For example, let's take Ice Cream Sales as our "dependent variable" and Temperature as our "independent variable".
Below is the Python code to create the data:
import pandas as pd

# Create the data
Data = {'Temperature': [23, 30, 45, 50], 'Sales': [200, 300, 350, 400]}
# Build a DataFrame from the data
IceCream_df = pd.DataFrame(Data)
Linear Regression in Python:
Before we start the actual regression, it is a good habit to visualize your data so you can spot patterns that a table of numbers can hide.
import matplotlib.pyplot as plt

# Scatter plot of sales against temperature
plt.scatter(IceCream_df['Temperature'], IceCream_df['Sales'])
plt.xlabel('Temperature')
plt.ylabel('Sales')
plt.title('Ice Cream Sales vs. Temperature')
plt.show()
This is a very basic way to create a scatter plot in Python using the matplotlib library. A scatter plot shows the relationship between sales and temperature at a glance.
The scatter plot shows a roughly linear relationship between Sales and Temperature. Now let's get into doing linear regression in Python:
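from sklearn.linear_model import LinearRegression

# Features must be 2-D (a DataFrame); the target can be a 1-D Series
X = IceCream_df[['Temperature']]
y = IceCream_df['Sales']

# Create the model and fit it to the data
model = LinearRegression()
model.fit(X, y)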
The code above creates our linear regression model, and the fit method finds the best-fitting line for our data.
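To make "best-fitting line" a little more concrete, here is a minimal sketch of the ordinary least-squares calculation for a single feature, written in plain Python. It uses the same four data points as above. The names `fit_line`, `temperature`, and `sales` are only illustrative and are not part of scikit-learn.

```python
# Same four observations as the article's DataFrame.
temperature = [23, 30, 45, 50]
sales = [200, 300, 350, 400]


def fit_line(x, y):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(temperature, sales)
print(f"slope: {slope:.2f}, intercept: {intercept:.2f}")
# prints: slope: 6.49, intercept: 72.54
```

In other words, each extra degree of temperature adds roughly 6.5 units of sales on this toy data. After fitting with scikit-learn, the same two numbers should be available as model.coef_ and model.intercept_.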
Let's Predict Sales with New Data!
Here comes the most exciting part: now that we have our model, we will create new temperature data and predict sales for it!
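# New temperatures we want sales predictions for
newdata = {'Temperature': [32, 38, 42]}
new_df = pd.DataFrame(newdata)
# Use the fitted model to predict sales at those temperatures
predictions = model.predict(new_df)
print("Predicted Sales:", predictions)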
After running the code in my Jupyter Notebook, I got one predicted sales figure for each of the three new temperatures.
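For readers who want to check the numbers by hand, here is a minimal sketch in plain Python. It assumes the same training data and the same three new temperatures as above. A prediction is simply the height of the fitted line at a given temperature, so the variable names here (`new_temperature`, `predicted_sales`) are just for illustration.

```python
# Training data from the article and the three new temperatures.
temperature = [23, 30, 45, 50]
sales = [200, 300, 350, 400]
new_temperature = [32, 38, 42]

# Least-squares slope and intercept for a single feature.
mean_x = sum(temperature) / len(temperature)
mean_y = sum(sales) / len(sales)
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(temperature, sales))
    / sum((x - mean_x) ** 2 for x in temperature)
)
intercept = mean_y - slope * mean_x

# Each prediction is intercept + slope * temperature.
predicted_sales = [intercept + slope * t for t in new_temperature]
for t, s in zip(new_temperature, predicted_sales):
    print(f"Temperature {t}: predicted sales of about {s:.1f}")
```

On this data the sketch gives roughly 280.1, 319.0, and 344.9. The output of model.predict should agree with these values up to rounding and print formatting.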
Now, if I were working for an ice cream shop as a Data Scientist, I would use these predictions to make business decisions, such as running promotions or planning strategies to increase our sales.
While we won't delve into detailed evaluation here, it's crucial in a real-world scenario. Metrics like Mean Squared Error or R-squared help gauge the model's performance. I will put out another article on those metrics.
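As a small preview, here is a rough sketch of how those two metrics can be computed by hand in plain Python. It makes two assumptions: the same four data points as above, and evaluation on the training data itself, which tends to look more optimistic than evaluation on unseen data. The variable names are illustrative only.

```python
# Same four observations as the article's DataFrame.
temperature = [23, 30, 45, 50]
sales = [200, 300, 350, 400]

# Fit the least-squares line (single feature).
n = len(temperature)
mean_x = sum(temperature) / n
mean_y = sum(sales) / n
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(temperature, sales))
    / sum((x - mean_x) ** 2 for x in temperature)
)
intercept = mean_y - slope * mean_x
fitted = [intercept + slope * x for x in temperature]

# Mean Squared Error: average squared gap between actual and fitted sales.
mse = sum((y - f) ** 2 for y, f in zip(sales, fitted)) / n

# R-squared: share of the variation in sales explained by the line.
ss_residual = sum((y - f) ** 2 for y, f in zip(sales, fitted))
ss_total = sum((y - mean_y) ** 2 for y in sales)
r_squared = 1 - ss_residual / ss_total

print(f"MSE: {mse:.1f}, R-squared: {r_squared:.3f}")
```

scikit-learn offers the same calculations as mean_squared_error and r2_score in sklearn.metrics, which is likely the more convenient route in a real project.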
Conclusion
Linear regression is a powerful tool for predictive modeling, and this simple ice cream sales example showcases its application in a beginner-friendly manner. The journey from data visualization to making predictions highlights the step-by-step process. Remember, linear regression is just the tip of the iceberg in the vast realm of data science. It opens doors to more complex models and exciting possibilities.
If you would like to follow more coding projects, you can take a peek at my projects on GitHub.