Iris Predictive Data Analysis using Linear Regression
The Iris dataset was collected by the American botanist Edgar Shannon Anderson (November 9, 1897 – June 18, 1969). He collected the data to quantify the morphological variation of Iris flowers of three related species. The dataset was made famous by the British statistician and biologist Ronald Fisher in his 1936 paper “The use of multiple measurements in taxonomic problems” as an example of linear discriminant analysis.
I was told that the Iris dataset is the "Hello World" of data science. I became curious about it and wanted to see it for myself.
I was interested to know how the species differ from one another, based on their sepal and petal features.
I was also keen to know whether the dataset is accurate and balanced enough to support a linear regression model.
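As a minimal sketch of what "balanced" means here, the snippet below counts how many samples each species has and checks whether the counts are equal. The helper name `class_balance` is hypothetical, and the labels are rebuilt from the commonly documented 50 samples per species rather than read from the actual data file, so treat it as an illustration and not as the analysis itself.

```python
from collections import Counter


def class_balance(labels):
    """Return per-class counts and whether every class has the same count."""
    counts = Counter(labels)
    # Balanced here means all classes share exactly one distinct count.
    is_balanced = len(set(counts.values())) == 1
    return counts, is_balanced


# Assumption: labels reconstructed from the well-known 50-per-species layout
# of the Iris dataset, not loaded from a file.
species = ["setosa"] * 50 + ["versicolor"] * 50 + ["virginica"] * 50

counts, is_balanced = class_balance(species)
print(dict(counts), is_balanced)
```

In a real notebook the `species` list would come from the species column of the loaded dataset instead of being built by hand.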
I would like to take this time to thank my teachers at Forward College (Daren Tan, Daniel Tan, Aaron Raj and Leonard Kok) for their patience, heart for teaching, and for sharing their knowledge and expertise in the field of Data Science and Python programming.
Studying data analytics improved my analytical and problem-solving skills. I am now able to carry out data collection, data analysis and statistical modelling systematically.
I learned that data can tell a story. Being a data analyst is similar to being a detective: tracing details for clues that lead to a solution is always rewarding.
I hope you will take a look at my case study, and I would appreciate your feedback to help me improve my skills in data analytics.