Introduction to Data Science – A Summary

Source

This summary is derived from the following article: https://www.heavy.ai/learn/data-science.

Definition

In this era, there are tremendous amounts of both structured and unstructured data that need to be processed. Once managed, these data can bring many benefits for future decision making. Furthermore, almost everything we encounter today relies on large amounts of data, from simple things such as talking to a personal assistant to self-driving cars. All of these are made possible by data science.

Data science is a process in which large amounts of raw data are mined, the patterns in them are analyzed, and meaningful observations are derived. It is an interdisciplinary field that draws on computer science, statistics, predictive analytics, inference, machine learning algorithms, and big data.

In practice, data science follows a life cycle that helps to organize project management. The first stage is capturing the data, which consists of obtaining the data, extracting it, and entering it into the system. The next stage is maintenance, which includes data warehousing, data cleansing, data processing, data staging, and data architecture. In fact, the data processing stage takes 60 to 70 per cent of the time because the collected data are rarely in a correct, structured, noise-free form. Hence, producing clean data requires transforming and sampling the data, checking both the features and the observations in the dataset, using statistical techniques to remove noise, and handling missing values.
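Two of the cleaning steps mentioned above, handling missing values and reducing noise, can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the function names and the sample `ages` list are hypothetical, median imputation and interquartile-range clipping are just one common choice among many, and the source article does not prescribe any particular technique.

```python
from statistics import median, quantiles

def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def clip_outliers(values, k=1.5):
    """Clip each value to the range [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]

# Hypothetical raw column: one missing entry and one implausible age.
ages = [23, 25, None, 24, 26, 120, 25]
cleaned = clip_outliers(impute_missing(ages))
print(cleaned)
```

Here the missing entry is filled with the median of the observed ages, and the implausible value of 120 is pulled back to the upper bound of the accepted range instead of being dropped.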

Next comes data exploration, the stage at which data scientists stand apart from data engineers. This step also distinguishes data science from data analytics. In data analytics, the questions asked of the data are already determined, and only a narrower view is needed to answer them. Data scientists, meanwhile, attempt to develop better questions in order to gain more insight. This step involves several processes: data mining, data classification and clustering, data modelling, and summarizing the resulting insights. In data modelling, machine learning algorithms are fitted to the existing data. The resulting model is then tested for accuracy and adjusted iteratively until the best model is attained.
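The fit, test, and adjust loop described above can be illustrated with a small sketch. It assumes a k-nearest-neighbours classifier on a single numeric feature, and the data, labels, and function names are all hypothetical. The article does not name a specific algorithm, so this is only one possible instance of the idea.

```python
from collections import Counter

def knn_predict(train, point, k):
    """Predict the majority label among the k training points nearest to `point`."""
    nearest = sorted(train, key=lambda row: (row[0] - point) ** 2)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train, validation, k):
    """Fraction of validation points whose label is predicted correctly."""
    correct = sum(knn_predict(train, x, k) == y for x, y in validation)
    return correct / len(validation)

def best_k(train, validation, candidates):
    """Try each candidate k and keep the one with the highest validation accuracy."""
    scores = {k: accuracy(train, validation, k) for k in candidates}
    return max(scores, key=scores.get), scores

# Hypothetical data: (feature, label); the point at 3.5 is a noisy label.
train = [(1.0, "low"), (1.5, "low"), (2.0, "low"), (3.5, "high"),
         (6.0, "high"), (6.5, "high"), (7.0, "high")]
validation = [(1.2, "low"), (2.5, "low"), (3.0, "low"), (6.8, "high")]

chosen, scores = best_k(train, validation, candidates=[1, 3, 5])
print(chosen, scores)
```

With k = 1 the model follows the noisy training point and misclassifies one validation example. Trying larger values of k and keeping the best-scoring one is a tiny version of the iterative adjustment the paragraph describes.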

Subsequently, the data analysis step is carried out, consisting of exploratory and confirmatory work, regression, predictive analysis, qualitative analysis, and text mining. Finally, to communicate the findings, the data scientist performs data visualization and data reporting, uses various business intelligence tools, and assists businesses, policymakers, and others in making smarter decisions.

Data Science Role in the Industry

Data science is applied in many sectors, such as health care, business, marketing, banking and finance, and policy work.

In the health care sector, data science can use data to monitor and prevent health problems and emergencies. Indeed, according to McKinsey in 2018, a big data revolution in health care could reduce health care spending by 300 billion to 450 billion dollars, or 12 to 17 percent of its total cost.

When it comes to business, data science and analytics work together. Data science can help a company understand the exact needs of its customers from the data it has collected. One example is when a data scientist trains a model for search and product recommendation based on customer age, purchase history, past browsing history, income, and other demographics.
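A product recommendation of the kind mentioned above can be sketched very roughly as follows. The sketch assumes that only purchase history is available and scores items by how much a customer's basket overlaps with those of other customers. The customer names, items, and the `recommend` function are hypothetical, and a real system would typically also use the demographic features listed above.

```python
from collections import Counter

def recommend(purchases, customer, top_n=2):
    """Suggest items bought by customers whose purchase history overlaps with `customer`'s."""
    own = purchases[customer]
    scores = Counter()
    for other, items in purchases.items():
        if other == customer:
            continue
        overlap = len(own & items)       # number of shared purchases
        for item in items - own:         # only items the customer lacks
            scores[item] += overlap
    # Rank by score, breaking ties alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked if score > 0][:top_n]

# Hypothetical purchase histories.
purchases = {
    "ana": {"laptop", "mouse"},
    "budi": {"laptop", "mouse", "keyboard"},
    "citra": {"laptop", "monitor"},
    "dewi": {"novel"},
}
print(recommend(purchases, "ana"))
```

Items owned by customers with no purchases in common receive a score of zero and are never suggested.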

There are some differences between data science and business analytics, even though both involve collecting data, modelling it, and then gleaning insights from it. Business analysts focus more on business-related problems such as profit and costs. In contrast, a data scientist is more likely to take a wider view of everything that might influence the business. While business analysts use traditional statistical theory, data scientists apply more sophisticated technology and algorithms.

Data science and business intelligence are also somewhat different fields. Business intelligence creates dashboards from the data to answer business questions. Data science, meanwhile, is a more future-facing approach. It explores the data with a view to more informed decision making, and thus asks more open-ended questions, including what events happen as well as the reasons for them.

Data science also has a pivotal role in finance. It is an efficient and effective way to detect and prevent fraud as well as to identify problematic patterns in data. In addition, implementing data science in finance might also reduce non-performing assets and help to recognize downward trends faster.
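As a rough illustration of how problematic patterns might be spotted, the sketch below flags transaction amounts that lie far from the typical value. It assumes a robust z-score built from the median and the median absolute deviation, with a conventional cut-off of 3.5. The function name and the sample amounts are hypothetical, and real fraud detection uses far richer features and models.

```python
from statistics import median

def flag_unusual(amounts, threshold=3.5):
    """Return the amounts whose robust z-score exceeds `threshold`."""
    med = median(amounts)
    mad = median([abs(a - med) for a in amounts])  # median absolute deviation
    if mad == 0:
        return []  # no spread in the data, so nothing can be scored
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [a for a in amounts if 0.6745 * abs(a - med) / mad > threshold]

# Hypothetical card transactions; one is far larger than the rest.
amounts = [20, 22, 19, 21, 23, 20, 950]
print(flag_unusual(amounts))
```

The median is used instead of the mean because a single extreme transaction would otherwise distort the very baseline it is compared against.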

Data science can help government policymakers shape policies that meet their constituents' demands and overcome census undercounts. When they have to evacuate an area based on historical weather patterns, for example, policymakers can draw on geospatial data science and related data. Furthermore, data scientists can build models by compiling and exploring data sets obtained from aircraft, ships, satellites, and radars, so that the models improve every day. This may in turn help to forecast natural disasters more accurately, improve plantation governance, avert potential paradise disasters, and identify the best evacuation times.

Additionally, data science can support marketing. It makes it possible to understand consumer behaviour, which helps a company refine its pricing and strategies. In this sense, data science can explore the data to discover which market is best for a product or service, thus increasing profit. Moreover, it can forecast the best time to offer a particular product or service. Data science can do more than this: it can inform business strategy more broadly, such as identifying optimal price points, deciding how much to bid for advertising, or even finding the best way to acquire new customers.
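The idea of an optimal price point can be made concrete with a deliberately simple sketch. It assumes that the company has observed how many units it sold at a handful of prices and wants the price with the highest revenue. The figures and the `best_price` function are hypothetical, and a real analysis would also model costs, demand curves, and uncertainty.

```python
def best_price(observations):
    """Return the (price, units_sold) pair with the highest revenue."""
    return max(observations, key=lambda obs: obs[0] * obs[1])

# Hypothetical observations: (price in dollars, units sold at that price).
observations = [(8, 120), (10, 100), (12, 70), (15, 40)]
price, units = best_price(observations)
print(price, units, price * units)
```

In this made-up data set, raising the price above 10 dollars loses more in volume than it gains per unit, so revenue peaks at the second price.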

Nevertheless, a number of ethical issues should be considered when applying data science. Firstly, a business should collect only the data it requires and no more. These data should be protected with the best available technology under all circumstances. Businesses should also promote transparency and guard privacy in the way they accumulate data. Sensitive data in particular should be protected, since exposing it may lead to reputational damage and customer loss.

Why am I interested in learning data science?

There are at least two reasons why I am interested in learning data science.

Firstly, data is now considered the new gold because it acts as a raw material from which other benefits are produced. As we learned previously, data science lets us analyze data and create models that can then be used for personal advantage or even public benefit. Hence, a person who understands data science will be able to manage this new gold and its advantages.

Secondly, technology becomes more advanced as time passes. Because improving technology is not possible without exploring a tremendous amount of data, data science has become essential. Developing an automated personal assistant, for instance, requires learning from a huge amount of data, which becomes easier by applying data science. Thus, an individual who understands data science can follow and take part in this technological development.

Therefore, for these reasons, I have made up my mind to study data science.
