What I learned from the "Foundations: Data, Data, Everywhere" course
Alhamdulillah! I have successfully completed the first course of the Google Data Analytics Professional Certificate. Here I share with you all what I learned from this fundamental course, “Foundations: Data, Data, Everywhere”.
In the first week, I learned about data, data scientists, data analysts, data analysis, data analytics and the data analysis process. Data is a collection of facts or information. Working from a dataset, a data specialist called a data analyst can draw conclusions, make predictions and support decisions by analyzing the data. Data-driven decision-making is a strategic concept for steering a business correctly. It is defined as using facts to guide business strategy.
The confusing part is the difference between a data scientist and a data analyst, which I came to understand through a simple definition. A data scientist creates new ways of modelling and understanding the unknown using raw data. A data analyst, on the other hand, finds answers to existing questions by drawing insights from data sources.
Furthermore, data analysis and data analytics sound the same, but they are different. Data analytics is, in simple terms, the science of data: a broad concept that encompasses everything to do with data. Data analysis, on the other hand, is the collection, transformation and organization of data in order to draw conclusions, make predictions and drive decisions. There are six phases of the data analysis process: Ask, Prepare, Process, Analyze, Share and Act. Let’s have a simple discussion of the six phases:
Ask: The analyst tries to understand what type of project it is and what its outcome should be by asking effective questions. To determine these things, they ask the people who can help solve the problem, such as leaders or managers.
Prepare: After completing the previous step, the analyst needs to make a plan. They define a timeline for the project and start collecting facts based on the questions asked. The most important tasks in this phase are defining the timeline and collecting the data.
Process: After collecting the data, we need to check the dataset for missing information. We also need to manage and store the data securely. Cleaning the data is the most significant task in this phase.
Analyze: This is the analyst's main task. They need to apply their experience, knowledge and creativity, and analyze all of the facts to answer the questions asked.
Share: In this phase, analysts share their findings with stakeholders, have a productive conversation with the project's team members, and reach a conclusion.
Act: In the last stage, the analyst and the other team members try to put the data-driven decisions into practice in their company.
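To make the Process phase a little more concrete, here is a minimal sketch in Python of checking a dataset for missing information. The sample data, the column names and the `find_missing` helper are my own assumptions for illustration; they do not come from the course.

```python
import csv
import io

# Hypothetical sample data; the empty fields stand in for missing information.
RAW_CSV = """customer_id,city,monthly_spend
1,Dhaka,120
2,,95
3,Chittagong,
4,Sylhet,80
"""


def find_missing(rows):
    """Return (row_number, column) pairs for every empty field."""
    missing = []
    for row_number, row in enumerate(rows, start=1):
        for column, value in row.items():
            if value is None or value.strip() == "":
                missing.append((row_number, column))
    return missing


# Read the CSV text into a list of dictionaries, one per data row.
rows = list(csv.DictReader(io.StringIO(RAW_CSV)))

# Locate the gaps, then keep only the rows where every field is filled in.
missing = find_missing(rows)
complete_rows = [
    row for row in rows if all((value or "").strip() for value in row.values())
]

print(missing)             # which fields need attention
print(len(complete_rows))  # how many rows are ready for analysis
```

Dropping incomplete rows is only one possible cleaning choice. In a real project the analyst might instead go back to the source and fill in the gaps.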
Moreover, an ecosystem is a group of elements that interact with one another. Similarly, a data ecosystem is a group of elements that interact with each other to produce, manage, store, organize, analyze and share data.
We already know the general life cycle of data analysis:
Ask: Defining the business challenge, objective or question.
Prepare: Generating, collecting, storing and managing data.
Process: Data cleaning and data integrity.
Analyze: Data exploration, visualization and analysis.
Share: Communicating and interpreting the results.
Act: Using the analytical decision to solve the problem.
Let’s dive into a few popular data life cycles:
EMC’s data analysis life cycle: Discovery -> Pre-processing data -> Model planning -> Model building -> Communicate results.
SAS’s iterative life cycle: Ask -> Prepare -> Explore -> Model -> Implement -> Act -> Evaluate.
Project-based data analysis life cycle: Identifying the problem -> Designing data requirements -> Pre-processing data -> Data analysis -> Data visualization.
Big data analysis life cycle: Business case evaluation -> Data identification -> Data acquisition and filtering -> Data extraction -> Data validation and cleaning -> Data aggregation and representation -> Data analysis -> Data visualization -> Utilization of analysis results.
Data life cycle based on research: Generation -> Collection -> Preprocessing -> Storage -> Management -> Analysis -> Visualization -> Interpretation.
In the second week, I learned about data analyst skills, analytical skills and data-driven decisions.
Analytical skills are qualities and characteristics associated with solving problems using facts. There are five essential skills for a data analyst: curiosity, understanding context, having a technical mindset, data design and data strategy.
Curious people always seek out challenges and experiences. They are always hungry to learn something new. Context is the condition in which something exists or happens. It can be a structure or an environment. A technical mindset involves the ability to break things down into smaller pieces and work with them in an orderly and logical way. Data design means how you organize your information. The last skill is data strategy, which is the management of the people, processes and tools used in data analysis.
Furthermore, analytical thinking involves identifying and defining a problem and then solving it using data in an organized, step-by-step manner. There are five key aspects of analytical thinking: visualization, strategy, problem-orientation, correlation, and big-picture and detail-oriented thinking.
Visualization is the graphical representation of information, which helps us understand data. Strategy means knowing what solution we want to achieve and how to get there. Data analysts use problem-orientation to identify, describe and solve problems. Correlation is like a relationship between pieces of data. Big-picture thinking means seeing everything in one picture, while detail-oriented thinking means figuring out all of the things needed to execute the plan. Using data, we can gain insights, verify theories, better understand the business's position, and support and build a business plan. Data-driven decisions can improve any business in a lot of ways.
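The idea of correlation as a relationship between data can be illustrated with a small Python sketch. The course did not go into any formula, so the use of the Pearson correlation coefficient, the `pearson_correlation` helper and the advertising and sales numbers below are my own assumptions.

```python
from math import sqrt


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length number lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How the two variables move together around their means.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own.
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a list is constant")
    return covariance / (spread_x * spread_y)


# Hypothetical figures: sales rise in step with advertising spend.
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]

r = pearson_correlation(ad_spend, sales)
print(round(r, 3))  # close to 1 means a strong positive relationship
```

A value near 1 or -1 signals a strong relationship and a value near 0 a weak one. Keep in mind that correlation alone does not show that one thing causes the other.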
In the third week, I learned about the life cycle of data, the outline of the data analysis process and the data analysis toolbox. The data life cycle comprises six phases: Plan, Capture, Manage, Analyze, Archive and Destroy. Of these, planning is the most significant, because we need to understand what type of data we need to collect, how it will be managed throughout the life cycle, who will be responsible for it and what the optimal outcome is. The next phase is Capture, where we collect data from its sources. Another important phase is Manage, in which we store the data properly and keep it secure. Next comes Analyze: without analyzing the data we cannot draw any conclusions, and this is the phase in which we solve the problem. Archiving means storing data in a place where it is still available, but may not be used again. Lastly, we destroy the data in order to protect it.
Then the course revisited the data analysis process discussed earlier. The most commonly used tools in data analysis are spreadsheets, query languages and visualization tools. There are a lot of spreadsheet programs, but two popular options are Microsoft Excel and Google Sheets. A query language is a computer programming language that allows you to retrieve or manipulate data from a database. The most popular query language is SQL. A database is a collection of data stored in a computer system. We also use visualization techniques, which help us understand a dataset easily. A visualization is a graphical representation of data. Graphs, maps and tables are examples of visualizations. The most popular visualization tools are Tableau and Looker. Both are useful for a data analyst.
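To show what retrieving data with a query language looks like, here is a minimal sketch that runs SQL from Python using the built-in `sqlite3` module. The `sales` table, its columns and the numbers are hypothetical and chosen only for illustration. The course itself did not use this setup.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table with one row per sale.
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("North", 120.0), ("South", 80.0), ("North", 60.0)],
)

# The SQL query: total sales per region, in alphabetical order.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

print(totals)
```

The `SELECT ... GROUP BY` statement is the part that is actually SQL. Python is only used here to create the small database and run the query.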
In the fourth week, the course gave a fairly simple introduction to spreadsheets and SQL. On October 17th, 2019, Google celebrated the 40th anniversary of the computer spreadsheet. The first computer spreadsheet was VisiCalc. A spreadsheet is made up of a large number of cells, each identified by its column and row. In a dataset, the column labels are usually called attributes. An attribute is a characteristic or quality of data. A row, on the other hand, is called an observation. An observation includes all of the attributes for something contained in a row of a data table. There are many other ways to work in a spreadsheet, including functions and formulas. Moreover, SQL can do lots of the same things with data that a spreadsheet can. You can use it to store, organize and analyze data, among other things. Finally, the course introduced Qwiklabs.
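The difference between attributes (columns) and observations (rows) can be sketched in a few lines of Python. The names, departments and salaries below are made up, and the `column` helper is my own illustration rather than anything from the course.

```python
# Column labels: each one is an attribute of the data.
attributes = ["name", "department", "salary"]

# Each inner list is one observation: all attributes for one employee.
observations = [
    ["Amina", "Finance", 52000],
    ["Rahim", "Marketing", 48000],
    ["Sadia", "Finance", 61000],
]


def column(attribute):
    """Return every value stored under one attribute, top to bottom."""
    index = attributes.index(attribute)
    return [row[index] for row in observations]


# Comparable to the spreadsheet formula =SUM(C2:C4) when row 1 holds the labels.
total_salary = sum(column("salary"))

# Comparable to =AVERAGE(C2:C4).
average_salary = total_salary / len(observations)

print(column("department"))
print(total_salary)
```

Reading across a row gives everything known about one employee, while reading down a column gives one characteristic for all employees.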
In the fifth and last week, the course talked about why we are learning these skills. Now I know that there are loads of companies in technology, marketing, finance, health care and other industries that need to analyze their data to improve their business. That’s why they need data analysts. Additionally, the course talked about the importance of fair business decisions and discussed the issues companies face. An issue is a topic or subject to investigate. A question is designed to discover information, and a problem is an obstacle or complication that needs to be worked out. Next, the course discussed how we can find our dream job. For this, we need to focus on some common factors: which industry I would like to work in, which tools are essential for me, a suitable location, travel and company culture. All of these factors are significant for a data analyst.
There are many job titles for data analysts:
- Business Analyst: analyzes data to improve business processes, products, or services.
- Business Intelligence Analyst: analyzes data for finance or market insight.
- Data Analytics Consultant: analyzes the systems or models for using data.
- Data Engineer: prepares and integrates data from different sources for analytical use.
- Data Scientist: uses expert skills in technology and social science to find trends through data analysis.
- Data Specialist: organizes or converts data for use in a database or software system.
- Operations Analyst: analyzes data to assess the performance of business operations and workflows.
In addition, industry-specific specialist positions that you might come across in your data analyst job search include:
- Marketing analyst: analyzes market conditions from sales or service data.
- HR analyst: analyzes payroll data for inefficiencies and errors.
- Financial analyst: analyzes financial status by monitoring and reviewing data.
- Risk analyst: analyzes financial documents, economic conditions and client data to determine a company's level of risk.
- Healthcare analyst: analyzes medical data to improve the business aspect of hospitals and medical facilities.
Finally, I earned the certificate by passing the quizzes.
Now I am waiting to start the second course. In-sha-Allah I will share what I learn.