Day - 6 of SQL for Data Science

Day - 6 of SQL for Data Science

SQL Joins are a crucial aspect of database management and querying. They enable us to combine data from multiple database tables, unleashing the true power of data integration and analysis. Mastering SQL Joins will undoubtedly elevate your data manipulation skills.

Use the ERD Diagram to understand and write queries :

No alt text provided for this image

What is SQL Joins?

SQL Joins are an essential aspect of working with databases and are used to combine data from multiple tables. They allow us to retrieve related data from different tables based on a specified condition, enabling powerful data analysis and insights.

There are several types of SQL Joins :

  • INNER JOIN

The most commonly used join, it returns only the matching rows from both tables. It filters out non-matching rows, leaving you with a result set containing only relevant data.

  • LEFT JOIN (or LEFT OUTER JOIN)

This join returns all the rows from the left table and the matching rows from the right table. If there is no match, it fills in the gaps with NULL values

  • RIGHT JOIN (or RIGHT OUTER JOIN)

Similar to the LEFT JOIN, but it returns all the rows from the right table and matching rows from the left table. Non-matching rows will have NULL values in the result.

  • FULL OUTER JOIN

Combining both LEFT and RIGHT joins, this returns all rows when there is a match in either of the tables. If there's no match, NULL values will fill in the gaps.

ON

With the addition of the JOIN statement to our toolkit, we will also be adding the ON statement.

We use ON clause to specify a JOIN condition which is a logical statement to combine the table in FROM and JOIN statements.

Example

Below we see an example of a query using a JOIN statement. Let's discuss what the different clauses of this query.

SELECT orders.*
FROM orders
JOIN accounts
ON orders.account_id = accounts.id;        

  • The SELECT clause indicates which columns of data you'd like to see in the output (For Example, orders. gives us all the columns in orders table in the output).
  • The FROM clause indicates the first table from which we're pulling data, and the JOIN indicates the second table.
  • The ON clause specifies the column on which you'd like to merge the two tables together.

We are able to pull data from two tables :

  1. orders
  2. accounts

Above, we are only pulling data from the orders table since in the SELECT statement we only reference columns from the orders table.

The ON statement holds the two columns that get linked across the two tables.

Primary and Foreign Keys

  • Primary Key

primary key is a unique column in a particular table. This is the first column in each of our tables. Here, those columns are all called id, but that doesn't necessarily have to be the name. It is common that the primary key is the first column in our tables in most databases.

Foreign Key

foreign key is a column in one table that is a primary key in a different table. We can see in the Parch & Posey ERD that the foreign keys are:

  1. region_id
  2. account_id
  3. sales_rep_id

Each of these is linked to the primary key of another table.

No alt text provided for this image

Note

Notice our SQL query has the two tables we would like to join - one in the FROM and the other in the JOIN. Then in the ON Statement, we will always have the PK equal to the FK.

ALIAS

When we JOIN tables together, it is nice to give each table an alias. Frequently an alias is just the first letter of the table name.

Example

FROM tablename AS t1
JOIN tablename2 AS t2        

Aliases for Columns in Resulting Table

While aliasing tables is the most common use case. It can also be used to alias the columns selected to have the resulting table reflect a more readable name.

Example

Select t1.column1 aliasname, t2.column2 aliasname2
FROM tablename AS t1
JOIN tablename2 AS t2        

Every JOIN we have done up to this point has been an INNER JOIN. That is, we have always pulled rows only if they exist as a match across two tables.

Our new JOINs allow us to pull rows that might only exist in one of the two tables. This will introduce a new data type called NULL.

LEFT and RIGHT JOIN

LEFT JOIN - pulls all the data that exists in both tables, as well as all of the rows from the table in the FROM even if they do not exist in the JOIN statement.

RIGHT JOIN - pulls all the data that exists in both tables, as well as all of the rows from the table in the JOIN even if they do not exist in the FROM statement.


To view or add a comment, sign in

More articles by Mrityunjay Pathak

  • Bias and Variance and Its Trade Off

    There are various ways to evaluate a machine-learning model. Bias and Variance are one such way to help us in parameter…

  • Machine Learning Mathematics🔣

    Machine Learning is the field of study that gives computers the capability to learn without being explicitly…

  • How to Modify your GitHub Profile Readme File as your Portfolio

    What if you don't have a personal portfolio website? No worries! You can transform your GitHub README.md into a…

    4 Comments
  • Data Science Resources

    Are you starting your journey into the world of Data Science? Here's a curated list of top resources to master various…

  • 25 Python Sets Questions with Solution

    25 Python Sets Coding Questions along with Explanations for each. Let's get started ↓ Question 1: Write a Python…

  • 25 Python Tuple Questions with Solution

    25 Python Tuple Coding Questions along with Explanations for each. Let's get started ↓ Question 1: Find the length of a…

  • 25 Python Dictionary Questions and Solutions

    25 Python Dictionary Coding Questions along with Explanations for each. Let's get started ↓ Question 1: Create an empty…

  • 25 Python List Questions with Solution

    25 Python List Coding Questions along with Explanations for each. Let's get started ↓ Question: Given a list nums, find…

    2 Comments
  • 25 Python String Questions with Solution

    25 Python Strings Coding Questions along with Explanations for each. Let's get started ↓ Write a Python program to…

    3 Comments
  • 25 Python Loop Coding Questions

    25 Python Loop Coding Questions along with Explanations for each. Let's get started ↓ Print numbers from 1 to 10 using…

    3 Comments

Insights from the community

Others also viewed

Explore topics