Unraveling a Crazy SQL Query: Finding the Most Loyal Customer Step by Step
SQL queries can sometimes feel like solving a puzzle, especially when you’re trying to dig deep into data to uncover something specific. In this article, we’re going to tackle a fun yet slightly wild SQL query scenario: finding the most loyal customer who has ordered the same item the most times in a single year (2024, in our case). We’ll break it down step by step, starting from scratch, making it easy to understand even if you’re new to SQL. By the end, you’ll see how we crafted a “crazy” query that’s both powerful and practical. Let’s dive in!
Step 1: Understand the Problem
Before writing any code, let’s define what we’re trying to achieve:
This sounds like a real-world problem—think of a coffee shop wanting to find out who’s buying their signature latte the most or an e-commerce site tracking repeat purchases of a popular item.
Step 2: Sketch the Database Schema
To make sense of the query, let’s visualize the tables and their relationships:
From this, we can see:
Step 3: Start with the Basics—Joining Tables
To get the customer name, product name, and order details, we need to combine the orders, customers, and products tables. Since orders links customers and products via customer_id and product_id, we’ll use JOIN operations.
Let’s start with a simple query to fetch all orders with customer and product details:
sql
SELECT
c.customer_name,
p.product_name,
o.order_date
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN products p ON o.product_id = p.product_id
WHERE YEAR(o.order_date) = 2024;
Explanation:
Sample Output (based on our example data):
customer_name | product_name | order_date
--------------+--------------+------------
Alice | Latte | 2024-01-05
Alice | Latte | 2024-02-10
Bob | Latte | 2024-03-15
Alice | Espresso | 2024-04-20
Alice | Latte | 2024-05-25
This gives us a raw list of orders, but we need to count how many times each customer ordered each product.
Step 4: Group and Count Orders
To find out how many times each customer ordered each product, we need to group the data by customer and product and count the orders:
sql
SELECT
c.customer_name,
p.product_name,
COUNT(*) as order_count
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN products p ON o.product_id = p.product_id
WHERE YEAR(o.order_date) = 2024
GROUP BY c.customer_id, c.customer_name, p.product_id, p.product_name;
Explanation:
Sample Output:
customer_name | product_name | order_count
--------------+--------------+------------
Alice | Latte | 3
Alice | Espresso | 1
Bob | Latte | 1
Now we can see that Alice ordered a Latte 3 times, an Espresso once, and Bob ordered a Latte once. But we only want the customer(s) with the highest order count (in this case, Alice’s 3 Lattes).
Step 5: Find the Maximum Order Count
To get only the customer-product pair with the most orders, we need to:
Recommended by LinkedIn
This is where the query starts getting “crazy” because we need a subquery to calculate the maximum order count. Let’s modify the query:
sql
SELECT
c.customer_name,
p.product_name,
COUNT(*) as order_count
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN products p ON o.product_id = p.product_id
WHERE YEAR(o.order_date) = 2024
GROUP BY c.customer_id, c.customer_name, p.product_id, p.product_name
HAVING COUNT(*) = (
SELECT MAX(order_count)
FROM (
SELECT COUNT(*) as order_count
FROM orders o2
JOIN customers c2 ON o2.customer_id = c2.customer_id
JOIN products p2 ON o2.product_id = p2.product_id
WHERE YEAR(o2.order_date) = 2024
GROUP BY c2.customer_id, p2.product_id
) max_orders
);
Explanation:
Sample Output:
customer_name | product_name | order_count
--------------+--------------+------------
Alice | Latte | 3
This shows that Alice is our most loyal customer for Lattes, with 3 orders. If another customer also had 3 orders for a different product, they’d appear here too.
Step 6: Add a Touch of Polish
To make the output cleaner and ensure consistent sorting, let’s add an ORDER BY clause to sort by order_count (descending) and then by customer_name and product_name (alphabetically):
sql
SELECT
c.customer_name,
p.product_name,
COUNT(*) as order_count
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN products p ON o.product_id = p.product_id
WHERE YEAR(o.order_date) = 2024
GROUP BY c.customer_id, c.customer_name, p.product_id, p.product_name
HAVING COUNT(*) = (
SELECT MAX(order_count)
FROM (
SELECT COUNT(*) as order_count
FROM orders o2
JOIN customers c2 ON o2.customer_id = c2.customer_id
JOIN products p2 ON o2.product_id = p2.product_id
WHERE YEAR(o2.order_date) = 2024
GROUP BY c2.customer_id, p2.product_id
) max_orders
)
ORDER BY order_count DESC, customer_name, product_name;
Explanation:
Step 7: Why Is This Query “Crazy”?
This query might seem straightforward now, but it’s “crazy” for a few reasons:
Yet, it’s practical and solves a real-world problem—perfect for impressing your data team or debugging a business question!
Step 8: Testing and Debugging Tips
When crafting a query like this, you might hit some snags. Here’s how to debug:
Step 9: Real-World Applications
This query can be adapted for many scenarios:
You could extend it by:
Final Query
Here’s the complete, polished query one more time:
sql
SELECT
c.customer_name,
p.product_name,
COUNT(*) as order_count
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN products p ON o.product_id = p.product_id
WHERE YEAR(o.order_date) = 2024
GROUP BY c.customer_id, c.customer_name, p.product_id, p.product_name
HAVING COUNT(*) = (
SELECT MAX(order_count)
FROM (
SELECT COUNT(*) as order_count
FROM orders o2
JOIN customers c2 ON o2.customer_id = c2.customer_id
JOIN products p2 ON o2.product_id = p2.product_id
WHERE YEAR(o2.order_date) = 2024
GROUP BY c2.customer_id, p2.product_id
) max_orders
)
ORDER BY order_count DESC, customer_name, product_name;