This project demonstrates advanced SQL skills that can transform raw data into strategic insights.
CTEs and Window Functions for Business IntelligenceThis project demonstrates my expertise in advanced SQL techniques by analyzing a comprehensive sales database using Common Table Expressions (CTEs) and Window Functions. The analysis transforms raw operational data into strategic insights that drive business decisions.
The analysis was performed on a relational database with the following complete tables:
OrderID | ProductID | CustomerID | SalesPersonID | OrderDate | ShipDate | OrderStatus | ShipAddress | BillAddress | Quantity | Sales | CreationTime |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 101 | 2 | 3 | 2025-01-01 | 2025-01-05 | Delivered | 9833 Mt. Dias Blv. | 1226 Shoe St. | 1 | 10 | 2025-01-01 12:34:56.0000000 |
2 | 102 | 3 | 3 | 2025-01-05 | 2025-01-10 | Shipped | 250 Race Court | NULL | 1 | 15 | 2025-01-05 23:22:04.0000000 |
3 | 101 | 1 | 5 | 2025-01-10 | 2025-01-25 | Delivered | 8157 W. Book | 8157 W. Book | 2 | 20 | 2025-01-10 18:24:08.0000000 |
4 | 105 | 1 | 3 | 2025-01-20 | 2025-01-25 | Shipped | 5724 Victory Lane | NULL | 2 | 60 | 2025-01-20 05:50:33.0000000 |
5 | 104 | 2 | 5 | 2025-02-01 | 2025-02-05 | Delivered | NULL | NULL | 1 | 25 | 2025-02-01 14:02:41.0000000 |
6 | 104 | 3 | 5 | 2025-02-05 | 2025-02-10 | Delivered | 1792 Belmont Rd. | NULL | 2 | 50 | 2025-02-06 15:34:57.0000000 |
7 | 102 | 1 | 1 | 2025-02-15 | 2025-02-27 | Delivered | 136 Balboa Court | NULL | 2 | 30 | 2025-02-16 06:22:01.0000000 |
8 | 101 | 4 | 3 | 2025-02-18 | 2025-02-27 | Shipped | 2947 Vine Lane | 4311 Clay Rd | 3 | 90 | 2025-02-18 10:45:22.0000000 |
9 | 101 | 2 | 3 | 2025-03-10 | 2025-03-15 | Shipped | 3768 Door Way | NULL | 2 | 20 | 2025-03-10 12:59:04.0000000 |
10 | 102 | 3 | 5 | 2025-03-15 | 2025-03-20 | Shipped | NULL | NULL | 0 | 60 | 2025-03-16 23:25:15.0000000 |
CustomerID | FirstName | LastName | Country | Score |
---|---|---|---|---|
1 | Jossef | Goldberg | Germany | 350 |
2 | Kevin | Brown | USA | 900 |
3 | Mary | NULL | USA | 750 |
4 | Mark | Schwarz | Germany | 500 |
5 | Anna | Adams | USA | NULL |
ProductID | Product | Category | Price |
---|---|---|---|
101 | Bottle | Accessories | 10 |
102 | Tire | Accessories | 15 |
103 | Socks | Clothing | 20 |
104 | Caps | Clothing | 25 |
105 | Gloves | Clothing | 30 |
EmployeeID | FirstName | LastName | Department | BirthDate | Gender | Salary | ManagerID |
---|---|---|---|---|---|---|---|
1 | Frank | Lee | Marketing | 1988-12-05 | M | 55000 | NULL |
2 | Kevin | Brown | Marketing | 1972-11-25 | M | 65000 | 1 |
3 | Mary | NULL | Sales | 1986-01-05 | F | 75000 | 1 |
4 | Michael | Ray | Sales | 1977-02-10 | M | 90000 | 2 |
5 | Carol | Baker | Sales | 1982-02-11 | F | 55000 | 3 |
OrderID | ProductID | CustomerID | SalesPersonID | OrderDate | ShipDate | OrderStatus | ShipAddress | BillAddress | Quantity | Sales | CreationTime |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 101 | 2 | 3 | 2024-04-01 | 2024-04-05 | Shipped | 123 Main St | 456 Billing St | 1 | 10 | 2024-04-01 12:34:56.0000000 |
2 | 102 | 3 | 3 | 2024-04-05 | 2024-04-10 | Shipped | 456 Elm St | 789 Billing St | 1 | 15 | 2024-04-05 23:22:04.0000000 |
3 | 101 | 1 | 4 | 2024-04-10 | 2024-04-25 | Shipped | 789 Maple St | 789 Maple St | 2 | 20 | 2024-04-10 18:24:08.0000000 |
4 | 105 | 1 | 3 | 2024-04-20 | 2024-04-25 | Shipped | 987 Victory Lane | NULL | 2 | 60 | 2024-04-20 05:50:33.0000000 |
4 | 105 | 1 | 3 | 2024-04-20 | 2024-04-25 | Delivered | 987 Victory Lane | NULL | 2 | 60 | 2024-04-20 14:50:33.0000000 |
5 | 104 | 2 | 5 | 2024-05-01 | 2024-05-05 | Shipped | 345 Oak St | 678 Pine St | 1 | 25 | 2024-05-01 14:02:41.0000000 |
6 | 104 | 3 | 5 | 2024-05-05 | 2024-05-10 | Delivered | 543 Belmont Rd. | NULL | 2 | 50 | 2024-05-06 15:34:57.0000000 |
6 | 104 | 3 | 5 | 2024-05-05 | 2024-05-10 | Delivered | 543 Belmont Rd. | 3768 Door Way | 2 | 50 | 2024-05-07 13:22:05.0000000 |
6 | 101 | 3 | 5 | 2024-05-05 | 2024-05-10 | Delivered | 543 Belmont Rd. | 3768 Door Way | 2 | 50 | 2024-05-12 20:36:55.0000000 |
7 | 102 | 3 | 5 | 2024-06-15 | 2024-06-20 | Shipped | 111 Main St | 222 Billing St | 0 | 60 | 2024-06-16 23:25:15.0000000 |
Common Table Expressions (CTEs) were used to create modular, readable queries that build upon each other to answer complex business questions.
This query calculates total sales per customer, identifies their last order date, ranks them by sales performance, and segments them into categories for targeted marketing.
/*
=======================================================================
CTE (NON RECURSIVE) MINI PROJECT
=======================================================================
*/
--CTE (Standalone)
--Find the total sales per customer
WITH TotalSalesPerCustomer AS
(SELECT
CustomerID,
SUM(Sales) TotalSales
FROM [Sales].[Orders]
GROUP BY CustomerID
),
--CTE (Multiple Standalone) USE a comma to separate each standalone CTE and do not repeat the with
--Find the last order date for each customers
LastOrderDate AS
(SELECT
CustomerID,
MAX(OrderDate) LastOrderDate
FROM [Sales].[Orders]
GROUP BY CustomerID
--Note, you can use any sql statement within the CTE except the ORDER BY
),
--CTE (Nested)
--Rank the customers based on Total Sales per customer
CustomerRanking AS
(SELECT
CustomerID,
TotalSales,
RANK() OVER(ORDER BY TotalSales DESC) CustomerRank
FROM TotalSalesPerCustomer
),
--Segment the customers based on Total Sales (CTE (Nested) )
CustomerSegmentation AS
(SELECT
CustomerID,
CASE WHEN TotalSales > 110 THEN 'HighSales'
WHEN TotalSales BETWEEN 90 AND 110 THEN 'MediumSales'
ELSE 'LowSales' END Segment
FROM TotalSalesPerCustomer
)
SELECT
C.CustomerID,
C.FirstName,
C.LastName,
T.TotalSales,
L.LastOrderDate,
CR.CustomerRank,
CS.Segment
FROM [Sales].[Customers] C
LEFT JOIN TotalSalesPerCustomer T ON T.CustomerID = C.CustomerID
LEFT JOIN LastOrderDate L ON L.CustomerID = C.CustomerID
LEFT JOIN CustomerRanking CR ON CR.CustomerID = C.CustomerID
LEFT JOIN CustomerSegmentation CS ON CS.CustomerID = C.CustomerID
ORDER BY TotalSales DESC;
This analysis enables the marketing team to:
Window functions were used to perform complex calculations across sets of rows while retaining detailed data.
This comprehensive query provides multiple perspectives on sales performance including product totals, rankings, and comparisons to averages.
/*
========================================
WINDOW FUNCTIONS
========================================
*/
USE SalesDB;
GO
SELECT
OrderID,
ProductID,
OrderDate,
OrderStatus,
Sales,
SUM(Sales) OVER(PARTITION BY ProductID) ProductTotal,
SUM(Sales) OVER(PARTITION BY ProductID, OrderStatus) ProductTotalbyStatus,
SUM(Sales) OVER() TotalSales,
ROUND(CAST(Sales AS FLOAT)/SUM(Sales) OVER() * 100, 2) PercentageOfTotal,
AVG(Sales) OVER() AverageSales,
AVG(Sales) OVER(PARTITION BY ProductID) AverageSalesbyProduct,
MAX(Sales) OVER() HighestSales,
MIN(Sales) OVER() LowestSales
FROM [Sales].[Orders]
This comprehensive view helps business leaders:
Window functions can also be used to identify potential data quality issues, as shown in this duplicate detection query:
--Check whether the table 'OrdersArchive' contains any duplicates
--Checking data quality
SELECT *
FROM (
SELECT
OrderID,
COUNT(OrderID) OVER(PARTITION BY OrderID) IDCheck
FROM [Sales].[OrdersArchive]
) T
WHERE IDCheck != 1
Maintaining data integrity is crucial for accurate reporting. This query helps:
Additional window function implementations demonstrating advanced analytical capabilities:
--Rank each of the orders based on their sales from highest to lowest
SELECT
OrderID,
OrderDate,
Sales,
RANK() OVER(ORDER BY Sales DESC) SalesRank
FROM [Sales].[Orders]
--Calculate the moving average of sales for each product over time
SELECT
OrderID,
ProductID,
OrderDate,
Sales,
AVG(Sales) OVER(PARTITION BY ProductID) AverageSales,
AVG(Sales) OVER(PARTITION BY ProductID ORDER BY OrderDate) MovingAverage
FROM [Sales].[Orders]
--Find the deviation of each sales from the minimum and maximum sales amount
SELECT
OrderID,
Sales,
MAX(Sales) OVER() HighestSales,
MIN(Sales) OVER() LowestSales,
ABS(Sales - MAX(Sales) OVER()) SalesDeviationFromMax,
(Sales - MIN(Sales) OVER()) SalesDeviationFromMin
FROM [Sales].[Orders]
These advanced techniques enable: