SALES DATA SERVED THREE WAYS (SQL)

View associated files on GitHub

Tools and Languages: SQL, SQL Server (see associated Excel and Power BI projects)

Description and Intent: This analysis of electronics, appliances and accessories sales data from the USA, derived from twelve Kaggle CSV files spanning the entirety of 2019, was conducted to answer the files' associated series of questions (updated slightly to reflect different/additional queries):

The analysis was carried out entirely within SQL (Microsoft SQL Server), with additional intentions to 'serve the data' with two additional options: Excel and Power BI.

Insights and Reporting: This analysis had been conducted previously in Excel, so there was some prior understanding of the data, as well as the option to cross-check calculated values. The six associated fields included:

The 12 CSV files were combined via the 'Union All' function into a new single table. The resulting data was explored in general and rows with blank and generic “Product”, “Order Date” and “Purchase Address” values, as well as 34 rows with dates outside of 2019, were removed. Some column types were converted from 'varchar' to numeric types to enable desired aggregations. 

In addition to the 'Union All' joins, a variety of aggregations were used, as was extraction of datetime components from 'varchar' date strings and substring extractions of locations from 'varchar' address values. CTEs, Subqueries and Case Statements were used at various points.

Despite the initial calculated values being identical to those in the associated Excel analysis, slight differences emerged when analysing the categories of products. This discovery highlighted very clearly the value of examining data from multiple angles, particularly in the absence of other assisting analysts. The SQL queries at this stage were re-examined and found to be calculating totals without the correct use of individual item prices. A second CTE was added and additional adjustments yielded values consistent with the Excel analysis. 

Finally, the posed questions were answered as follows:

Copyright © 2023, Brandon O'Donnell

LinkedInGitHubEmail