Querying with Microsoft SQL Server 2016®
SQL is probably the most important thing you will ever learn. That is just a fact of life at this point.
You made it this far so you must already know that Pollack Analytics is your only source that allows you to quickly learn SQL. Then you can show your crush down the street how cool you are with your new SQL skills. Trust us, all the hours you’re losing at the gym to learn SQL really pays off.
Let’s get started!
What is SQL? It stands for Structured Query Language. That roughly translates to “words that help organize and manipulate data.” If you did not already upload the provided data set then please visit this page -> Data
If you are really struggling and don’t have SQL yet then check out this page first -> (Coming Soon!)
Lesson I - The Basics Run the following scripts and get engaged with it. Oh, and the * means “all columns” it saves you time instead of typing every column out. But wait, how do I run the script? The SELECT statement always starts your query and is essentially your column headers. You type out the columns you want to see and SQL Server will know to return those values from your data source. Now for the WHERE clause Run that, do you see what it did? It is like a filter for your data. The WHERE clause tells SQL Server what specific data you want to see. For this example we only wanted to see all of the data that had a Product_Class_ID of 102, it then resulted in 7 rows of data. Just think of this as a filter for now.
To run a section of the script all you have to do is highlight it and then push that nice big button that says “Execute”, with the red exclamation mark. Another way and our personal favorite would be to highlight the text and push F5.SELECT *
FROM PL_DW.dbo.Customer_Master
SELECT *
FROM PL_DW.dbo.Date_Master
SELECT *
FROM PL_DW.dbo.Facility_Master
SELECT *
FROM PL_DW.dbo.Product_Master
SELECT *
FROM PL_DW.dbo.SalesRep_Master
SELECT *
FROM PL_DW.dbo.Sales_Master
The FROM clause tells SQL Server which table(s) too use to get your columns. And yes, tables with an S, we will cover that later.SELECT *
FROM PL_DW.dbo.Product_Master
WHERE Product_Class_ID = 102
Lesson II - Aggregate Functions Let’s use our sales data for this dbo.Sales_Master. Run the following script and then we will review. So many new things! The COUNT function essentially counts or tells your the number of rows there are for the column you identified, in this case the Order_ID column. We also named the column or gave it an alias with the AS. If we didn’t then it would just say “(No column name) which is boring, we’re not about that life, we make things pretty here! I don’t care what anyone says, structure, formatting and readability matter, so just stick with us at Pollack Analytics and you’ll be happy, and so will your boss. #PollackAnalytics On to some more aggregate functions. But wait, what does that dope, green colored text mean? SELECT COUNT(Order_ID) AS Total_Orders
FROM PL_DW.dbo.Sales_Master
--RETURN THE TOTAL NUMBER OF BOARDS SOLD
SELECT SUM(Piece_Total) AS Total_Pieces
FROM PL_DW.dbo.Sales_Master
--RETURN THE ORDER WITH THE LARGEST NUMBER OF BOARDS
SELECT MAX(Piece_Total) AS Largest_Order
FROM PL_DW.dbo.Sales_Master
--RETURN THE ORDER WITH THE SMALLEST NUMBER OF BOARDS
SELECT MIN(Piece_Total) AS Smallest_Order
FROM PL_DW.dbo.Sales_Master
--RETURN AVERAGE NUMBER OF BOARDS PER ORDER
SELECT AVG(Piece_Total) AS Smallest_Order
FROM PL_DW.dbo.Sales_Master
You are so inquisitive! If you want to write some notes or comments in your script but you don’t want SQL to do anything with it then you can comment it out. There are two ways to do this. The first way is what’s used above, two little dashes — and you got yourself some comments. We really value comments, you’re spending time on your scripts, why not give a little background on what it is supposed to be doing? Sharing is caring, so share your thoughts through your magical little green comments. Also, you can comment parts of your script with this /* COMMENTS GO HERE */ and then SQL won’t read them. More on that later.
Lesson III - Other Main Clauses I sure wish I had a GROUP BY of friends to learn SQL with… It is kind of cool that we can find the total number of orders and the minimum and maximum value of the boards sold but what if you want to know the totals segmented by values of other columns? If you are wondering the same thing then we have a treat for you. Let’s take an injury timeout and review those commas real quick. Formatting and your writing style for SQL is your choice, and there is no real “correct” way to do it, but everything we write is definitely the cleanest and without a doubt the best way. You will run into people though who say this is wrong, and they will probably smell a little funky. So be a trend setter, follow the cool kids, and just write you SQL scripts like this. It can’t be that wrong if it feels so right, right? Moving on to the GROUP BY clause, this clause tells SQL to aggregate the rows but to also show some columns too, and you GROUP BY the columns you put in your SELECT statement. If you do not do this you will get an error: You will probably encounter this error a lot, but it’s okay, now you know that you just have to align your SELECT with your GROUP BY, it is a piece of cake. Another injury timeout, for some reason people do not want to read these error messages and freak out. It literally tells you the issue. SQL is telling you that the column is not valid in the SELECT list because it is not contained in either an aggregate function or the GROUP BY clause. It tells you! So just take a few minutes and read it and contemplate what the error is trying to tell you before the start to cry. If after a minute you still want to cry, we will allow it. On to the next clause! This one is simple, if you want to sort your data you use an ORDER BY clause. You can sort or ORDER BY on multiple fields and you can also choose if it shows in ascending ASC or descending DESC order, you just have to specify that after the column name, as shown above. The last clause that we will cover in this lesson is the HAVING clause. This one is like the WHERE clause but for aggregate functions. You’ll see that if you write a query that pulled Sales Reps who had at least 15 or more orders that it will error out. This will happen because you cannot use the WHERE clause on aggregate functions. And you will get this… Again, just read this error and you will see what you are doing wrong. The correct query is below. This concludes Lesson III, if we are moving too fast please do go back and review. We try to compound lessons, so you will continue to ingrain these fundamentals into your mind and they will become second nature. If you’re totally freaked out then scroll to the top and run through this again. If you’re at least a 3 out of 10 on confidence then move on to Lesson IV, it will all start coming together. #PollackAnalytics --RETURN THE TOTAL NUMBER OF BOARDS SOLD BY MATERIAL ID
SELECT Material_ID
,SUM(Piece_Total) AS Total_Pieces
FROM PL_DW.dbo.Sales_Master
GROUP BY Material_ID
--RETURN THE MIN & MAX NUMBER OF BOARDS SOLD ON AN ORDER BY MATERIAL ID
SELECT Material_ID
,MAX(Piece_Total) AS Largest_Order
,MIN(Piece_Total) AS Smallest_Order
,AVG(Piece_Total) AS Averge_Order
FROM PL_DW.dbo.Sales_Master
GROUP BY Material_ID
*Disclaimer – If your boss says it is wrong then don’t tell him he smells, and keep any laughter to yourself, just listen to him and do it the wrong way.
Column 'PL_DW.dbo.Sales_Master.Material_ID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Grouping sounds like it should also sort your data but SQL has a separate function for that.--FIND THE TOTAL ORDERS SOLD BY EACH SALES REP IN THE DATA SET
SELECT SalesRep_ID
,COUNT(DISTINCT Order_ID) AS Total_Orders
FROM PL_DW.dbo.Sales_Master
GROUP BY SalesRep_ID
ORDER BY Total_Orders ASC
--INCORRECT USAGE OF FILTERING AN AGGREGATE -> WHERE CLAUSE
SELECT SalesRep_ID
,COUNT(DISTINCT Order_ID) AS Total_Orders
FROM PL_DW.dbo.Sales_Master
WHERE COUNT(DISTINCT Order_ID) > 15
GROUP BY SalesRep_ID
ORDER BY Total_Orders ASC
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.
--CORRECT USAGE OF FILTERING AN AGGREGATE -> HAVING CLAUSE
SELECT SalesRep_ID
,COUNT(DISTINCT Order_ID) AS Total_Orders
FROM PL_DW.dbo.Sales_Master
GROUP BY SalesRep_ID
HAVING COUNT(DISTINCT Order_ID) > 15
ORDER BY Total_Orders ASC
Lesson IV - Date Functions This lesson will be more of a reference guide for all of your future endeavors, if you don’t believe us check out Microsoft’s official trash site on dates -> Microsoft SQL – Date Functions. They might be the professionals but they do not make it easy to learn things. But we love you, Microsoft, you’re the main bro, don’t ever forget that. Also, this part is not unique at all, at this point in the SQL world this date stuff is essentially general knowledge, but please do enjoy. Those are easy to write if you’re in a pinch, but probably not the best way to go about it. The following examples make more sense for the long-term. This is great if you want to quickly find out a month, week, etc. from your data, but it might not be used that much. Also, you may have noticed the CURRENT_TIMESTAMP function and the lack of a FROM clause. The CURRENT_TIMESTAMP is a built in function that gives you the date you installed SQL… just kidding, it gives you the current date and time, obviously. Since it is a built in function, you’re not referencing a table, so you do not need a FROM clause for SQL to know what to do. This next part is the most useful, it takes a datetime and can separate it into a date and time. Our order data has datetimes so we will continue using that table. There are other more tedious ways to strip your date and time from a datetime, but this is the cleanest and easiest way to do it. Congrats on making it through Lesson IV, it’s quick and annoying but remember to use this as a reference for projects later on in the training. Also, did you see the TOP 100? That function tells SQL to only return the first 100 rows, and we sorted it by Order_Date ascending, since ascending is default.
--THESE TWO FUNCTIONS GIVE YOU MONTH AND YEAR FROM A DATE
SELECT YEAR(Order_Date) AS Year_Number
,MONTH(Order_Date) AS Month_Number
FROM PL_DW.dbo.Sales_Master
--NOTICE THE ABBREVIATION AFTER DATEPART, IT WILL DETERMINES YOUR RESULTS
SELECT DATEPART(yy, CURRENT_TIMESTAMP) AS Year_Number
,DATEPART(qq, CURRENT_TIMESTAMP) AS Quarter_Number
,DATEPART(mm, CURRENT_TIMESTAMP) AS Month_Number
,DATEPART(dd, CURRENT_TIMESTAMP) AS Day_Number
,DATEPART(hh, CURRENT_TIMESTAMP) AS Hour_Number
,DATEPART(mi, CURRENT_TIMESTAMP) AS Minute_Number
SELECT TOP 100
Order_Date
,CONVERT(date, Order_Date) AS Order_Date_Only
,CONVERT(time, Order_Date) AS Order_Time
FROM PL_DW.dbo.Sales_Master
ORDER BY Order_Date
Lesson V - Case Statements
Lesson VI - Joins It is best to just explain joins in detail to really get a grasp on how they work. Unfortunately, if you Google “SQL joins” you’ll get a bunch of Venn diagram looking things that try to show you the different types of joins. They are great, but not really for beginners. As a beginner the syntax is all new, and you’re still grasping basic concepts, so seeing a picture of red circles and a messy script underneath is not what you need to be looking at. What you want is this site! #PollackAnalytics Joins essentially link tables together, usually you’ll have a primary key to link on, but you can link or join on any matching columns. Think of a primary key as a unique ID for each row, an example of this would be the Sales_ID from the SalesRep_Master table. Each sales rep as their own unique ID to differentiate them. Then with this ID we can then link to the Sales_Master table and see the data from the SalesRep_Master table and the Sales_Master table together. It will start to make sense. #1 -> LEFT JOIN Why in the world did you guys make that so long and difficult?! Let’s break down the LEFT JOIN first. This says start with the FROM table or base table and give me all of the Sales Rep IDs that it has, then match that to all of the Sales IDs from the JOIN table and give me the matching records. If the JOIN table has a Sales ID that the base table does not then it will not be seen, and if the base table has a record that the JOIN table does not have then you will see a NULL value for that. Now you can put it all together. The SELECT statement is not new and DISTINCT clause means to give back unique records, no duplicates. The DISTINCT clause probably is not needed here, but we’re showing you new things. Next is our COUNT of Order_ID, again nothing you cannot handle. We then get into our base table and LEFT JOIN. The WHERE clause introduces the IN operator, this allows you to specify a list of values as long as you use parentheses ( ). As a note though, using the IN operator will also exclude any NULL values, but we will cover that later. The rest of the script should be straight forward. #2 -> RIGHT JOIN #3 -> INNER JOIN
The LEFT JOIN clause should be your #1 join as a beginner, it is easy, takes little thinking and will prevent errors. Just to start we will always assume that our table that we use after the FROM will be our “main” table, and we will refer to it as such throughout this training.
The LEFT JOIN links another table to your main table. Let’s say your main table or FROM table is the Sales_Master table and you want to join the SalesRep_Master table to it to get more info about your sales reps for a report on their performance, a LEFT JOIN would work well here. See it in action below.SELECT DISTINCT
B.First_Name
,B.Last_Name
,A.SalesRep_ID
,COUNT(DISTINCT Order_ID) AS Total_Orders
FROM PL_DW.dbo.Sales_Master A
LEFT JOIN PL_DW.dbo.SalesRep_Master B
ON A.SalesRep_ID = B.Sales_ID
WHERE Sales_ID IN (101,115,130,131,143,148,165)
AND YEAR(Order_Date) = 2017
GROUP BY
B.First_Name
,B.Last_Name
ORDER BY
B.Last_Name ASC
Well, that’s how you learn. We’re not going to sugar coat anything, but we will walk you through this query, and it will make sense, and you will be happy. Just don’t worry and keep reading, we got you covered.
--JUST THE FROM TABLE AND TABLE YOU ARE JOINING TO
FROM PL_DW.dbo.Sales_Master A
LEFT JOIN PL_DW.dbo.SalesRep_Master B
ON A.SalesRep_ID = B.Sales_ID
--ALL OF THE PIECES BACK TOGETHER AGAIN
SELECT DISTINCT
B.First_Name
,B.Last_Name
,A.SalesRep_ID
,COUNT(DISTINCT Order_ID) AS Total_Orders
FROM PL_DW.dbo.Sales_Master A
LEFT JOIN PL_DW.dbo.SalesRep_Master B
ON A.SalesRep_ID = B.Sales_ID
WHERE Sales_ID IN (101,115,130,131,143,148,165)
AND YEAR(Order_Date) = 2017
GROUP BY
B.First_Name
,B.Last_Name
ORDER BY
B.Last_Name ASC
The RIGHT JOIN is really just the opposite of a LEFT JOIN, if you did not already assume. This takes the base table and matches values only to the RIGHT JOIN table.
--RIGHT JOIN EXAMPLE
SELECT DISTINCT
B.First_Name
,B.Last_Name
,A.SalesRep_ID
,COUNT(DISTINCT Order_ID) AS Total_Orders
FROM PL_DW.dbo.Sales_Master A
RIGHT JOIN PL_DW.dbo.SalesRep_Master B
GROUP BY
B.First_Name
,B.Last_Name
ORDER BY
B.Last_Name ASC
We are not the biggest fans of the INNER JOIN but it can be really useful if you only want matching records from each table that you are joining together. If the base table has values that the INNER JOIN table does not and vice versa, then those values will not be returned, only if the value is in both tables. This is where the Venn diagram pictures make the most sense, you have two circles, or tables, and you will only get that middle piece where the tables overlap.
--INNER JOIN EXAMPLE
SELECT DISTINCT
B.First_Name
,B.Last_Name
,A.SalesRep_ID
,COUNT(DISTINCT Order_ID) AS Total_Orders
FROM PL_DW.dbo.Sales_Master A
INNER JOIN PL_DW.dbo.SalesRep_Master B
GROUP BY
B.First_Name
,B.Last_Name
ORDER BY
B.Last_Name ASC
Lesson VII - Unions You will want to use a UNION operator when you have two (or more) data tables that you want to combine into one. It is important to understand the difference between a UNION and a JOIN. A JOIN will add columns to a table and a UNION adds rows to a table. That is the fundamental difference between the two. Also, when you combine multiple tables with a UNION they need to have the exact same number of columns and the columns need to be the same data type. You cannot UNION an integer field with a character field, we will cover this in the following examples. Let’s start with combining two different cuts of the same table so it is not confusing. The Product_Master table will provide you with a clean example of this. Wait, couldn’t I have just used a Case statement instead to separate the two Lumber_Types? And what is going on with that Table 1 and Table 2 stuff? You are really catching on quick! That is a phenomenal idea, but let’s review what that output would look like and then we can discuss. Now run both and what do you see? The query above should be listed out in columns and the union should be in rows. Perhaps this isn’t the best example, but we will cover this more in the Projects section. Let’s move on to using a UNION to create a subtotal and totals. You’ll love it, we promise. Subtotals with Unions First, let’s image a use case for this, as this is the most important step when deciding how to craft your SQL scripts. We want to quickly query our data and paste it into an Excel document to create a report. We know once the data is in Excel we can create subtotals and format it. But what if we want to take a step back and have SQL create the subtotals for us and then have Excel use some conditional formatting to save bundles of minutes? Well we can! Let’s check it out. We will use the Sales_Master table and the SalesRep_Master table for this. Did you notice that sweet CONCAT function? That one let’s us combine fields into one. Instead of writing out two columns for the names we just made it one. And you may notice space we inserted as well, this is tell SQL to add a space, otherwise it would just combine the two names without any space. Neat stuff. --FIRST UNION EXAMPLE
SELECT
'Table 1' AS Table_Type
,SUM(Product_Cost) / COUNT(DISTINCT Unit_ID) AS AverageCost
FROM PL_DW.dbo.Product_Master
WHERE Product_Class = 'BOARD'
AND Lumber_Type = 'Pressure Treated Pine'
UNION ALL
SELECT
'Table 2' AS Table_Type
,SUM(Product_Cost) / COUNT(DISTINCT Unit_ID) AS AverageCost
FROM PL_DW.dbo.Product_Master
WHERE Product_Class = 'BOARD'
AND Lumber_Type = 'Composite'
SIDE NOTE! The below query is bit long, and while we assume you are using Chrome as your browser (cause it is the best) it really displays long queries the best. So if it looks a little funky let Google creep on your browsing history and start using Chrome! We love Chrome and we like to think Chrome loves us back…--UNION VS CASE STATEMENT
SELECT
SUM(CASE WHEN Lumber_Type = 'Pressure Treated Pine' THEN Product_Cost ELSE 0 END)
/ COUNT(DISTINCT CASE WHEN Lumber_Type = 'Pressure Treated Pine' THEN Unit_ID ELSE NULL END) AS AverageCost_PressureTreated
,SUM(CASE WHEN Lumber_Type = 'Composite' THEN Product_Cost ELSE 0 END)
/ COUNT(DISTINCT CASE WHEN Lumber_Type = 'Composite' THEN Unit_ID ELSE NULL END) AS AverageCost_PressureTreated
FROM PL_DW.dbo.Product_Master
WHERE Product_Class = 'BOARD'
--CREATING SUBTOTALS WITH UNIONS
SELECT
CONCAT(B.First_Name,' ', B.Last_Name) AS SalesRepName
,YEAR(Order_Date) AS OrderYear
,CAST(MONTH(Order_Date)AS NVARCHAR(15)) AS OrderMonth
,COUNT(DISTINCT Order_ID) AS OrderTotal
,SUM(Revenue) AS RevenueTotal
FROM PL_DW.dbo.Sales_Master A
LEFT JOIN PL_DW.DBO.SalesRep_Master B
ON A.SalesRep_ID = B.Sales_ID
WHERE YEAR(Order_Date) = 2016
GROUP BY
CONCAT(B.First_Name,' ', B.Last_Name)
,YEAR(Order_Date)
,CAST(MONTH(Order_Date)AS NVARCHAR(15))
UNION
SELECT
CONCAT(B.First_Name,' ', B.Last_Name) AS SalesRepName
,YEAR(Order_Date) AS OrderYear
,'Total' AS OrderMonth
,COUNT(DISTINCT Order_ID) AS OrderTotal
,SUM(Revenue) AS RevenueTotal
FROM PL_DW.dbo.Sales_Master A
LEFT JOIN PL_DW.DBO.SalesRep_Master B
ON A.SalesRep_ID = B.Sales_ID
WHERE YEAR(Order_Date) = 2016
GROUP BY
CONCAT(B.First_Name, ' ', B.Last_Name)
,YEAR(Order_Date)
Lesson VIII - Subqueries
Lesson IX - Creating Tables & Views
Lesson X - Update & Alter Tables
Lesson XI - Stored Procedures
Lesson XII - Functions
Lesson XIII - Metadata
We are always adding and improving! Please check back soon for our updates!