On Github qlim / sql-intro
This presentation is live at http://qlim.github.io/sql-intro (src)
Structured Query Language is a tool for manipuating data in relational databases
SELECT firstName, lastName, countryCode, city FROM Customers WHERE countryCode="US";
Some examples of database systems that use SQL
Although SQL is standardised, differences do exist between systems
We will be using SQLite in this tutorial but will be using a basic subset of SQL that is broadly universal
In business you might want to:
And of course, databases are at the heart of many mobile and web applications
A relational database is a collection of tables, which have columns and rows
Table: Customers#|customerId|firstName|lastName |countryCode|city |tier|joinDate -+----------+---------+---------+-----------+-----------+----+---------- 1|1 |Andy |Dwyer |US |Pawnee |3 |2013-01-08 2|2 |Bruce |Bullerby |GB |London |1 |2013-01-21 3|3 |Carrie |Bradshaw |US |New York |2 |2013-03-04 4|4 |D'Angelo |Barksdale|US |Baltimore |1 |2013-04-19 ...Table: Sales
# |transactionId|customerId|productId|timestamp |quantity|salesValue --+-------------+----------+---------+----------------+--------+---------- 1 |1 |7 |5 |2013-06-01 09:44|6 |336 2 |2 |4 |5 |2013-04-20 04:37|10 |625 3 |3 |8 |4 |2013-04-12 11:04|16 |1051.2 ...
Our example database has 3 tables:
A SELECT query is used to get information out of a database. It's probably the most common type of query you will write
SELECT customerId, firstName, lastName, countryCode FROM Customers;
You should get output like this:
#|customerId|firstName|lastName |countryCode -+----------+---------+---------+----------- 1|1 |Andy |Dwyer |US 2|2 |Bruce |Bullerby |GB 3|3 |Carrie |Bradshaw |US 4|4 |D'Angelo |Barksdale|US ...Open Access, load database, create blank query with Query Design, switch to SQL view
Write a query that selects the transactionId and timestamp columns from the Sales table
Often you will only want to select rows that match a certain criteria. You can add a WHERE clause to your SELECT query that will specify which rows to include
This example selects only customers whose countryCode is 'US':
SELECT * FROM Customers WHERE countryCode = 'US';
#|customerId|firstName|lastName |countryCode|city |tier|joinDate -+----------+---------+---------+-----------+-----------+----+---------- 1|1 |Andy |Dwyer |US |Pawnee |3 |2013-01-08 2|3 |Carrie |Bradshaw |US |New York |2 |2013-03-04 3|4 |D'Angelo |Barksdale|US |Baltimore |1 |2013-04-19 ...
Conditional expressions should be fairly familiar to you if you've used them elsewhere - e.g. with IF() functions in Excel, although there are some differences
WHERE column = 'text' -- test for equality WHERE column = 2 -- also works for numbers WHERE column > 5 -- greater than is '>', less than is '<' WHERE column <> 8 -- test for inequality WHERE c1 = 'US' AND c2 < 10 -- you can combine conditions with AND WHERE (c1 = 'US' AND -- use parentheses to clarify order of operations c2 < 10) OR c1 = 'GB'
Write a query to select all columns for transactions that occurred prior to 1-Sep-2013.
Hint: write the date as '2013-09-01'
Date handling tends to vary quite a bit between different database systems
Most systems will have specialised functions that make working with dates easier - but these are out of scope for today
You may want to order the results of a query in a certain way. You can specify this using an ORDER BY clause in your SELECT
SELECT * FROM Customers WHERE countryCode = 'US' ORDER BY lastName ASC;
#|customerId|firstName|lastName |countryCode|city |tier|joinDate -+----------+---------+---------+-----------+-----------+----+---------- 1|4 |D'Angelo |Barksdale|US |Baltimore |1 |2013-04-19 2|7 |Gary |Blauman |US |New York |2 |2013-06-03 3|3 |Carrie |Bradshaw |US |New York |2 |2013-03-04 4|1 |Andy |Dwyer |US |Pawnee |3 |2013-01-08 5|5 |Eric |Murphy |US |Los Angeles|3 |2013-04-23 6|8 |Henry |Pollard |US |Los Angeles|2 |2013-06-11 ...
SELECT * FROM Customers WHERE countryCode = 'US' ORDER BY lastName ASC, firstName ASC;
Write a query to select all sales occurring in the month of April 2013, sorted so that the transaction with the largest value comes first. If two transactions have the same value, a transaction with a lower customerId should come first
Hint: you will need to combine 2 conditions with AND to select only sales from April 2013
So far we have just been giving the database a list of columns that we want to see in the output of our query, as-is
However, we can also ask the database to perform calculations on the data as well
SELECT customerId, firstName || ' ' || lastName AS fullName, tier + 10 AS tierPlus10 FROM Customers ORDER BY tierPlus10 DESC;
#|customerId|fullName |tierPlus10 -+----------+----------------+---------- 1|1 |Andy Dwyer |13 2|5 |Eric Murphy |13 3|6 |Francis Urquhart|13 4|3 |Carrie Bradshaw |12 ...
Calculating the months since the customer first joined (this works in Access but not SQLite)
SELECT lastName, DateDiff('m', joinDate, Now()) AS monthsActive FROM Customers ORDER BY monthsActive DESC;
lastName monthsActive ========== ============ Dwyer 15 Bullerby 11 Bradshaw 3 Barksdale 2 ...
You can even include 'IF' functions (again, works in Access but not SQLite)
SELECT IIf(tier = 1, '*' + firstName, firstName) as highlightedName, tier FROM Customers;
highlightedName tier =============== ==== Andy 3 *Bruce 1 Carrie 2 *D'Angelo 1 ...
Write a query to retrieve all sales since June 1, 2013, and include the average unit price for each transaction
A common task is to summarise information from multiple rows in a table into a single row
For example, we may want to know, what is the total value of all sales for each customer?
We can add a GROUP BY clause to our SELECT queries to achieve this
SELECT customerId, Sum(salesValue) as salesValueSum FROM Sales GROUP BY customerId;
#|customerId|salesValueSum -+----------+---------------- 1|1 |4865.0 2|2 |740.6 3|3 |3421.2 ...
Write a query to calculate the total sales and average unit sell price for every customer & product combination
Hint: you will need to use an expression that contains 2 Sum functions to calculate the average unit sell price
Write a query that returns the following for each customer: the number of transactions they have made (call it numOfTransactions) and the value of their largest transaction (call it maxSalesValue)
SELECT s.transactionId, c.customerId, c.lastName, c.countryCode, s.timestamp, s.quantity FROM Sales s JOIN Customers c ON s.customerId = c.customerId WHERE c.countryCode = 'US'; ...
# |transactionId|customerId|lastName |countryCode|timestamp |quantity --+-------------+----------+---------+-----------+----------------+-------- 1 |1 |7 |Blauman |US |2013-06-01 09:44|6 2 |2 |4 |Barksdale|US |2013-04-20 04:37|10 3 |3 |8 |Pollard |US |2013-04-12 11:04|16 4 |4 |3 |Bradshaw |US |2013-12-20 19:15|17 ...
Write a query to select transactionId, productId, the product name and quantity for each transaction that is for a product in the Widget category
We can also join 3 or more tables together by adding more JOIN clauses
SELECT s.transactionId, c.customerId, c.lastName, p.productId, p.name AS productName, s.quantity FROM Sales s JOIN Customers c ON s.customerId = c.customerId JOIN Products p ON s.productId = p.productId;
# |transactionId|customerId|lastName |productId|productName|quantity --+-------------+----------+---------+---------+-----------+-------- 1 |1 |7 |Blauman |5 |Bling |6 2 |2 |4 |Barksdale|5 |Bling |10 3 |3 |8 |Pollard |4 |Moolah |16 4 |4 |3 |Bradshaw |6 |Pip |17 ...
Write a query that outputs for each transaction the following: the first name of the customer, the tier of the customer, the product name and category and the quantity
Modify your previous query so that instead of a line for every transaction, it calculates the total sales in each category for every customer
We've only one covered one specific type of JOIN - called a LEFT INNER JOIN
There are also RIGHT INNER JOINs, LEFT and RIGHT OUTER JOINs, FULL OUTER JOINs, CROSS JOINs and more. Each uses different rules for combining the tables
You can also join a table to itself, or join a table to another multiple times (on different columns)
Here are some other resources you may want to look into: