As for all of the Data Carpentry ecology lessons, this lesson uses the Portal Project Teaching Database. The data is available at http://dx.doi.org/10.6084/m9.figshare.1314459 and the download includes a SQLite database file (portal_mammals.sqlite) as well as three .csv files (species.csv, plots.csv, surveys.csv) that can be imported into SQLite.
Note that the figshare download is an archive (.zip) file that rudely explodes all of the files into your current directory.
See this slide deck as a sample intro for the lesson: SQL Intro Deck
Key points:
If you’ve written up a diagram of the data analysis pipeline (raw data -> clean data -> import and analyze -> results -> visualization), it can be helpful to identify that you’re now somewhere between clean data and analysis.
Tips
(optional) The first lesson includes a brief introduction to data design and choosing database systems. This material expands on the database design in the first section.
As you teach the lesson, it can be helpful to pause and write up the query keywords on the board. This could look like this:
FROM table
WHERE (conditional statement, applies to row values) (AND/OR)
ORDER BY column/FUNCTION(column) (ASC/DESC)
* After 02-sql-aggregation
SELECT column FUNCTION(column) AGGREGATE_FUNCTION(column) FROM table
WHERE (conditional statement, applies to row values) (AND/OR) (IS (NOT) NULL) GROUP BY column HAVING (conditional statement, applies to group) ORDER BY column/FUNCTION(column) (ASC/DESC)
* After 03-sql-joins-aliases
SELECT column FUNCTION(column) AGGREGATE_FUNCTION(column) FROM table JOIN table ON table.col = table.col WHERE (conditional statement, applies to row values) (AND/OR) (IS (NOT) NULL) GROUP BY column HAVING (conditional statement, applies to group) ORDER BY column/FUNCTION(column) (ASC/DESC)
As a bonus, if you can leave this on the board, it translates nicely into
the `dplyr` portion of the `R` lesson, i.e.:
SQL: dplyr:
SELECT column select(col) FUNCTION(column) mutate(col = fcn(col)) AGGREGATE_FUNCTION(column) summarize(col = fcn(col)) FROM table JOIN table ON table.col = table.col WHERE (conditional statement, applies to row values) filter(condition) (AND/OR) (IS (NOT) NULL) is.na() GROUP BY column group_by(col) HAVING (conditional statement, applies to group) ORDER BY column/FUNCTION(column) (ASC/DESC) arrange() ~~~
If you want to try something more active (esp. if you’re teaching SQL in the afternoon!), this is a an interactive activity to try.
SELECT name, name FROM students WHERE height > 66