Lecture 7
Dr. Elijah Meyer + Konnie Huang
Duke University
STA 199 - Fall 2022
September 19, 2022
Open your ae-06 project in RStudio (that you already started on Tuesday), render your document, and commit and push.
Any questions from prepare materials? Go to slido. You can also upvote others’ questions.
Lab 02 Due Tonight.
Start Early. Render Often. Ask Questions.
Groups are coming after Exam 1.
– Understand pivot_longer
– ggplot practice
– Practice re-creating graphs
– “New” functions: if_else / scale_x_continuous
What makes a dataset “tidy”?
There are three interrelated rules that make a dataset tidy:
Each variable is a column; each column is a variable.
Each observation is a row; each row is an observation.
Each value is a cell; each cell is a single value.
ae-06 (repo name will be suffixed with your GitHub name).
When pivoting longer, variable names that turn into values are characters by default. If you need them in another format, you need to make that transformation explicitly, which you can do within the pivot_longer() function.
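The point about converting names inside pivot_longer() can be sketched as follows. This is a minimal illustration, not the ae-06 solution: the data frame `wide` and its columns are made up, and it assumes a tidyr version that supports the names_transform argument (1.1.0 or later).

```r
library(tidyr)

# Hypothetical wide data (not from ae-06): one column per year
wide <- tibble::tibble(
  country = c("A", "B"),
  `2020` = c(10, 20),
  `2021` = c(11, 21)
)

# Column names become values of `year`; they would be character by default,
# so names_transform converts them to integer within the same call
long <- pivot_longer(
  wide,
  cols = -country,
  names_to = "year",
  values_to = "value",
  names_transform = list(year = as.integer)
)

long
```

Without names_transform, `year` would be a character column, and a follow-up step such as mutate(year = as.integer(year)) would be needed before treating it as a number on a plot axis.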
You can tweak a plot forever, but at some point the tweaks are likely not very productive. However, you should always be critical of defaults (however pretty they might be) and see if you can improve the plot to better portray your data / results / what you want to communicate.
pivot_wider() makes datasets wider by increasing columns and reducing rows. It has the opposite interface to pivot_longer(): we need to provide the existing column that holds the values (values_from) and the existing column that holds the new column names (names_from).
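A small sketch of that interface, using a made-up long data frame (the names `long`, `country`, `year`, and `value` are illustrative assumptions, not taken from the course materials):

```r
library(tidyr)

# Hypothetical long data: one row per country-year pair
long <- tibble::tibble(
  country = c("A", "A", "B", "B"),
  year = c(2020L, 2021L, 2020L, 2021L),
  value = c(10, 11, 20, 21)
)

# names_from: the column whose values become new column names
# values_from: the column whose values fill those new columns
wide <- pivot_wider(long, names_from = year, values_from = value)

wide
```

Here the four rows collapse to two (one per country), and the two distinct years become two new columns.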