1/24/23
Students will develop programming skills by learning how to:
And students will gain statistical understanding by learning how to:
Learning is a lot like moving to a new city. You get lost, you get frustrated, you even get embarrassed! But gradually, over time, you come to know your way around. Unfortunately, you’ll only have four months in this new city, so we need to be realistic about what we can actually achieve here.
You won’t become fluent in R, Markdown, or statistics, but…
you will gain some sense of the way things tend to go with those languages.
All course policies and other University requirements can be found in the course syllabus. They are very, very thorough, so rather than enumerate them all, let’s just summarize them this way:
The primary statistical tools for this class are
We will go over how to install each of these during our first lab.
We want to understand something about a population.
We can never observe the entire population, so we draw a sample.
We then use a model to describe the sample.
By comparing that model to a null model, we can infer something about the population.
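The four steps above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the population is simulated, the numbers (a mean of 1.74 m, a sample of 42, a null value of 1.70 m) are made up for the example, and a one-sample t-test stands in for "comparing to a null model".

```r
# Illustrative sketch only: every number and name here is an assumption.
set.seed(42)

# 1. A simulated "population" of heights in meters. In practice we never see this.
population <- rnorm(100000, mean = 1.74, sd = 0.07)

# 2. We can only observe a sample, here 42 individuals drawn at random.
heights <- sample(population, size = 42)

# 3. Describe the sample with a model. An intercept-only linear model
#    estimates a single number: the mean height.
fit <- lm(heights ~ 1)

# 4. Compare against a null model in which the population mean is 1.70 m
#    (an assumed null value, chosen only for illustration).
test <- t.test(heights, mu = 1.70)

coef(fit)      # estimated mean height from the sample
test$p.value   # how surprising the sample would be under the null model
```

The intercept from `lm()` is just the sample mean, so the model here is about as simple as a model can be.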
A statistic is a property of a sample.
“We measured the heights of 42 actors who auditioned for the role of Aragorn and took the average.”
A parameter is a property of a population.
“Human males have an average height of 1.74 meters (5.7 feet).”
Note: Parameters are usually written with Greek or capital letters, and statistics with lowercase Latin letters.
| | Population | Sample |
---|---|---|
Size | N | n |
Mean | μ | x̄ |
Standard Deviation | σ | s |
Proportion | P | p |
Correlation | ρ | r |
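To make the notation concrete, here is a small sketch that computes a few of these quantities in R. The population is simulated and all values are assumptions for illustration. The object name `smp` is used instead of `sample` to avoid confusion with R's `sample()` function.

```r
# Illustrative sketch only: the population is simulated, not real data.
set.seed(1)

# Parameters: properties of the (normally unobservable) population.
population <- rnorm(5000, mean = 1.74, sd = 0.07)
N     <- length(population)
mu    <- mean(population)
sigma <- sqrt(mean((population - mu)^2))  # population SD divides by N

# Statistics: properties of a sample drawn from that population.
smp   <- sample(population, size = 42)
n     <- length(smp)
x_bar <- mean(smp)
s     <- sd(smp)                          # sd() divides by n - 1

c(N = N, n = n)
c(mu = mu, x_bar = x_bar)
c(sigma = sigma, s = s)
```

Note that R's `sd()` always uses the sample formula with `n - 1` in the denominator, which is why the population standard deviation is computed by hand here.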
Why R?
R is free software under the terms of the Free Software Foundation’s GNU General Public License.
R runs on all major operating systems: macOS, Windows, and Linux.
R lets you exploit the awesome computing powers of the modern world. It also provides an elegant and concise syntax for writing complex statistical operations.
R users can write add-on packages that provide additional functionality. Here are a few of my favorites.
R offers a lot of tools to produce really, really impressive graphics. For example, here is a simple plot of a normal distribution:
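A plot like this might be drawn with a few lines of base R. This is only a sketch, assuming a standard normal distribution (mean 0, standard deviation 1); the original figure may have used different settings or a different plotting package.

```r
# Grid of x values spanning four standard deviations on either side of the mean.
x <- seq(-4, 4, length.out = 201)

# Density of the standard normal distribution at each x (assumed mean 0, sd 1).
y <- dnorm(x, mean = 0, sd = 1)

# Draw the density as a smooth line.
plot(
  x, y,
  type = "l",
  lwd  = 2,
  xlab = "x",
  ylab = "Density",
  main = "Standard normal distribution"
)
```

`dnorm()` gives the height of the curve, so changing its `mean` and `sd` arguments shifts and stretches the bell shape.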
R facilitates reproducible research in two ways. First, it forces you to declare explicitly each step in your analysis.
Second, it makes your analysis easy to share. In the simplest case, we share R scripts, but we can also use Quarto, a much more flexible tool for writing, running, and explaining R code.
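As a rough illustration of what "declaring each step" looks like, here is a short script that uses `mtcars`, a data set that ships with R. The analysis itself is a made-up example, not one from this course.

```r
# Illustrative script: each step of the analysis is written down explicitly,
# so anyone with R can rerun it and get the same result.

# Step 1: load the data (mtcars is built into R).
dat <- mtcars

# Step 2: transform. Weight is recorded in units of 1000 lbs; convert to kg.
dat$wt_kg <- dat$wt * 1000 * 0.45359237

# Step 3: summarize.
mean_mpg <- mean(dat$mpg)

# Step 4: model fuel economy as a linear function of weight.
fit <- lm(mpg ~ wt_kg, data = dat)

mean_mpg
coef(fit)
```

Because nothing happens by pointing and clicking, the script is a complete record of the analysis.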
R also has an incredibly active and growing community.
Markdown is a lightweight markup language for creating formatted text using a plain-text editor (definition from Wikipedia).
INPUT
This is a sentence in Markdown, containing `code`, **bold text**, and *italics*.
OUTPUT
This is a sentence in Markdown, containing code, bold text, and italics.
Quarto allows you to run code and format text in one document.
INPUT
OUTPUT
This is an example of Quarto with Markdown syntax and R code.