Lecture 01: Introduction

1/24/23

📋 Lecture Outline

  • Course Mechanics
    • 🧱 Structure
    • 🎯 Objectives
    • 🏋 Expectations
    • 🤝 Ethics
    • 💾 Software
  • Course Content
    • Why statistics?
    • What is an archaeological population?
    • A note about terminology and notation
    • Statistical programming with R
    • Literate programming with Quarto

Course Mechanics

🧱 Course Structure

  • Meetings are online every Tuesday from 2:00 to 5:00 PM MST.
  • Meeting structure:
    • homework review and lecture (80 minutes),
    • break (10 minutes), and
    • lab (90 minutes).
  • Course work:
    • lab and homework exercises, due by 9:00 PM MST on the Monday before each class, and a
    • term project.
  • All course materials will be made available on the course website.
  • All graded materials will be submitted through Canvas.

🎯 Course Objectives

Students will develop programming skills by learning how to:

  • import and export data,
  • wrangle (or prepare) data for analysis,
  • explore and visualize data, and
  • build models of data and evaluate them.

And students will gain statistical understanding by learning how to:

  • formulate questions and alternative hypotheses,
  • identify and explain appropriate statistical tools,
  • report the results of analysis using scientific standards, and
  • communicate the analysis to a general audience.

🏋 Course Expectations

Learning is a lot like moving to a new city. You get lost, you get frustrated, you even get embarrassed! But gradually, over time, you come to know your way around. Unfortunately, you’ll only have four months in this new city, so we need to be realistic about what we can actually achieve here.


You won’t become fluent in R, Markdown, or statistics, but…

you will gain some sense of the way things tend to go with those languages.

🤝 Course Ethics

All course policies and other University requirements can be found in the course syllabus. They are very, very thorough, so rather than enumerate them all, let’s just summarize them this way:

  • There are many ways to be a bully. Don’t be any of them.
  • And if you see someone getting bullied, do something about it.

💾 Software

The primary statistical tools for this class include R and Quarto.

We will go over how to install each of these during our first lab.

Course Content

Why statistics?


We want to understand something about a population.

We can never observe the entire population, so we draw a sample.

We then use a model to describe the sample.

By comparing that model to a null model, we can infer something about the population.
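The four steps above can be sketched in R with simulated data. This is only an illustration under stated assumptions: the population of 10,000 sites, the variable names (`elevation`, `density`), and the linear relationship between them are all hypothetical, invented so there is something to sample from.

```r
set.seed(42)  # make the simulated example reproducible

# 1. a hypothetical population: 10,000 sites (all values are made up)
population <- data.frame(elevation = runif(10000, min = 1000, max = 2000))
population$density <- 5 + 0.01 * population$elevation + rnorm(10000, sd = 3)

# 2. we can never observe the whole population, so draw a random sample
sample_rows <- sample(nrow(population), size = 50)
site_sample <- population[sample_rows, ]

# 3. describe the sample with a model
model <- lm(density ~ elevation, data = site_sample)

# 4. compare that model to a null (intercept-only) model with an F-test
null_model <- lm(density ~ 1, data = site_sample)
comparison <- anova(null_model, model)
print(comparison)
```

In real work only steps 2 through 4 are available to us; the population in step 1 is exactly the thing we cannot see.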

What population does archaeology study?

A note on terminology and notation

  • A statistic is a property of a sample.

    “We measured the heights of 42 actors who auditioned for the role of Aragorn and took the average.”

  • A parameter is a property of a population.

    “Human males have an average height of 1.74 meters (5.7 feet).”

    Note: Parameters are usually written with capital or Greek letters, and statistics with lowercase Latin letters.

                     Population   Sample
Size                 N            n
Mean                 μ            x̄
Standard Deviation   σ            s
Proportion           P            p
Correlation          ρ            r
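To make the difference between a parameter and a statistic concrete, here is a minimal sketch in R. It simulates a hypothetical population of heights with a mean of 1.74 meters and then draws a sample of 42, echoing the examples above. The population size and the standard deviation of 0.07 meters are assumptions chosen only for illustration.

```r
set.seed(1)  # make the simulated example reproducible

# a hypothetical population of heights in meters (size and sd are assumed)
population <- rnorm(100000, mean = 1.74, sd = 0.07)

# parameters: properties of the population
N     <- length(population)
mu    <- mean(population)
sigma <- sqrt(mean((population - mu)^2))  # population SD divides by N

# draw a sample of 42 heights
heights <- sample(population, size = 42)

# statistics: properties of the sample
n     <- length(heights)
x_bar <- mean(heights)
s     <- sd(heights)  # sd() divides by n - 1

print(c(N = N, mu = mu, sigma = sigma))
print(c(n = n, x_bar = x_bar, s = s))
```

The statistics will be close to the parameters but not identical, and they will change each time a new sample is drawn.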

Why R?

R is free software under the terms of the Free Software Foundation’s GNU General Public License.

R will run on any system: Mac OS, Windows, or Linux.

R lets you exploit the awesome computing powers of the modern world. It also provides an elegant and concise syntax for writing complex statistical operations.

R users can write add-on packages that provide additional functionality.

R offers a lot of tools to produce really, really impressive graphics. For example, here is a simple plot of a normal distribution:
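The figure itself is not reproduced in this text. As a minimal sketch, a plot like it could be drawn in base R as follows. The range of x values, the choice of a standard normal (mean 0, standard deviation 1), and the labels are assumptions, not the code behind the original slide.

```r
# a grid of x values covering four standard deviations on either side of the mean
x <- seq(-4, 4, length.out = 200)

# height of the standard normal density at each x
dens <- dnorm(x, mean = 0, sd = 1)

# draw the density as a smooth line
plot(
  x, dens,
  type = "l",
  lwd = 2,
  xlab = "x",
  ylab = "Density",
  main = "Standard normal distribution"
)
```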

R facilitates reproducible research in two ways. First, it forces you to declare explicitly each step in your analysis.

# take the mean
mean(my_data)

# take the standard deviation
sd(my_data)
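The snippet above assumes that a numeric vector called my_data already exists. A minimal runnable version might look like the following, using the speed column of R's built-in cars dataset as a stand-in. That choice of data is an assumption made only for illustration.

```r
# stand-in data: speeds from R's built-in cars dataset (assumed for illustration)
my_data <- cars$speed

# take the mean
my_mean <- mean(my_data)

# take the standard deviation
my_sd <- sd(my_data)

print(my_mean)
print(my_sd)
```

Because every step is written down, anyone with the same data can rerun the script and get the same numbers.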

Second, it makes your analysis shareable as code. In the simplest case, we use R scripts, but we can also use Quarto, a much more flexible tool for writing, running, and explaining R code.

R also has an incredibly active and growing community.

Literate programming with Markdown

“Markdown is a lightweight markup language for creating formatted text using a plain-text editor.” (From the Wikipedia page.)

INPUT


This is a sentence in Markdown, containing `code`, **bold text**, and *italics*.

OUTPUT


This is a sentence in Markdown, containing code, bold text, and italics.

Quarto = Markdown + R

Quarto allows you to run code and format text in one document.

INPUT


This is an example of Quarto with markdown __syntax__ 
and __R code__.

```{r}
#| fig-width: 4
#| fig-asp: 1
#| fig-align: center

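# fit a simple linear model of stopping distance on speed
fit <- lm(dist ~ speed, data = cars)

# use a square plotting region
par(pty = "s")

# scatterplot of the data with the fitted line overlaid
plot(cars, pch = 19, col = "darkgray")
abline(fit, lwd = 2)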
```

OUTPUT


This is an example of Quarto with markdown syntax and R code.