3/28/23
\[y = E(Y) + \epsilon\] \[E(Y) = \mu = \beta X\]
\[y = E(Y) + \epsilon\] \[g(E(Y)) = g(\mu) = \beta X\]
\[Var(\epsilon) = \phi \mu\]
where \(\phi\) is a scaling parameter, assumed to be equal to 1, meaning the variance is assumed to be equal to the mean.
\[\beta X = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_n x_n\]
Distribution | Link name | Link function | Mean function |
---|---|---|---|
Normal | Identity | \(\beta X = \mu\) | \(\mu = \beta X\) |
Gamma | Inverse | \(\beta X = \mu^{-1}\) | \(\mu = (\beta X)^{-1}\) |
Poisson | Log | \(\beta X = ln\,(\mu)\) | \(\mu = exp\, (\beta X)\) |
Binomial | Logit | \(\beta X = ln\, \left(\frac{\mu}{1-\mu}\right)\) | \(\mu = \frac{1}{1+exp\, (-\beta X)}\) |
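
The link and mean functions in the table can be sketched as pairs of plain Python functions. This is only an illustration: the dictionary name `LINKS` and the test value `mu = 0.3` are assumptions made here, not part of the original material. Applying a link and then its mean function should return the starting value.

```python
import math

# Each entry maps a link name to (link function g, mean function g^{-1}).
# LINKS is a hypothetical name chosen for this sketch.
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "inverse": (lambda mu: 1.0 / mu, lambda eta: 1.0 / eta),
    "log": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
    "logit": (
        lambda mu: math.log(mu / (1.0 - mu)),
        lambda eta: 1.0 / (1.0 + math.exp(-eta)),
    ),
}

mu = 0.3  # arbitrary value that is valid for all four links
for name, (link, mean) in LINKS.items():
    eta = link(mu)          # linear predictor scale (beta X)
    recovered = mean(eta)   # back on the scale of the mean
    print(f"{name:8s} eta = {eta: .4f}  recovered mu = {recovered:.4f}")
```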
Assume length is Gaussian with
\(Var(\epsilon) = \sigma^2\)
\(E(Y) = \mu = \beta X\)
Question: What is the probability that we observe these data given a model with parameters \(\beta\) and \(\sigma^2\)?
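One way to make this question concrete is to compute the Gaussian log-likelihood directly. The sketch below assumes a single predictor, and the `x` and `y` values are invented purely for illustration; they are not the data used in these slides.

```python
import math

def gaussian_log_likelihood(y, x, beta0, beta1, sigma2):
    """Sum of log Normal densities with mean beta0 + beta1 * x and variance sigma2."""
    total = 0.0
    for xi, yi in zip(x, y):
        mu = beta0 + beta1 * xi  # E(Y) = beta X
        total += -0.5 * math.log(2 * math.pi * sigma2) - (yi - mu) ** 2 / (2 * sigma2)
    return total

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

good = gaussian_log_likelihood(y, x, beta0=0.0, beta1=2.0, sigma2=0.05)
poor = gaussian_log_likelihood(y, x, beta0=0.0, beta1=1.0, sigma2=0.05)
print(good, poor)  # the better-fitting slope has the higher log-likelihood
```

Maximum likelihood estimation searches for the \(\beta\) and \(\sigma^2\) that make this quantity as large as possible.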
Location of residential features at the Snodgrass site
Measure | Inside wall | Outside wall | Total |
---|---|---|---|
Count | 38 | 53 | 91 |
Probability | 0.42 = \(\frac{38}{91}\) | 0.58 = \(\frac{53}{91}\) | 1 |
Odds | 0.72 = \(\frac{0.42}{0.58}\) | 1.40 = \(\frac{0.58}{0.42}\) | |
Log Odds | -0.33 = log(0.72) | 0.33 = log(1.40) | |
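
The rows of this table can be reproduced from the two counts alone. The sketch below works from the unrounded probabilities, so its output may differ from the table in the last displayed digit; the variable names are chosen here for illustration.

```python
import math

inside, outside = 38, 53  # counts from the table
total = inside + outside

p_inside = inside / total
p_outside = outside / total

# Odds compare the probability of an outcome to the probability of its complement.
odds_inside = p_inside / p_outside
odds_outside = p_outside / p_inside

# Log odds are symmetric around 0: the two values differ only in sign.
log_odds_inside = math.log(odds_inside)
log_odds_outside = math.log(odds_outside)

print(f"probability: {p_inside:.2f} {p_outside:.2f}")
print(f"odds:        {odds_inside:.2f} {odds_outside:.2f}")
print(f"log odds:    {log_odds_inside:.2f} {log_odds_outside:.2f}")
```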
Why Log Odds?
Because odds are bounded below by 0 but unbounded above, their distribution can be highly skewed. Taking the log makes the scale symmetric around 0: here, odds of 0.72 and 1.40 become log odds of -0.33 and 0.33.
Location inside or outside of the inner wall at the Snodgrass site is a Bernoulli variable and has expectation \(E(Y) = p\) where
\[p = \frac{1}{1 + exp(-\beta X)}\] This defines a logistic curve or sigmoid, with \(p\) being the probability of success. This constrains the estimate \(E(Y)\) to be in the range 0 to 1.
Taking the log of the odds \(p / (1 - p)\) gives us
\[logit(p) = log\left(\frac{p}{1 - p}\right) = \beta X\]
This is known as the “logit” or log odds.
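The step from the logistic curve to the log odds can be written out explicitly. This is a standard derivation, included here only to fill in the algebra:

\[ \begin{aligned} p &= \frac{1}{1 + exp(-\beta X)} \\ 1 - p &= \frac{exp(-\beta X)}{1 + exp(-\beta X)} \\ \frac{p}{1 - p} &= \frac{1}{exp(-\beta X)} = exp(\beta X) \\ log\left(\frac{p}{1 - p}\right) &= \beta X \end{aligned} \]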
Question: What is the probability that we observe these data (these inside features) given a model with parameters \(\beta\)?
Estimated coefficients:
\(\beta_0 = -8.6631\)
\(\beta_1 = 0.0348\)
For these, the log-likelihood is
\(\ell = -28.8641\)
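A log-likelihood of this kind is the sum, over features, of the log probability the model assigns to each observed outcome. The sketch below shows the calculation for a Bernoulli response. The `area` and `inside` values are invented for illustration and are not the Snodgrass data, so the result will not match the value reported above.

```python
import math

def inv_logit(eta):
    """Mean function for the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))

def bernoulli_log_likelihood(y, x, beta0, beta1):
    """Sum of y*log(p) + (1 - y)*log(1 - p) with p = inv_logit(beta0 + beta1 * x)."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = inv_logit(beta0 + beta1 * xi)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total

# Hypothetical areas and inside (1) / outside (0) outcomes, for illustration only.
area = [150.0, 200.0, 250.0, 300.0, 350.0]
inside = [0, 0, 1, 1, 1]

ll = bernoulli_log_likelihood(inside, area, beta0=-8.6631, beta1=0.0348)
print(ll)  # always <= 0, since each term is the log of a probability
```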
Note that these coefficient estimates are on the log-odds scale! To put them on the odds scale, we exponentiate them.
\(exp(\beta_0) = exp(-8.6631) = 0.0002\)
\(exp(\beta_1) = exp(0.0348) = 1.0354\)
For a one-unit increase in area, the odds of being inside the inner wall are multiplied by 1.0354, an increase of about 3.5%.
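Because the effect on the odds is multiplicative, larger changes in area compound rather than add. A short sketch using the slope reported above; the 50-unit step is an arbitrary choice made here for illustration:

```python
import math

beta1 = 0.0348  # slope on the log-odds scale, from the estimates above

odds_ratio = math.exp(beta1)              # multiplier for a one-unit increase in area
percent_change = (odds_ratio - 1) * 100   # the same effect as a percentage

# For a k-unit increase the multiplier is exp(k * beta1), i.e. odds_ratio ** k.
k = 50
odds_ratio_k = math.exp(k * beta1)

print(f"odds ratio per unit: {odds_ratio:.4f} ({percent_change:.1f}% increase)")
print(f"odds ratio per {k} units: {odds_ratio_k:.2f}")
```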
To get the probability, we can use the mean function (also known as the inverse link):
\[p = \frac{1}{1+exp(-\beta X)}\]
For a house structure with an area of 300 square feet, the estimated probability that it occurs inside the inner wall is 0.8538.
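The prediction can be checked by plugging the coefficients into the mean function. Note that the sketch below uses the coefficients as rounded to four decimals, which gives roughly 0.855; the small gap from the value quoted above is presumably because that value was computed from unrounded estimates. The function name `predict_probability` is hypothetical.

```python
import math

# Coefficients as displayed above (rounded to four decimals).
beta0, beta1 = -8.6631, 0.0348

def predict_probability(area):
    """Inverse logit of the linear predictor beta0 + beta1 * area."""
    eta = beta0 + beta1 * area
    return 1.0 / (1.0 + math.exp(-eta))

p_300 = predict_probability(300)

# Area at which the predicted probability is exactly 0.5 (linear predictor = 0).
area_half = -beta0 / beta1

print(f"p(area = 300) = {p_300:.4f}")
print(f"p = 0.5 at area = {area_half:.1f}")
```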
Proportion of Roman pottery is a binomial variable and has expectation \(E(Y) = p\) where
\[p = \frac{1}{1 + exp(-\beta X)}\] This defines a logistic curve or sigmoid, with \(p\) being the proportion of successful Bernoulli trials. This constrains the estimate \(E(Y)\) to be in the range 0 to 1.
Taking the log of the odds \(p / (1 - p)\) gives us
\[logit(p) = log\left(\frac{p}{1 - p}\right) = \beta X\]
This is known as the “logit” or log odds.
Question: What is the probability that we observe these data (these proportions) given a model with parameters \(\beta\)?
Estimated coefficients:
\(\beta_0 = -1.1818\)
\(\beta_1 = -0.0121\)
For these, the log-likelihood is
\(\ell = -4.1148\)
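For a binomial response, the log-likelihood has the standard form below. The symbols \(n_i\) (number of trials in observation \(i\)) and \(y_i\) (number of successes) are introduced here for illustration and do not appear elsewhere in these slides:

\[ \begin{aligned} log\,\mathcal{L}(\beta) &= \sum_i \left[\, log\binom{n_i}{y_i} + y_i\,log(p_i) + (n_i - y_i)\,log(1 - p_i) \,\right] \\ p_i &= \frac{1}{1 + exp(-\beta X_i)} \end{aligned} \]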
\[D = -2\,\Big[\,log\,\mathcal{L}(M_1) - log\,\mathcal{L}(M_S)\,\Big]\]
\[ \begin{aligned} \chi^2 &= -2\,log\,\frac{\mathcal{L}(M_0)}{\mathcal{L}(M_1)}\\\\ &= -2\,\Big[\,log\,\mathcal{L}(M_0) - log\,\mathcal{L}(M_1)\,\Big] \end{aligned} \]
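
Both statistics can be illustrated with the Snodgrass numbers given earlier. The sketch below rests on several assumptions: that \(M_S\) denotes the saturated model and \(M_0\) the intercept-only model, that the logistic model was fit to the same 91 features counted in the table (38 inside, 53 outside), and that \(M_1\) adds a single parameter (area), so the test has 1 degree of freedom.

```python
import math

# Intercept-only model M_0: every feature gets the same probability 38/91.
inside, outside = 38, 53
p_null = inside / (inside + outside)
loglik_null = inside * math.log(p_null) + outside * math.log(1 - p_null)

# Fitted model M_1: log-likelihood reported earlier for the area model.
loglik_model = -28.8641

# Likelihood ratio statistic: chi^2 = -2 [log L(M_0) - log L(M_1)], roughly 65.9 here.
chi_sq = -2 * (loglik_null - loglik_model)

# Upper tail of a chi-square with 1 degree of freedom: P(X > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_sq / 2))

# Deviance: for 0/1 data the saturated model fits each point exactly, so log L(M_S) = 0.
deviance = -2 * (loglik_model - 0.0)

print(f"chi-square = {chi_sq:.2f}, p-value = {p_value:.2e}")
print(f"residual deviance = {deviance:.4f}")
```

The closed-form tail probability only holds for 1 degree of freedom; comparing models that differ by more parameters would need a general chi-square distribution function.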