Why do some human settlements last longer than others?
Inspiration
Is settlement persistence an inherent good?
No, its value derives from the contribution it makes to human life and well-being.
The Lord Baron model
The goal is to maximize \(F\) given the trade-off between \(P\)rimary and \(S\)ocial production, with \(A\) controlling the rate at which primary production declines with increasing \(R\) (Renfrew and Poston 1979).
\[F(R) = P(R) + S(R)\]
This anticipates many of the ideas that have come to be known as settlement scaling theory (Ortman et al. 2014).
Settlement dynamics
Lord Baron assumes that settlement is a dynamic system with multiple, discontinuous equilibrium states.
\(R\) can be thought of as the per capita contribution of an individual to the “public good.” And, the sigmoid shape of \(S\) suggests that everyone gets slightly more out of the village than they put in, especially early on.
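A minimal numerical sketch of how a declining \(P\) and a sigmoid \(S\) can produce more than one equilibrium. The functional forms are assumptions chosen for illustration (exponential decline for \(P\), a logistic curve for \(S\)), not the forms used by Renfrew and Poston, and the parameters `k` and `r0` are hypothetical.

```python
import math

def primary(r, a):
    """Primary production P(R): assumed exponential decline at rate A."""
    return math.exp(-a * r)

def social(r, k=4.0, r0=2.0):
    """Social production S(R): assumed logistic (sigmoid) curve."""
    return 1.0 / (1.0 + math.exp(-k * (r - r0)))

def total(r, a):
    """F(R) = P(R) + S(R)."""
    return primary(r, a) + social(r)

def local_maxima(a, r_max=5.0, steps=500):
    """Return the grid values of R at which F has a local maximum."""
    grid = [r_max * i / steps for i in range(steps + 1)]
    f = [total(r, a) for r in grid]
    peaks = []
    for i, value in enumerate(f):
        # Treat the ends of the grid as having no neighbor beyond them.
        left = f[i - 1] if i > 0 else -math.inf
        right = f[i + 1] if i < steps else -math.inf
        if value > left and value > right:
            peaks.append(grid[i])
    return peaks

peaks = local_maxima(a=1.0)
for r in peaks:
    print(f"R = {r:.2f}, F = {total(r, 1.0):.3f}")
```

With these assumed values, \(F\) has one local maximum at \(R = 0\), which can be read as the dispersed, village-like state, and a second, slightly higher one at larger \(R\), separated by a dip. Moving from one to the other means crossing that dip, which is one way to picture discontinuous equilibria.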
Collective Action Problem
Lord Baron assumes per capita costs and benefits, so it can’t account for asymmetric interactions (e.g., free-riders, Tories, the 1%). And, if you can’t get buy-in, the whole system unravels (like a GoFundMe).
This is a problem for Lord Baron as an explanation for the origins of urban agglomerations.
But, what about persistence?!
Read it from left to right, starting with the village equilibrium state.
“Agglomerations, once established, are usually able to survive even under conditions that would not cause them to form in the first place” (Fujita, Krugman, and Venables 1999).
Expectations
Agglomerated systems should persist longer than dispersed systems.
Everyone should be “better off” in an agglomerated system, whether they are
profiting off that system or
trapped in it, having no viable alternative.
Study area
Unit of analysis
Discretized spatial and temporal units.
Population
Estimated for each grid cell using Uniform Probability Density Analysis (Ortman 2016).
Duration
Derived by applying threshold to population reconstruction.
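A rough sketch of the thresholding step, assuming a cell counts as occupied whenever its estimated population is at or above the threshold, and that duration is the length of each unbroken run of occupied time steps. The function name, the threshold value, and the population series are hypothetical.

```python
def occupation_spells(population, threshold):
    """Return (start_index, length) for each run of consecutive time steps
    in which the estimated population is at or above the threshold."""
    spells = []
    start = None
    for i, pop in enumerate(population):
        occupied = pop >= threshold
        if occupied and start is None:
            start = i                      # a new spell begins
        elif not occupied and start is not None:
            spells.append((start, i - start))
            start = None                   # the spell has ended
    if start is not None:
        # Still occupied when the record ends.
        spells.append((start, len(population) - start))
    return spells

# Hypothetical population estimates for one grid cell over ten time steps.
cell = [0, 2, 6, 9, 12, 7, 1, 0, 5, 8]
spells = occupation_spells(cell, threshold=5)
print(spells)  # [(2, 4), (8, 2)]
```

The second spell is still running when the series ends, so its duration is only a lower bound (right-censored).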
Agglomeration
Based on population distribution within travel time \(t\) of a focal grid cell.
If you squint, this looks like a proxy for spatial network centrality.
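One simple way to read this measure, sketched under the assumption that agglomeration is the total population reachable within travel time \(t\) of the focal cell. The measure actually used may weight or summarize the distribution differently; the function name, travel times, and populations below are hypothetical.

```python
def agglomeration(focal, travel_time, population, t):
    """Sum the population of every cell reachable from `focal` within
    travel time `t` (the focal cell itself included)."""
    return sum(
        pop
        for cell, pop in enumerate(population)
        if travel_time[focal][cell] <= t
    )

# Hypothetical travel times (hours) between four grid cells.
travel_time = [
    [0.0, 1.0, 2.5, 6.0],
    [1.0, 0.0, 1.5, 5.0],
    [2.5, 1.5, 0.0, 4.0],
    [6.0, 5.0, 4.0, 0.0],
]
population = [40, 10, 25, 100]

scores = [agglomeration(i, travel_time, population, t=2.0) for i in range(4)]
print(scores)  # [50, 75, 35, 100]
```

Cell 1 has the smallest population of its own but scores well because it sits between two neighbors, which is the sense in which the measure resembles a centrality score.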
\(h(t)\) is the hazard rate: the share of settlements you can expect to be abandoned at \(t\), given that they persisted up to \(t\).
Discrete-time proportional hazards
The hazard rate gives the expectation for \(T\), the time to abandonment, which is normally assumed to be continuous. That assumption is implausible in an archaeological context, so we switch to discrete time and model the hazard rate using ordinary logistic regression.
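Written out, the discrete-time setup looks roughly like this. It is the standard textbook formulation, not something taken from the slides; \(x_t\) stands for a settlement's covariates at time step \(t\), and \(\alpha_t\) and \(\beta\) are the baseline and slope parameters of the logistic model.

\[h(t) = \Pr(T = t \mid T \geq t)\]
\[\Pr(T > t) = \prod_{j=1}^{t} \bigl(1 - h(j)\bigr)\]
\[\log \frac{h(t \mid x_t)}{1 - h(t \mid x_t)} = \alpha_t + x_t^{\top} \beta\]

The first line is the hazard as a conditional probability, the second recovers the probability of surviving past \(t\) from the hazards, and the third is the logistic link.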
We use a Random Forest for this because it does not require an assumption about the distribution of \(T\).
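Whichever classifier is used, discrete-time hazard models are usually fit to data with one row per settlement per time step at risk and a binary outcome marking the step of abandonment. The sketch below shows that expansion; the field names and the two settlements are hypothetical, and the real data would also carry the covariates for each step.

```python
def person_period(settlements):
    """Expand one record per settlement into one row per settlement per
    time step at risk. `event` is 1 only in the step of abandonment."""
    rows = []
    for s in settlements:
        for step in range(1, s["duration"] + 1):
            is_last = step == s["duration"]
            rows.append({
                "id": s["id"],
                "step": step,
                "event": 1 if (is_last and s["abandoned"]) else 0,
            })
    return rows

# Hypothetical settlements: duration in time steps, and whether abandonment
# was observed (False means still occupied when the record ends).
settlements = [
    {"id": "a", "duration": 3, "abandoned": True},
    {"id": "b", "duration": 2, "abandoned": False},
]
rows = person_period(settlements)
for row in rows:
    print(row)
```

Each settlement contributes at most one row with `event = 1` and many with `event = 0`, so the outcome classes in this layout are heavily imbalanced.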
\(X\): maximum agglomeration, maximum population, maize growing degree days (GDD) per time step, precipitation (PPT) per time step, start date, and region.
To handle spatial autocorrelation, the model also includes the first two principal components derived by applying PCA to the full cost-distance matrix (similar to Moran eigenvector spatial filtering, MESF).
The sampsize = c("0" = round(n/5), "1" = n) argument is an overreaction to the huge class imbalance in this case.
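To make the effect of that argument concrete, here is a Python stand-in for stratified sampling with per-class sample sizes. It assumes the argument belongs to R's randomForest, where a `sampsize` vector with one entry per class sets how many rows each tree draws from that class, and it assumes `n` is the number of rows in the rare "1" (abandoned) class. Neither is confirmed by the slides, and the label counts are made up.

```python
import random

def stratified_sample(labels, sampsize, rng):
    """Draw row indices with replacement, separately within each class.
    `sampsize` maps each class label to the number of rows to draw."""
    drawn = []
    for label, size in sampsize.items():
        members = [i for i, y in enumerate(labels) if y == label]
        drawn.extend(rng.choices(members, k=size))
    return drawn

# Hypothetical outcome vector: 1000 "0" rows (persisted) and 20 "1" rows
# (abandoned).
labels = ["0"] * 1000 + ["1"] * 20
n = labels.count("1")  # assumption: n counts the rare class

rng = random.Random(42)
sample = stratified_sample(labels, {"0": round(n / 5), "1": n}, rng)
counts = {c: sum(labels[i] == c for i in sample) for c in ("0", "1")}
print(counts)  # {'0': 4, '1': 20}
```

Under these assumptions each tree sees five abandonment rows for every persistence row, so the imbalance is not just removed but reversed.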
Hazard rate
For illustration purposes.
Lingering issues
Probably in order of importance…
Need to build population reconstruction using deep learning (Reese 2021) to generate estimates using the entire tree ring record, rather than type sites.
A better way of measuring agglomeration.
Need a more fine-grained climate reconstruction. Currently, can only get to approximately 1-km resolution.
Would like to use standard regression for inference.
Might need to include lags.
Acknowledgments
Matt Peeples
Peter Yaworsky
Weston McCool
Josh Watts
The {extendr} crew
And thanks to Andreas and Eleftheria for organizing!