2/7/23
Visualize the amount of some variable across categories, represented using length or height of bars.
Often easier to read when oriented horizontally.
A grouped bar chart can represent higher-dimensional data.
Although this graph is not terribly informative…
Visualize the approximate distribution of a continuous random variable.
Obtained by counting the number of observations that fall into each interval or “bin.”
The shape of the distribution depends on the bin width.
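As a rough illustration of the counting step and of how bin width changes the picture, here is a minimal sketch in plain Python. The `bin_counts` helper, the left-closed/right-open bin convention, and the ages are all assumptions made for this example, not part of the original slides.

```python
def bin_counts(values, bin_width, start):
    """Count observations in consecutive bins [start + k*w, start + (k+1)*w)."""
    n_bins = int((max(values) - start) // bin_width) + 1
    counts = [0] * n_bins
    for v in values:
        # Index of the bin that v falls into (left edge included).
        counts[int((v - start) // bin_width)] += 1
    return counts

# Hypothetical ages, for illustration only.
ages = [21, 23, 24, 28, 28, 31, 35, 42]

narrow = bin_counts(ages, bin_width=5, start=20)   # bins 20-25, 25-30, ...
wide = bin_counts(ages, bin_width=10, start=20)    # bins 20-30, 30-40, ...

print(narrow)  # [3, 2, 1, 1, 1]
print(wide)    # [5, 2, 1]
```

The same eight observations look fairly spread out with 5-year bins and heavily concentrated in the first bin with 10-year bins.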
Generally, a bad idea to use stacked or dodged groupings in a single histogram.
Better to use facets.
Visualize the approximate distribution of a continuous random variable.
Procedure:
The kernels in this figure are not to scale.
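The procedure steps are not reproduced in these notes. One common formulation, assumed here, centers a kernel on each observation, sums the kernels, and rescales so the result integrates to 1. The sketch below uses a Gaussian kernel. The function names, the bandwidth value, and the reuse of a small sample are illustrative choices, not the slides' own.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel (an assumed choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def kde(x, sample, bandwidth):
    """Kernel density estimate at x: average of kernels centered on each observation."""
    n = len(sample)
    total = sum(gaussian_kernel((x - xi) / bandwidth) for xi in sample)
    return total / (n * bandwidth)

# Illustrative sample and bandwidth.
sample = [0.3, 2.0, 3.4, 1.2, 2.2, 1.9]
bandwidth = 0.5

# Evaluate the estimate on a grid and check that it integrates to roughly 1.
step = 0.01
grid = [-5 + i * step for i in range(1501)]  # from -5 to 10
area = sum(kde(x, sample, bandwidth) for x in grid) * step
print(round(area, 3))
```

A larger bandwidth gives a smoother curve and a smaller one gives a bumpier curve, much as bin width does for a histogram.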
There’s no simple answer for how to plot multiple KDEs, but facets are your friend.
Visualize the approximate distribution of a continuous random variable without having to specify a bandwidth.
Consider this sample: (0.3, 2.0, 3.4, 1.2, 2.2, 1.9).
To calculate its eCDF, we divide the number of observations that are less than or equal to each unique value by the total sample size.
eCDFs are a little harder to interpret. The value at x gives the estimated probability of being less than or equal to x. E.g., the probability of being 28 years old or younger is 0.5.
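The eCDF calculation described above can be worked through for the sample (0.3, 2.0, 3.4, 1.2, 2.2, 1.9) with a few lines of plain Python. The `ecdf` function name is just a label chosen for this sketch.

```python
def ecdf(sample):
    """Return (value, proportion of observations <= value) for each unique value."""
    n = len(sample)
    return [(x, sum(1 for v in sample if v <= x) / n) for x in sorted(set(sample))]

sample = [0.3, 2.0, 3.4, 1.2, 2.2, 1.9]
points = ecdf(sample)

for x, p in points:
    print(f"{x:.1f}  {p:.3f}")
# 0.3  0.167
# 1.2  0.333
# 1.9  0.500
# 2.0  0.667
# 2.2  0.833
# 3.4  1.000
```

Reading one row: half of the observations are less than or equal to 1.9, so the eCDF at 1.9 is 0.5.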
Visualize the approximate distribution of a continuous random variable using its quartiles.
Useful for plotting distributions across multiple groups.
When the data are ordered from smallest to largest, the quartiles divide them into four sets of more-or-less equal size. The second quartile is the median!
Sometimes easier to read when oriented horizontally.
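To make the quartile idea concrete, the sketch below computes the three cut points with Python's `statistics.quantiles` and confirms that the second one matches the median. The ages are hypothetical, and the `"inclusive"` method is one of several quartile conventions, so other software may report slightly different values for the first and third quartiles.

```python
import statistics

# Hypothetical ages, for illustration only.
ages = [21, 23, 24, 27, 29, 31, 35, 42]

# n=4 asks for the three cut points that split the data into four parts.
q1, q2, q3 = statistics.quantiles(ages, n=4, method="inclusive")

print(q1, q2, q3)
print(q2 == statistics.median(ages))  # True: the second quartile is the median

# Count how many observations land in each of the four sets.
sizes = [
    sum(1 for a in ages if a <= q1),
    sum(1 for a in ages if q1 < a <= q2),
    sum(1 for a in ages if q2 < a <= q3),
    sum(1 for a in ages if a > q3),
]
print(sizes)
```

A box plot draws the box from the first to the third quartile, with a line at the median.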