The following slides are based on
Histogram: number of records per value bin
Bin size must be chosen appropriately
Example: number of Titanic passengers per age band, bin size 5 years
(a) 1 year: too small
(b) 3 years: ok
(c) 5 years: ok
(d) 15 years: too large
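The effect of bin size can be reproduced with a few lines of code. The following is a minimal sketch, not the code behind the slides: the function name `histogram_counts` and the `ages` list are made up for illustration, and the ages are not the Titanic data.

```python
from collections import Counter


def histogram_counts(values, bin_width):
    """Count records per bin; the bin starting at b covers [b, b + bin_width)."""
    counts = Counter(int(v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical ages for illustration only (NOT the Titanic data).
ages = [2, 4, 18, 19, 21, 22, 22, 24, 27, 29, 31, 35, 38, 44, 52, 61]

# Small bins give many nearly empty bars, large bins hide the shape.
for width in (1, 5, 15):
    print(f"bin size {width}: {histogram_counts(ages, width)}")
```

With bin size 1 almost every bar has height 1, while bin size 15 lumps half of the records into a single bar.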
Density plot: used for continuous values
Better when there are many data points
Which variant fits better is sometimes a matter of taste
(a) Gaussian kernel, bandwidth = 0.5: too peaky
(b) Gaussian kernel, bandwidth = 2: ok
(c) Gaussian kernel, bandwidth = 5: ok
(d) rectangular kernel, bandwidth = 2: too steppy
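A kernel density estimate places one kernel on every data point and averages them: f(x) = 1/(n·h) · Σ K((x − x_i)/h), with bandwidth h. The sketch below illustrates this under stated assumptions: all function names are illustrative, the data are made up, and the bandwidth is used directly as the kernel scale (plotting libraries may define the bandwidth of a rectangular kernel differently).

```python
import math


def gaussian_kernel(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


def rectangular_kernel(u):
    """Uniform density on [-1, 1]; produces the 'steppy' look."""
    return 0.5 if abs(u) <= 1 else 0.0


def kde(x, data, bandwidth, kernel=gaussian_kernel):
    """Kernel density estimate at x: average of kernels centred on the data."""
    total = sum(kernel((x - xi) / bandwidth) for xi in data)
    return total / (len(data) * bandwidth)


# Hypothetical ages for illustration only.
ages = [20.0, 25.0, 30.0]

# A small bandwidth gives separate peaks, a large one smooths them together.
for h in (0.5, 2.0, 5.0):
    print(h, [round(kde(x, ages, h), 4) for x in (20.0, 22.5, 25.0)])
```

The smaller the bandwidth, the deeper the valley between neighbouring data points, which is the "too peaky" effect in panel (a).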
Boxplot: simple and informative
Useful for many distributions in one diagram
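A boxplot is compact because it reduces each distribution to a handful of numbers. The sketch below computes them for one group, assuming the common convention that whiskers extend to the most extreme points within 1.5 times the interquartile range; the slides do not state which convention their figures use, and the function name `boxplot_stats` is illustrative.

```python
import statistics


def boxplot_stats(values):
    """Quartiles, whisker ends and outliers for one boxplot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),    # lowest point inside the fence
        "whisker_high": max(inside),   # highest point inside the fence
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


# Made-up values with one obvious outlier.
print(boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```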
Violin plot: modern variant of the boxplot
Sufficient data points needed
Not clear whether the bars overlap
If the bars are stacked, comparison of female passengers is impaired
If the bars are not stacked but transparent, the meaning of the third color is not clear
A density plot may be better suited
Density line helps interpretation
Beware of scaling of y-axis
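One reading of the y-axis caveat: a histogram of counts and a density curve live on different scales, because a density integrates to 1 while the bar heights sum to the number of records. To draw both on a count axis, the density is multiplied by the number of records and the bin width. A minimal sketch with made-up numbers (the function name is illustrative):

```python
def density_to_expected_count(density, n_records, bin_width):
    """Convert a density value (per unit of x) to a histogram count scale."""
    return density * n_records * bin_width


# Hypothetical example: density 0.03 per year, 1000 records, 5-year bins.
print(density_to_expected_count(0.03, 1000, 5))
```

Alternatively the histogram can be drawn on a density scale, dividing each count by the number of records times the bin width.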
Separate plots are even better
In this case an age pyramid is best
Density estimates of the butterfat percentage in the milk of four cattle breeds
Histograms are not suitable in this case
Mean daily temperatures
Boxplots
Violin plots
Strip plots
Problem: overlapping points
Better: jittered strip plots
Points are randomly moved horizontally
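Jittering can be sketched in a few lines: each point keeps its value and receives a small random horizontal offset around its category position. The function name, the uniform offset distribution and the default width are assumptions for illustration; plotting libraries choose their own defaults.

```python
import random


def jitter(x_positions, width=0.2, seed=0):
    """Shift each x position by a uniform random offset in [-width, width]."""
    rng = random.Random(seed)  # fixed seed keeps the plot reproducible
    return [x + rng.uniform(-width, width) for x in x_positions]


# Three points in category 1 and two in category 2 that would overlap.
categories = [1, 1, 1, 2, 2]
print(jitter(categories))
```

The width should stay well below half the distance between categories so that points are never mistaken for a neighbouring group.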
Sina plots: combination of violin plot and strip plot
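The idea behind a sina plot can be sketched as density-scaled jitter: the horizontal spread allowed for each point is proportional to the estimated density at its value, so the point cloud traces the outline of a violin. This is a simplified illustration with made-up data, a Gaussian kernel and illustrative names, not the algorithm of any particular plotting package.

```python
import math
import random


def gaussian_kde(y, data, bandwidth):
    """Gaussian kernel density estimate at y."""
    norm = len(data) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((y - yi) / bandwidth) ** 2) for yi in data) / norm


def sina_offsets(values, bandwidth, max_width=0.4, seed=0):
    """Horizontal offsets whose maximum size follows the local density."""
    rng = random.Random(seed)
    densities = [gaussian_kde(v, values, bandwidth) for v in values]
    peak = max(densities)
    return [rng.uniform(-1, 1) * max_width * d / peak for d in densities]


# Three close values and one isolated value: the isolated point stays near
# the centre line, the clustered points may spread out.
values = [1.0, 1.1, 1.2, 5.0]
print(sina_offsets(values, bandwidth=0.5))
```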
Ridgeline plots: half violin plots rotated by 90 degrees
Evolution of movie lengths over time