National Weather Service fatality records
This post was inspired by this chart tweeted out by my local NWS office. Let me say off the bat that I have a very high regard for what the National Weather Service does (put down the sharpie), but this data visualization left me a bit — unsatisfied.

Wanting to see if I could take a stab at doing something different, I was surprised to quickly find the data, although it was only available as a PDF. This seemed like a good time to work on my PDF extraction skills.
Thomas Mock recently outlined a method for scraping PDFs on his blog. I tried what he outlined, but was still getting some rough results.
I also tried the {tabulizer} package as detailed here. With apologies to Thomas, this was easier, although I had to do some gyrations to deal with the table being split over two pages.
Edward Tufte has been credited with the term “chartjunk”, which describes unnecessary or downright confusing “extras” added to charts, often obscuring the fact that they would be better off as a table. This NWS chart is suffering from a slight case of chartjunk, but I wonder if there is an equivalent phrase to capture what is going on in this table, with extra comments in what should be columns, as well as column names showing up again at the bottom of the table on the second page.
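To illustrate the "gyrations" for a table split over two pages, here is a minimal base R sketch. It assumes the extractor returned one character matrix per page; `page1`, `page2`, and `fatalities_raw` are hypothetical names, and only three columns and three rows of the table are reproduced. The idea is to stack the pages, take the first row as the header, and drop the header row that reappears at the bottom of the second page.

```r
# Hypothetical stand-ins for the per-page character matrices that a PDF
# table extractor might return (only a small slice of the table is shown).
page1 <- rbind(
  c("Year", "Lightning", "Tornado"),
  c("1940", "340", "65"),
  c("1941", "388", "53")
)
page2 <- rbind(
  c("1942", "372", "384"),
  c("Year", "Lightning", "Tornado")  # column names repeated at the bottom
)

# Stack the two pages, then split off the first row as the header
raw <- rbind(page1, page2)
header <- raw[1, ]
body <- raw[-1, , drop = FALSE]

# Drop any remaining rows that merely repeat the header
body <- body[body[, 1] != header[1], , drop = FALSE]

# Everything is still character at this point; type conversion comes later
fatalities_raw <- as.data.frame(body, stringsAsFactors = FALSE)
names(fatalities_raw) <- header
```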
We now have the equivalent of the PDF table, minus some of the extra bits, but there are still some comments hanging out in our data, and we have some janky column names.
| Year | Lightning | Tornado | Flood | Hurricane | Heat | Cold | Winter | Rip Curr. | Wind | All Hazard | All Wx |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1940 | 340 | 65 | 60 | 51 | Includes | | | | | | |
| 1941 | 388 | 53 | 47 | 10 | categories | | | | | | |
| 1942 | 372 | 384 | 68 | 8 | not listed at | | | | | | |
| 1943 | 432 | 58 | 107 | 16 | right. | | | | | | |
| 1944 | 419 | 275 | 33 | 64 | | | | | | | |
| 1945 | 268 | 210 | 91 | 7 | See U.S. | | | | | | |
I’ll use the excellent {janitor} package to clean up the column names, a little regex to get the text out of the all_events column, and then convert all columns to numeric with the new ‘across’ verb in {dplyr}.
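A minimal sketch of those three steps on a toy slice of the table might look like the following. The column holding the stray comments is an assumption (here `All Hazard`, which becomes `all_hazard` after cleaning), as is the specific regex; the object names `raw` and `clean` are hypothetical.

```r
library(dplyr)
library(janitor)

# Toy slice of the extracted table; every column is still character.
# Which column carries the stray comments is an assumption for this sketch.
raw <- tibble(
  Year = c("1940", "1941", "1944"),
  Lightning = c("340", "388", "419"),
  `Rip Curr.` = c("", "", ""),
  `All Hazard` = c("Includes", "categories", "")
)

clean <- raw %>%
  # Snake-case the names, e.g. `Rip Curr.` becomes rip_curr
  clean_names() %>%
  # Strip anything that is not a digit from the comment-laden column
  mutate(all_hazard = gsub("[^0-9]", "", all_hazard)) %>%
  # Convert every column to numeric; empty strings become NA
  mutate(across(everything(), as.numeric))
```

Stripping non-digits is a blunt instrument: it would also remove decimal points or minus signs, which is acceptable here only because the table holds whole-number counts.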
Now that was a lot of trouble for a simple data table. First, let’s recreate the NWS chart (minus the 3D). This will require some summary stats for the - somewhat arbitrary - time periods compared.
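As a rough sketch of the summary step, assuming the periods compared are trailing 10-year and 30-year averages ending at the most recent year, the means could be computed like this. The numbers in `toy` are placeholders, not the NWS figures, and `period_means` is a hypothetical name.

```r
library(dplyr)

# Placeholder values, NOT the NWS data: 30 years of one hazard
toy <- tibble(
  year = 1990:2019,
  lightning = rep(c(40, 30, 20), each = 10)
)

last_year <- max(toy$year)

# Trailing averages; note the 30-year window contains the 10-year window
period_means <- bind_rows(
  toy %>%
    filter(year > last_year - 10) %>%
    summarise(period = "10-year average", lightning = mean(lightning)),
  toy %>%
    filter(year > last_year - 30) %>%
    summarise(period = "30-year average", lightning = mean(lightning))
)
```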
I think there are a number of issues with the original chart, particularly the ordering of the time periods and the unsorted arrangement of the hazards. Even the title is a bit confusing. Let’s take a stab at these. I’ll use the nice new ggplot function to offset the column names.
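The post does not name the function, so this is a guess: one plausible candidate is `guide_axis()` with `n.dodge`, which staggers axis labels across rows so they do not overlap. The sketch below also sorts the hazards by value with `reorder()`. The data are placeholders, not the NWS figures.

```r
library(ggplot2)

# Placeholder values, NOT the NWS data
toy <- data.frame(
  hazard = c("Lightning", "Tornado", "Flood", "Hurricane", "Rip Curr."),
  fatalities = c(5, 4, 3, 2, 1)
)

# Sort bars from largest to smallest and stagger the x-axis labels
# over two rows (guide_axis() is an assumed choice, see above)
p <- ggplot(toy, aes(x = reorder(hazard, -fatalities), y = fatalities)) +
  geom_col() +
  scale_x_discrete(guide = guide_axis(n.dodge = 2)) +
  labs(x = NULL, y = "Fatalities")
```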

While this is an improvement, I think there are better ways to tackle this information. One thing we could do is show decade trends rather than these averages, which overlap: the 30-year average includes the 10-year average. But why not use all the data? We’ll need to pivot the data to a longer, tidy format.
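A minimal sketch of that pivot with `tidyr::pivot_longer()`, using a few values from the table above. The names `wide`, `long`, `hazard`, and `fatalities` are illustrative choices, not necessarily the ones used for the plots in this post.

```r
library(tidyr)

# A small slice of the cleaned, wide table: one column per hazard
wide <- data.frame(
  year = c(1940, 1941),
  lightning = c(340, 388),
  tornado = c(65, 53),
  flood = c(60, 47)
)

# One row per year-hazard combination
long <- pivot_longer(
  wide,
  cols = -year,            # pivot every column except year
  names_to = "hazard",
  values_to = "fatalities"
)
```

With one row per observation, the hazard can be mapped directly to a facet, a color, or a ridge.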
One way to approach this would be a faceted plot.
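For illustration, a faceted line plot on long-format data could be sketched as below. The free y-axis per panel is an assumption on my part; it suits hazards whose counts differ by an order of magnitude, at the cost of making panels harder to compare directly. The values are a few rows from the table above.

```r
library(ggplot2)

# Long-format slice: one row per year-hazard combination
long <- data.frame(
  year = rep(1940:1942, times = 2),
  hazard = rep(c("lightning", "tornado"), each = 3),
  fatalities = c(340, 388, 372, 65, 53, 384)
)

# One panel per hazard, each with its own y-axis scale
p <- ggplot(long, aes(x = year, y = fatalities)) +
  geom_line() +
  facet_wrap(~ hazard, scales = "free_y") +
  labs(x = NULL, y = "Fatalities")
```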

Getting better - but how about some ridges?

I think you could make an argument for either of these plots. The faceted plot shows actual trends and data points, but the ridge plot gives a better comparison between the groups and shows that most of these patterns look stochastic. However, lightning and rip currents are interesting. Lightning fatalities have dropped dramatically since the 1940s. If we were to make this a per capita rate, the change would be even more dramatic. Speculatively - there may be many fewer people working in the agricultural sector who might be exposed to lightning. Increased weather forecasting may also be playing a role - most of us are carrying around an up-to-date weather forecast in our pockets.

Fatalities related to rip currents show the opposite trend, a steep increase since this metric started to be measured in 2002. This increase is likely related to per capita swimming and beach going, rather than any increase in rip currents themselves. Still, be careful on the beach - I can attest from personal experience that being in a rip current is quite terrifying.

We can also take a quick look at all hazard fatalities - which have trended flat, with the exception of some bad years. Nevertheless, let’s be careful out there folks.
