Data and methods detail: Odds ratio, Yule’s colligation, Chi-square test

Data: Risk Management Agency cause of loss and summary of business data

We used Risk Management Agency (RMA) Cause of Loss data to learn where producers filed claims to recoup drought-related insured losses. In consultation with RMA experts, we used six of the 48 loss codes to identify drought-related losses. The fields we used in our analysis were cause of loss, month and year of loss, county of loss, crop, and policies indemnified. We summarized the data for analysis by whether one or more policies were indemnified for each county, month, year, and crop.

We used RMA’s annual Summary of Business data to learn the years and counties in which different crops were insured. We inferred zeros – county-months without claims – if a crop was insured in a given county and year and if producers did not make a claim. “Missing values” appear on maps when no policies were sold for that crop in that county.
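The zero-inference rule described above can be sketched as follows. This is a minimal illustration and not the code used in the analysis: the function name, the record layouts, and the sample county code are hypothetical, and the sketch assumes that every month of an insured county-year counts as a candidate county-month and that claims for uninsured combinations are ignored.

```python
from itertools import product


def claim_indicators(insured, claims, months=range(1, 13)):
    """Return {(county, year, month, crop): 1 or 0} for insured combinations.

    insured: set of (county, year, crop) tuples (Summary of Business).
    claims:  iterable of (county, year, month, crop, policies_indemnified)
             tuples (Cause of Loss, already filtered to drought loss codes).
    Combinations that were never insured get no key, i.e. they stay missing.
    """
    # Every month of an insured county-year-crop starts as an inferred zero.
    table = {
        (county, year, month, crop): 0
        for (county, year, crop), month in product(insured, months)
    }
    for county, year, month, crop, policies in claims:
        key = (county, year, month, crop)
        # Flag the county-month when at least one policy was indemnified.
        if key in table and policies >= 1:
            table[key] = 1
    return table


# Hypothetical records, for illustration only.
insured = {("31109", 2012, "corn")}
claims = [("31109", 2012, 7, "corn", 14), ("31109", 2012, 8, "corn", 3)]
indicators = claim_indicators(insured, claims)
```

Leaving uninsured combinations out of the dictionary, rather than storing a zero, keeps the distinction between "no claim" and "no policies sold" that the maps rely on.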

Data: Growing season

We used USDA’s “Usual Planting and Harvesting Dates for U.S. Field Crops” for a rough identification of which months are most relevant for each crop and state combination, which produced stronger relationships in the odds ratio analysis. When no data was available, we estimated growing season dates based on data from neighboring states. We used the earliest and latest dates within each state. More comprehensive data at finer spatial resolution would likely yield better results.

Data: Drought status

We used U.S. Drought Monitor county-level data. To convert from weekly to monthly data, we used values from the first map of each month. We assigned a category to each claim based on the worst category of drought affecting any part of the county, and calculated a yes/no “drought” variable. We defined drought according to each of the five different levels, from D0 (abnormally dry) to D4 (exceptional drought).
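The weekly-to-monthly conversion and the yes/no variable can be sketched as follows. This is a sketch under stated assumptions: the record layout, the function names, and the sample values are hypothetical, categories are coded -1 for no drought and 0 to 4 for D0 to D4, and "drought at level Dk" is interpreted as "worst category in the county is Dk or worse", which is one reading of the definition above.

```python
from datetime import date


def monthly_worst_category(weekly):
    """Collapse weekly records to one value per county-month.

    weekly: iterable of (map_date, county, worst_category), where
            worst_category is the worst category anywhere in the county.
    Returns {(county, year, month): worst_category} taken from the first
    map of each month.
    """
    first = {}
    for map_date, county, category in weekly:
        key = (county, map_date.year, map_date.month)
        # Keep the record with the earliest map date in the month.
        if key not in first or map_date < first[key][0]:
            first[key] = (map_date, category)
    return {key: category for key, (_, category) in first.items()}


def in_drought(category, threshold):
    """True if the category is at or above the threshold (0 = D0 ... 4 = D4)."""
    return category >= threshold


# Hypothetical weekly records, for illustration only.
weekly = [
    (date(2012, 7, 10), "31109", 3),
    (date(2012, 7, 3), "31109", 2),
    (date(2012, 8, 7), "31109", 4),
]
monthly = monthly_worst_category(weekly)
july_flags = {t: in_drought(monthly[("31109", 2012, 7)], t) for t in range(5)}
```

Running the analysis once per threshold gives one drought indicator, and therefore one contingency table, for each of the five levels.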

Analysis: Odds ratio and Chi-square test

To see whether the odds of drought-related claims being filed are greater during drought, we created a contingency table comparing county-months with and without drought and with and without drought-related claims (Table 1). Computing an odds ratio allowed us to characterize the odds that drought would result in claims for a given crop and county (Equation 1). A chi-square test (Equation 2), with one degree of freedom in all cases, provided a measure of statistical significance along with the associated expected values and standardized residuals. The expected value for each cell is calculated by multiplying that cell's row total by its column total and then dividing by the total number of cases in the table. In this case, expected values tell us how many county-months would fall in each cell if drought had no effect on claims. The chi-square test compares the observed and expected values. For the chi-square test of independence to be valid, each expected value must be at least five. We mapped odds ratios to detect spatial patterns and added symbology for p-values.

Table 1:

               Claims
Drought     Yes     No
Yes         a       b
No          c       d

a = county-months with drought and with claims
b = county-months with drought and without claims
c = county-months without drought and with claims
d = county-months without drought and without claims

Equation 1: $OR = \frac{a/b}{c/d} = \frac{ad}{bc}$

Equation 2: $\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$

where $O_i$ is the observed value for each cell and $E_i$ is the expected value for each cell.
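The calculations for a single 2x2 table can be sketched in a few lines. This is a minimal sketch, not the code used in the analysis: the function name and the example counts are hypothetical, all row and column totals are assumed to be nonzero, the residuals shown are Pearson residuals, (O - E) / sqrt(E), which may differ from the adjusted form some statistical packages report, and the p-value uses the identity that the upper tail of a chi-square distribution with one degree of freedom equals erfc(sqrt(x / 2)).

```python
import math


def analyze_2x2(a, b, c, d):
    """Odds ratio and chi-square test of independence for a 2x2 table.

    Rows are drought yes/no and columns are claims yes/no, so the cells
    are [[a, b], [c, d]]. Assumes every row and column total is nonzero.
    """
    n = a + b + c + d
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]

    # Expected count = row total * column total / grand total.
    expected = [[row_totals[i] * col_totals[j] / n for j in range(2)]
                for i in range(2)]

    # Chi-square statistic: sum over cells of (O - E)^2 / E.
    chi2 = sum((observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
               for i in range(2) for j in range(2))

    # Pearson residuals, one per cell.
    residuals = [[(observed[i][j] - expected[i][j]) / math.sqrt(expected[i][j])
                  for j in range(2)] for i in range(2)]

    return {
        # The odds ratio is undefined when b or c is zero.
        "odds_ratio": (a * d) / (b * c) if b * c else None,
        "expected": expected,
        "chi2": chi2,
        "residuals": residuals,
        # Upper-tail probability for one degree of freedom.
        "p_value": math.erfc(math.sqrt(chi2 / 2)),
        # Rule of thumb: every expected count should be at least five.
        "valid": min(min(row) for row in expected) >= 5,
    }


# Hypothetical counts, for illustration only.
result = analyze_2x2(a=30, b=70, c=10, d=190)
```

Returning the validity flag alongside the statistic makes it easy to suppress or flag counties whose tables are too sparse for the test.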

Analysis: Yule’s coefficient of colligation

From the 2x2 matrix we also computed Yule’s coefficient of colligation
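For reference, Yule's coefficient of colligation is commonly defined as Y = (sqrt(ad) - sqrt(bc)) / (sqrt(ad) + sqrt(bc)), which rescales the odds ratio to the range -1 to 1. The sketch below uses that standard definition with the same cell labels as Table 1; the function name and example counts are hypothetical, and this is not necessarily how the computation was implemented in the analysis.

```python
import math


def yules_y(a, b, c, d):
    """Yule's coefficient of colligation for the 2x2 table [[a, b], [c, d]].

    Returns None when both a*d and b*c are zero, where Y is undefined.
    """
    root_ad = math.sqrt(a * d)
    root_bc = math.sqrt(b * c)
    if root_ad + root_bc == 0:
        return None
    return (root_ad - root_bc) / (root_ad + root_bc)


# Hypothetical counts, for illustration only.
y = yules_y(a=30, b=70, c=10, d=190)
```

A value of 0 indicates no association, and values near 1 or -1 indicate strong positive or negative association. Unlike the odds ratio, Y stays finite when b or c is zero.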