# Disclaimer On Data

I am a guy who went to school for physics and works as a software engineer. I am not a climate scientist. I am not doing independent research. I just like data. There is no peer-review process for my results and I am not attempting to propagate uncertainties through all of this. The analyses on this site are the result of me playing with climate data and trying to summarize it in ways that are easy for non-scientists to consume. The original intent of my analyses was for me to condense these data into something my wife could understand to figure out where we can move and settle.

If you come across something from NASA, IPCC, etc. that disagrees with me, they are right and I am wrong. Just assume that whenever there's disagreement. Feel free to let me know about it if you want and I'll try to fix it.

# Basic Methods

**Monthly Temperatures**

1965 temperatures are the averages from January 1st, 1951 to December 31st, 1980, and 2005 temperatures are the averages from January 1st, 1996 to December 31st, 2015. Both use the Berkeley Earth data. I resampled all temperatures to a 1/4 degree lat/lon grid using bilinear interpolation. Since the average temperatures were already available on that grid, I used it as the reference.

Projected future temperatures were calculated by aggregating the CMIP5 temperature projections. These were also resampled to the reference 1/4 degree lat/lon grid using bilinear interpolation. The temperature projection at each point is the median of all available projections at that point, so a way to think of it is that half of the models used project a temperature above that value and half project one below it. Each projected year is the 10-year average centered around that year (e.g., 2025 is January 1st, 2020 to December 31st, 2029).
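As a rough illustration of that aggregation at a single grid point, here is a minimal sketch. The data layout (a dict of model name to yearly values), the function names, and the order of operations (average each model over the decade first, then take the median across models) are assumptions made for the example, not a description of the actual pipeline.

```python
from statistics import mean, median

def decade_window(center_year):
    """Years averaged for a projected year, e.g. 2025 -> 2020 through 2029."""
    start = center_year - 5
    return list(range(start, start + 10))

def projected_temperature(model_series, center_year):
    """Median across models of each model's 10-year mean at one grid point.

    model_series is a hypothetical layout: {model name: {year: temperature}}.
    """
    years = decade_window(center_year)
    # Assumption: average within each model first, then take the median.
    per_model_means = [mean(series[y] for y in years)
                       for series in model_series.values()]
    return median(per_model_means)

# Three made-up models with flat temperatures over 2020-2029
models = {
    "model_a": {y: 10.0 for y in range(2020, 2030)},
    "model_b": {y: 12.0 for y in range(2020, 2030)},
    "model_c": {y: 20.0 for y in range(2020, 2030)},
}
result = projected_temperature(models, 2025)
```

With these made-up values the result is the middle model's decade mean, so one model sits above it and one below, which is the "half above, half below" reading of the median.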

To determine the value at a particular city, I used bilinear interpolation on the results grid with the lat/lon returned by Google's API for the city. In situations where bilinear interpolation failed (e.g., at the edges of the data set), I used nearest neighbor interpolation.
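The lookup with its fallback can be sketched roughly as follows. This is only an illustration under stated assumptions: the grid is a plain list of lists on a regular spacing, missing cells are marked with `None`, and the function name and parameters are hypothetical.

```python
import math

def bilinear_or_nearest(grid, lat0, lon0, step, lat, lon):
    """Value at (lat, lon) from a regular grid.

    grid[i][j] is the value at (lat0 + i*step, lon0 + j*step); None marks
    missing data. Falls back to the nearest valid cell when the four
    surrounding corners are not all available.
    """
    fi = (lat - lat0) / step   # fractional row index
    fj = (lon - lon0) / step   # fractional column index
    i, j = math.floor(fi), math.floor(fj)
    rows, cols = len(grid), len(grid[0])

    if 0 <= i < rows - 1 and 0 <= j < cols - 1:
        corners = [grid[i][j], grid[i][j + 1],
                   grid[i + 1][j], grid[i + 1][j + 1]]
        if all(v is not None for v in corners):
            ti, tj = fi - i, fj - j
            low = corners[0] * (1 - tj) + corners[1] * tj
            high = corners[2] * (1 - tj) + corners[3] * tj
            return low * (1 - ti) + high * ti

    # Fallback: nearest cell that has data. Distance is measured in grid
    # index space, which ignores longitude convergence toward the poles.
    candidates = [((r - fi) ** 2 + (c - fj) ** 2, grid[r][c])
                  for r in range(rows) for c in range(cols)
                  if grid[r][c] is not None]
    return min(candidates, key=lambda item: item[0])[1]

grid = [[0.0, 1.0],
        [2.0, 3.0]]
inside = bilinear_or_nearest(grid, 30.0, -100.0, 0.25, 30.125, -99.875)
outside = bilinear_or_nearest(grid, 30.0, -100.0, 0.25, 30.3, -99.7)
```

Here `inside` sits in the middle of the four cells and gets their average, while `outside` lies past the grid edge and takes the value of the closest cell.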

**Daily Temperatures**

Daily temperatures require a lot more data than monthly ones and my laptop cannot handle much, so each daily 'year' only uses a three-year window. For example, '2010' is January 1st, 2009 to December 31st, 2011. To calculate daily metrics, I first resampled all daily temperature data sets to the 1/4 degree lat/lon reference grid. I then calculated each metric for each model separately and took the median across models as the overall value for the metric. As an example, the # of 100 degree days for a given city is not the # of 100 degree days in the median temperature series across models, but the median of each model's # of 100 degree days.

In actual numbers, say you have three models with 101, 97, and 99 F as the temperature on July 1st, 97, 101, and 99 F on July 2nd, and 97, 99, and 101 F on July 3rd. If you find the median temperature for each day you get 99, 99, 99, so you get no 100 degree days. If you instead count the 100 degree days in each model and then take the median, each model projects one 100 degree day in that period, so the median projection is one 100 degree day. I chose the method that yields one 100 degree day for that set. A way to think of it is that half of the models used project more 100 degree days than that value and half project fewer.
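The difference between the two approaches is easy to check with the three-model example above. This is a small sketch: the model names are placeholders, and treating a "100 degree day" as a day at or above 100 F is an assumption.

```python
from statistics import median

# Daily highs (F) for July 1st, 2nd, and 3rd, one list per model
models = {
    "model_a": [101, 97, 97],
    "model_b": [97, 101, 99],
    "model_c": [99, 99, 101],
}
THRESHOLD_F = 100  # assumption: a "100 degree day" means >= 100 F

def count_hot_days(daily_highs):
    return sum(1 for t in daily_highs if t >= THRESHOLD_F)

# Approach 1: take the median temperature per day, then count hot days
median_series = [median(day) for day in zip(*models.values())]
hot_days_from_median_series = count_hot_days(median_series)

# Approach 2 (the one used here): count hot days per model, then take the median
per_model_counts = [count_hot_days(series) for series in models.values()]
median_of_counts = median(per_model_counts)

print(hot_days_from_median_series, median_of_counts)  # prints: 0 1
```

Taking the median per day smooths away every model's hot day because the models disagree on which day it falls on, while counting per model keeps it.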

As with the monthly temperatures, I used Berkeley Earth's data for the historic daily temperatures and the CMIP5 projections for the future ones.

Heat index worked slightly differently. First, all humidity projections were resampled to the 1/4 degree lat/lon grid using bilinear interpolation. I was unable to find humidity projections for all of the CMIP5 models, so I paired the ones that had both temperature and humidity projections, and also took the median humidity projection across all available models for each day in the data set. Then, for models that did not have a paired humidity projection, I used that median for the heat index calculation.

There is also the question of 'which daily humidity corresponds with which daily max temperature?' Heat index goes up as humidity goes up, so you get a higher heat index if you use max humidity than if you use average or min humidity. It's not perfect, but the assumption I made is that daily temperature is highest when relative humidity is lowest. That tends to happen because, for a fixed amount of water vapor in the air, relative humidity goes down as temperature goes up. The result is that the heat index values I calculated will tend towards being slightly lower than they actually were/will be.

Heat index is an inexact definition, so I used the long definition here and checked my values against the NOAA's table. All temperatures below 79 degrees were thrown out as the calculation blows up for certain temperature ranges.
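For a sense of what the calculation looks like, here is a minimal sketch that assumes the "long definition" is the Rothfusz regression published by NOAA's National Weather Service. That choice, the function names, and the decision to return `None` for discarded temperatures are assumptions for illustration. NOAA's low-humidity and high-humidity adjustment terms are left out.

```python
def heat_index_f(temp_f, rel_humidity_pct):
    """Heat index (F) from the Rothfusz regression, without NOAA's adjustments.

    Returns None below 79 F, mirroring the cutoff described above.
    """
    if temp_f < 79:
        return None
    t, rh = temp_f, rel_humidity_pct
    return (-42.379
            + 2.04901523 * t
            + 10.14333127 * rh
            - 0.22475541 * t * rh
            - 0.00683783 * t * t
            - 0.05481717 * rh * rh
            + 0.00122874 * t * t * rh
            + 0.00085282 * t * rh * rh
            - 0.00000199 * t * t * rh * rh)

def daily_heat_index(max_temp_f, min_rel_humidity_pct):
    """Pair the daily max temperature with the daily min relative humidity."""
    return heat_index_f(max_temp_f, min_rel_humidity_pct)

example = heat_index_f(90, 70)    # roughly 106 F
discarded = heat_index_f(75, 50)  # None, below the 79 F cutoff
```

The 90 F / 70% case lands at about 106 F, in line with the value NOAA's heat index chart gives for that combination.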

Again, to determine the value at a given city, I used bilinear interpolation on the results grid with the lat/lon returned by Google's API for the city and fell back to nearest neighbor interpolation when it failed.

**City Rankings**

There is a massive amount of data out there for each city and I wanted some way to aggregate it into simple metrics for comparison. The city rankings are the result. The definitions of the scores are subject to change but at the time of writing they are:

- Heat
  - Roughly interpret as how unbearable the heat will be throughout the 21st century
  - (# of months with max temp > 95 degrees on average/2) + (# of months with min temp > 80 degrees on average/2) + (# of heatstroke days in 2025/50) + (# of heatstroke days in 2050/50) + (# of heatstroke days in 2095/50) + (# of 100 degree days in 2095/100) + (# of 80 degree nights in 2095/100) + ((hottest month average high - 100)/3) + (numeric representation of Climate Central's state grade on a scale from 0 to 1)
  - For reference, Austin, TX in the 2011 heat wave is a B and Phoenix, AZ at its hottest currently is a C

- Drought
  - Roughly interpret as how much the area will tend towards droughts throughout the 21st century compared with present day
  - (numeric representation of EPA drought rating from model 1 on a scale from 0 to 2) + (numeric representation of EPA drought rating from model 2 on a scale from 0 to 2) + (21st century projected drop in precipitation from EPA/12.5) + (21st century projected drop in precipitation from CMIP5/12.5) + (# of months with average max temperatures above 95 degrees/2) + (numeric representation of current drought state on a scale from 0 to 2) + (numeric representation of Climate Central's state grade on a scale from 0 to 1)

- Coastal Flooding
  - Roughly interpret as how much of the metro area might be affected by sea level rise in the 21st century
  - (100 - # of km from coastline) + (1/100)*(100 - # of m above sea level)^2 + (numeric representation of Climate Central's state grade on a scale that makes it 1/6 of the total score)

- Earthquake
  - Roughly interpret as the max acceleration you'd expect in the area from earthquakes in the 21st century
  - Simply the projection from the source for earthquake data

- Inland Flooding
  - Roughly interpret as how much worse extreme rain and flooding events will be throughout the 21st century
  - (numeric representation of EPA inland flood damage projections on a scale from 0 to 2) + ((EPA absolute extreme rain event increase)^2)/18 + ((CMIP5 relative extreme rain event increase)^2)/312.5 + (numeric representation of Climate Central's state grade on a scale from 0 to 1)

To convert each of these to a grade, I put it on a scale with 100 as the max and binned 0 - 20 = A, >20 - 40 = B, etc. Note that I did not force there to be a zero value. This is most apparent in inland flooding, as even drought areas are projected to have more intense rainfalls, just less often. To calculate the total score, I compute the weighted RMS (root mean square) of the individual scores (weights can be selected using filters) and adjust so that the max is 100. I discount the impact of inland flooding and earthquake risks by default because, in my opinion, drought, heat, and sea level rise are existential threats that potentially make areas uninhabitable while earthquake and inland flooding risks are not, but feel free to edit those in the filters.
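A rough sketch of the binning and the combined score follows. The function names are hypothetical, and two details are assumptions: that the weighted RMS is normalized by the sum of the weights, and that "adjust so that the max is 100" means rescaling so the highest-scoring city gets 100.

```python
import math

def grade_bin(score):
    """Bin index for a 0-100 score: 0 covers 0-20, 1 covers >20-40, and so on.

    Per the text, bin 0 is an A and bin 1 is a B; the later letters are
    not spelled out there, so this returns the index only.
    """
    return max(0, min(4, math.ceil(score / 20) - 1))

def weighted_rms(scores, weights):
    """Weighted root mean square, normalized by the total weight (assumption)."""
    total_weight = sum(weights)
    return math.sqrt(sum(w * s * s for s, w in zip(scores, weights)) / total_weight)

def rescale_to_max_100(totals):
    """Scale a list of total scores so the largest one becomes 100 (assumption)."""
    top = max(totals)
    return [100 * t / top for t in totals]

# Made-up scores for one city: heat, drought, coastal flooding
city_total = weighted_rms([30, 40, 10], [1, 1, 1])
rescaled = rescale_to_max_100([10, 20, 40])
```

Because RMS squares each score before averaging, one very bad category pulls the total up more than it would in a plain weighted average.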

**General Metrics**

Metro area populations are the populations of the combined statistical areas that the cities are a part of, as of December 2016, according to Wikipedia.

# Known Issues

- Cities tend to be hotter than the surrounding area. All data sets are on fixed lat/lon grids that average over the entire area, so temperatures are slightly underestimated for the centers of cities.
- Coastal areas tend to have more stable temperatures due to the ocean. All data sets are on fixed lat/lon grids that average over the entire area, so temperatures here are slightly less volatile than they are in reality in coastal cities.
- As noted above, daily minimum relative humidity values were paired with daily maximum temperature values to calculate heat index. Since it's not necessarily true that these two always occur together, heat index values are likely to have been slightly underestimated.

# Data Sources

- **CMIP5 data:** ftp://gdo-dcp.ucllnl.org/pub/dcp/archive/
- **Berkeley Earth data:** http://berkeleyearth.org/data/
- **EPA data:** https://www.epa.gov/cira/climate-action-benefits-methods-analysis#framework
- **More CMIP5 data:** http://maca.northwestknowledge.net/data_portal.php
- **Climate Central state grades:** http://reportcard.statesatrisk.org/report-card
- **Earthquake projections:** https://earthquake.usgs.gov/hazards/designmaps/datasets/
- **BEA Cost of living data:** https://www.bea.gov/newsreleases/regional/rpp/rpp_newsrelease.htm