The Causal Effect of the Maui Wildfire on Maui's Unemployment Rate with CausalImpact (python) and Dynamic Time Warping (tslearn)
This isn't actually a picture of Maui but from the lava flow on the Big Island- where the lava met the forest. I think it conveys the same message though.
Here's a short tutorial on evaluating the estimated casual effect from the recent Maui wildfires on the local unemployment rate using data from the Bureau of Labor Statistics (BLS).
Rather than measure the correlation or association, here I'll dive into causality using the python implementation of the CausalImpact library and use the Dynamic Time Warping algorithm from tslearn in an attempt to choose the most similar counties to measure against as controls ( a form of Market Matching ).
Please find the code in my GitHub repo: https://github.com/sam-tritto/maui-wildfires-unemployment-rate
OUTLINE
Intro to Causality & It's Assumptions
A Simple Causal Impact Example
The Bureau of Labor Statistics Dataset
Setting Up the Python Environment
Pre and Post Intervention Periods
Market Matching with Dynamic Time Warping
Data Pre-Processing
Causal Impact and the Estimated Causal Effect
Intro to Causality & It's Assumptions
First, let's provide a concise overview of causality and causal inference. The central question we aim to address is the impact of the Maui Wildfires on Maui's Unemployment Rate, with a focus on understanding the causal relationship, marked by the term "effect." When discussing the effects of an event, we delve into causality rather than mere correlation, emphasizing the implied directionality in the relationship between variables.
Evaluating causality can be challenging, and two primary study types exist for this purpose: Observational and Experimental. Experimental studies, often considered the gold standard and referred to as Randomized Control Trials (RCTs) or A/B tests, employ randomness to ensure the similarity of control and pilot groups. This similarity allows researchers to attribute any outcome differences to the event under study. While highly controlled, experimental studies demand meticulous planning due to their complexity.
On the other hand, Observational studies are more accessible to implement, even post-event occurrence. However, their ability to establish causality is limited. To address this limitation, we turn to slightly more advanced modeling methods. Given that the Maui Wildfires constitute a naturally occurring phenomenon, our study falls into a subtype of Observational studies known as Natural Observational Studies. This subtype involves observing events in their natural context and contributes valuable insights, despite the inherent challenges associated with establishing causal relationships.
Causal modelling is an evolving field and there are many different options available. Many of these are regression based and rely on establishing a DAG (directed acrylic graph) and accounting for many potential confounders, mediators, and instrumental variables. They also have many strict assumptions that need to be met. The method that I'll be using here is a much simpler Bayesian approach sometimes called "spike and slab". "Spike and slab" is a Bayesian modeling technique that combines elements of sparse modeling with a mixture of distributions. The term "spike" refers to a probability distribution centered around zero (a "spike" at zero), which is used to model the null hypothesis or the absence of an effect. The term "slab" refers to a wider distribution that allows for non-zero effects. In essence, this approach simultaneously considers the possibility of no effect (the spike) and the possibility of an effect (the slab). The balance between the spike and slab components is determined by the data during the modeling process.
One of the most widely established packages for this type of causal modeling is the CausalImpact library origianally developed by Google engineers and available in R. It is designed to estimate the causal effect of a predefined intervention or treatment on a response variable by comparing the observed data to a counterfactual prediction. Here I'll be using the python implementation which is built off of tensorflow. CausalImpact employs Bayesian Structural Time Series models, which capture the underlying patterns and components of a time series, such as trend, seasonality, and regression effects. The library generates a counterfactual time series, representing what the response variable would have been in the absence of the intervention. This is based on the data observed before the intervention.
The CausalImpact library, like any causal inference method, relies on certain assumptions to provide accurate and meaningful results. Understanding and validating these assumptions is crucial for the reliable interpretation of the causal impact estimates. Here are the key assumptions associated with the CausalImpact library:
Stationarity of Pre-Intervention Period:
The pre-intervention period should exhibit a stable and stationary behavior. This means that the patterns and trends observed before the intervention should be representative of the counterfactual scenario in the absence of the intervention.
Additivity of Effects:
The effect of the intervention is assumed to be additive. This means that the total effect on the response variable is the sum of individual effects.
No Other Concurrent Interventions:
The analysis assumes that there are no other significant interventions or external factors influencing the response variable during the post-intervention period. The causal impact estimates are attributed solely to the specified intervention.
Correct Model Specification:
The chosen structural time series model adequately captures the underlying patterns and components of the time series data, such as trend, seasonality, and regression effects. Incorrect model specification may lead to biased estimates.
Exogeneity of Predictors:
The predictor variables used in the model are exogenous, meaning they are not affected by the intervention and do not influence the response variable directly.
Temporal Independence:
The residuals (the differences between the observed and predicted values) are assumed to be temporally independent. This assumption is crucial for the validity of statistical inferences.
No P-Hacking or Data Snooping:
The analysis assumes that the pre-intervention and post-intervention periods were chosen a priori, without knowledge of the intervention's impact. This helps avoid biases introduced by data snooping or p-hacking.
Common Trends:
There should be common trends between the pre-intervention and post-intervention periods. This assumption ensures that the counterfactual prediction is based on relevant historical patterns.
It's important to note that violating these assumptions can lead to biased estimates and unreliable causal impact results. Therefore, before using the CausalImpact library, it's recommended to carefully examine and validate these assumptions based on the characteristics of your data and the context of the intervention.
A Simple Causal Impact Example
Before we dive into the Maui Wildfire example, which utilizes many covariates and timeseries in building it's counterfactual syntheic control, I thought it might be helpful to look at a simple example first with only one timeseries. Here I've gotten some data from the Google Trends site where we can investigate the effect of ChatGPT on the interest in AI. Here is a chart from Google Trends showing the interest over time in the two seach terms; AI in blue and ChatGPT in red. You can clearly see that since ChatGPT was relased there has been an increased interest in AI - let's quantify it!
We are going to investigate specifically the blue line and run this timeseries through our CausalImpact library with the release date of ChatGPT as the intervention date. Below is the resulting chart where the intervention date is the grey dashed vertical line. The model creates a counterfactual prediction line in orange, which is what our model thinks the interest in AI would have been had ChatGPT not been released. The black line is the actual interest in AI. And the difference between these two is the estimated causal impact as seen in the middle chart. The bottom chart is the cumulative effect on the interest in AI.
The model also prints out a few really nice summaries, here is the executive summary below. From this you can gather the estimated causal effect of 42.76 with it's associated 95% credible interval of [40.24, 45.0]. Since this is a Bayesian model this interval represents that there is a 95% probability that the effect is in this interval. At the bottom you can even gather a Bayesian equivalent p-value which for us here is 0.0.
The Bureau of Labor Statistics Dataset
Now we can explore a slightly more complicated example with multiple covariates. We'll start by gathering unemployment data from the BLS. We can do this many ways. They have a robust API with a python client, many accessible web tools, and even though the Moody's Analytics Excel Add-On. Since this is a one time data pull I'll use the web tools to manually download the needed data, otherwise I might opt for the python API.
You can use this link to navigate to the BLS Data. There are many different datasets avaiable and also many different ways to interact with each dataset. I'll be using the LAUS data here which contains data at the county level. For manual pulls I like to use the One Screen option which provides a nice filterable interface. I've gathered the unemployment data at the county level, and since this is focused on Maui's unemployment I've chosen to only download counties in Hawai'i, California, and Florida as I thought that they might have similar employment trends due to the temperature and prevelance of their hospitality industries.
The end result, which you can download from my GitHub repo, is a nice table with 129 counites unemployment rates going back each month since 2013. Right off the bat you'll notice that all of the counties do not have similar unemployment rates, which is why we'll be using Dynamic Time Warping to Market Match later on.
Setting Up the Python Environment
I thought it might also be worth covering how to set up a python environemnt to work with these libraries. Since the python implementation of CausalImpact relies on tensorflow libraries, these can often be a bit tricky to install on your machine. This installation can change depending on which OS you are running. I'm working on a Mac with an M1 chip and through VS Code with Anaconda. If that's similar to your setup you can install my environment.yaml file found in my github. Or just try to match these versions the best you can.
Pre and Post Intervention Periods
Once that's taken care of we can get to work on formatting the data. The first thing we'll need to do is get the Date in a datetime format. Below you can see some pretty straightforward pandas manipulations, there's probably a more elegant way to accomplish this - but if it works it works!
Next, we'll need to split the data into pre and post intervention periods. The BLS data is aggregated at the monthly level, so each record represents the unemployment rate at the end of the month. The Maui Wildfire was first reported on August 8th, 2023, which is closer to the start of the month. For this analysis I'll consider the intervention date to be August 1st, 2023, since the unemployment would have likely increased the last 3 weeks of the month and then be reported at the end of the month. So the pre period will be any date prior to August 1st, 2023 and the post period will include the data from August. CausalImpact expects these dates in two lists, seen below.
Market Matching with Dynamic Time Warping
Since we currently have 129 counties in our dataset, we'll want to limit the data to only the counties that are most similar to our pilot county (Maui). The CausalImpact library is going to use these counties' data to try to work out what the counterfactual is post intervention. This is what Maui's unemployment rate would have been, had there not been a wildfire. So we'll want to feed the model only counties' data which is most similar to Maui's. This process is sometimes called Market Matching. There is a fantastic R library called, MarketMatching, which combines both the CausalImpact library with the dtw library in an easy to use wrapper. The workflow I'm presenting here is intended to be a similar python implementation of this. For the Dynamic Time Warping algorithm I'll use the tslearn library, which has a host of helpful functions for complex time series tasks in python.
To understand what is Dynamic Time Warping you can refer to the image below, where you can see the two time series matching on the "warping curve" rather than linearly.
From An introduction to Dynamic Time Warping by Romain Tavenard.
Specifically we'll be using the KNeighborsTimeSeries() function to choose our control counties. First, we'll need to split the data up into test and train sets where the test set is Maui county and the train sets are all of the other avaiable counties. We'll also want to focus on the similarity of the pre intervention period so we can also subset by that. We'll also MinMax scale the time series data to be between 0.0 and 1.0, similar to how you would scale before any KNN algorithm. The tslearn library also has a handy to_time_series_dataset() function to convert pandas data frames into a time series numpy array which it requires for modeling.
Once the data has been scaled and transformed we can fit the KNeighborsTimeSeries() function. I've chosen to find the top 30 closest counties to Maui and also set the metric parameter to "dtw" which is short for Dynamic Time Warping. The function outputs two arrays; one containing the indexes of the chosen counties sorted by similarity and the other their respective distances to Maui county's unemploymnet rates. The indexes can be matched up to the order that was fed into the algorithm. When ispecting these 30 chosen counties it makes sense to see the other Hawai'i counties at the top of the list. More supprisingly Monroe County, FL has also made it to the top of the list, but a quick search reveals that this is the county for Key West and the rest of the Florida Keys... so that actually makes a lot of sense. At this point we have now narrowed down our control list to 30 of the most similar counites in terms of unemployment rates.
Data Pre-Processing
The last step before modeling is to transform the data into it's required format. I've set the date to be the index and have pivoted each county to be it's own column. One really important note here is that the CausalImpact library expects the pilot to be the first column and all the covariates in the following columns. In our example the covariates are the 30 selected counties with similar unemployment rates. CausalImpact is very flexible here and if you wanted to include other time series covariates such as population, demographics, GDP, etc... it wouldn't complain. Just make sure that the data is all on the same scale.
Causal Impact and the Estimated Causal Effect
This last step couldn't be easier and this is the reason why I really love working with this library. Here you can see the output of the below call to the model. Note there are only 2 months of post intervention here, which is admittedly short. There are also some opther optional parameters (nseasons and seasonal_duration for seasonality trends and prior_level_sd for controling the prior distributions standard deviation) we could work with here which I am not using.
Reading the summary we can see that the Maui Wildfires on average have casued an increase of 3.66 percentage points to their unemployment rate with a 95% probability between 1.45 and 5.87 and associated p-value of 0.0.
That's it. Now you have a workflow similar to that from the MarketMatching library in R, but implemented in python.