Welcome to my data portfolio.
Here you can find a sample of the personal projects I've completed throughout my learning journey.
I’m currently a Data Scientist on the People Analytics team at The Home Depot and a Georgia Tech graduate with an M.S. in Analytics.
I’m a Python enthusiast with an interest in Bayesian Statistics, MLOps, Causal Inference, and Experimental Design. I’m also a drummer with an immense passion for creating and listening to music, so you’ll find music in many of my personal projects, too! Feel free to get in touch with me below.
Featured Projects
The Causal Effect of the Maui Wildfires on the Unemployment Rate with CausalImpact (python) and Dynamic Time Warping (tslearn)
Here's a short tutorial on estimating the causal effect of the recent Maui wildfires on the local unemployment rate using data from the Bureau of Labor Statistics (BLS). Rather than measure correlation or association, here I'll dive into causality using the Python implementation of the CausalImpact library, and use the Dynamic Time Warping algorithm from tslearn in an attempt to choose the most similar counties to measure against as controls (a form of Market Matching).
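To give a flavor of the control-selection idea, here's a minimal sketch of Dynamic Time Warping used to rank candidate counties by similarity to a treated series. It is a hand-rolled, pure-Python version using an absolute-difference cost, not the tslearn implementation the tutorial uses, and the helper names (`dtw_distance`, `rank_controls`), the county labels, and all of the numbers are made up for illustration (they are not BLS data).

```python
from math import inf


def dtw_distance(a, b):
    """Classic dynamic-programming DTW with an absolute-difference cost.

    Illustrative only: library implementations may use a different local
    cost (e.g. squared differences), so the scale of the result can differ.
    """
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def rank_controls(treated, candidates, k=2):
    """Return the names of the k candidate series closest to `treated`."""
    scored = sorted(
        (dtw_distance(treated, series), name)
        for name, series in candidates.items()
    )
    return [name for _, name in scored[:k]]


# Hypothetical pre-period unemployment rates (placeholder values only).
treated = [3.0, 3.1, 2.9, 3.2, 3.0]
candidates = {
    "county_a": [3.0, 3.0, 3.1, 2.9, 3.2],
    "county_b": [5.0, 5.2, 5.1, 5.3, 5.0],
    "county_c": [3.1, 3.2, 3.0, 3.3, 3.1],
}

controls = rank_controls(treated, candidates, k=2)
print(controls)
```

The counties selected this way would then serve as the control series, with matching done on pre-event data only so the wildfire period doesn't influence which controls are chosen.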
Talk to Your BigQuery Data with GCP (VertexAI PaLM 2 LLM) and LangChain
Here's a short tutorial on how you can set up an LLM on GCP to talk to your BigQuery data through VertexAI using the PaLM 2 LLM... otherwise known as Table Q&A. I'll store a sample HR Attrition dataset from IBM in BigQuery and then set up an LLM so we can chat with the data. We'll be able to ask it simple questions and validate its answers, all in only a few lines of code.
MLOps on GCP: Upcoming Local Shows Playlist (DataOps)
This will be Part 1 of a tutorial on how to create a simple Flask web app, which will ultimately help a user create a playlist on their Spotify account containing the most popular songs from artists that will be playing in their area in the upcoming months. Part 1 will set up a simple ETL data process through GCP, focusing on pulling data from the APIs of both Spotify and SeatGeek, combining the data, and then uploading/automating the process through GCP using App Engine, Cloud Scheduler, Cloud Storage, and Secret Manager.
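As a rough illustration of the "combining the data" step, here's a minimal sketch that joins upcoming events to an artist's top tracks by normalized artist name. The record shapes, field names (`artist`, `venue`, `date`), the `combine` helper, and the sample values are all assumptions for illustration; they don't reflect the actual Spotify or SeatGeek response formats, and no API calls are made here.

```python
def normalize(name):
    """Lowercase and collapse whitespace so artist names match across sources."""
    return " ".join(name.lower().split())


def combine(events, top_tracks):
    """Join events to tracks on normalized artist name.

    events:     list of dicts with hypothetical keys artist/venue/date
    top_tracks: dict mapping artist name -> list of track titles
    Events whose artist has no track entry are skipped.
    """
    tracks_by_artist = {normalize(a): t for a, t in top_tracks.items()}
    rows = []
    for event in events:
        for track in tracks_by_artist.get(normalize(event["artist"]), []):
            rows.append(
                {
                    "artist": event["artist"],
                    "venue": event["venue"],
                    "date": event["date"],
                    "track": track,
                }
            )
    return rows


# Placeholder records standing in for already-fetched API results.
events = [
    {"artist": "Example Band", "venue": "Venue One", "date": "2024-01-15"},
    {"artist": "Unknown Act", "venue": "Venue Two", "date": "2024-02-01"},
]
top_tracks = {"example  band": ["Song A", "Song B"]}

playlist_rows = combine(events, top_tracks)
for row in playlist_rows:
    print(row)
```

Normalizing names before joining is a simple guard against casing and spacing differences between the two sources; a real pipeline would likely need something sturdier, such as matching on a shared artist ID where one is available.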