Here I'll evaluate the new Gemini 2.0 Flash Thinking model against the MATH Vision dataset using the LLM-as-a-Judge technique. I'll show that this new model has impressive state-of-the-art accuracy and just might be the best-scoring model available right now.
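For context, the grading loop itself is only a few lines. Here's a minimal sketch of the LLM-as-a-Judge pattern; the judge model, API key handling, and prompt wording are illustrative assumptions, not the exact setup from the post:

```python
# Minimal LLM-as-a-Judge sketch (model name and prompt are assumptions).
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder credential
judge = genai.GenerativeModel("gemini-1.5-pro")  # assumed judge model

def judge_answer(question: str, ground_truth: str, model_answer: str) -> bool:
    """Ask the judge LLM whether the candidate answer matches the ground truth."""
    prompt = (
        "You are grading a math answer. Reply with exactly CORRECT or INCORRECT.\n"
        f"Question: {question}\n"
        f"Ground truth answer: {ground_truth}\n"
        f"Candidate answer: {model_answer}"
    )
    response = judge.generate_content(prompt)
    return response.text.strip().upper().startswith("CORRECT")
```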
Here's a short tutorial on estimating the causal effect of the recent Maui wildfires on the local unemployment rate using data from the Bureau of Labor Statistics (BLS). Rather than measure correlation or association, I'll dive into causality using the Python implementation of the CausalImpact library, and use the Dynamic Time Warping algorithm from tslearn to choose the most similar counties to measure against as controls (a form of Market Matching).
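In miniature, the matching-plus-inference loop looks something like this. The column names, number of controls, and exact date windows below are assumptions for illustration:

```python
# Sketch: pick control counties by DTW similarity, then fit CausalImpact.
# Assumes a DataFrame `df` of monthly unemployment rates, one column per county,
# indexed by date (hypothetical data layout).
from tslearn.metrics import dtw
from causalimpact import CausalImpact

target = "Maui"  # hypothetical column name
candidates = [c for c in df.columns if c != target]

# Rank candidate counties by DTW distance over the pre-wildfire window.
pre = df.loc[:"2023-07-31"]
distances = {c: dtw(pre[target].values, pre[c].values) for c in candidates}
controls = sorted(distances, key=distances.get)[:3]  # 3 closest matches

# CausalImpact expects the response in the first column, controls after it.
data = df[[target] + controls]
ci = CausalImpact(data, ["2020-01-01", "2023-07-31"], ["2023-08-01", "2024-03-31"])
print(ci.summary())
```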
Here's a short tutorial on how you can set up an LLM on GCP to talk to your BigQuery data through Vertex AI using the PaLM 2 LLM... otherwise known as Table Q&A. I'll store a sample HR Attrition dataset from IBM in BigQuery and then set up an LLM so we can chat with the data. We'll be able to ask it simple questions and validate its answers, all in only a few lines of code.
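At its simplest, the pattern is: ask the LLM to write SQL, then run that SQL against BigQuery. A minimal sketch follows; the project and table IDs are placeholders, and note that PaLM 2's text-bison model has since been deprecated in favor of Gemini:

```python
# Sketch: natural-language question -> SQL via PaLM 2 -> run against BigQuery.
import vertexai
from vertexai.language_models import TextGenerationModel
from google.cloud import bigquery

vertexai.init(project="my-gcp-project", location="us-central1")  # placeholders
llm = TextGenerationModel.from_pretrained("text-bison")
bq = bigquery.Client()

TABLE = "my-gcp-project.hr.ibm_attrition"  # hypothetical table id

def ask(question: str):
    prompt = (
        f"Write a BigQuery Standard SQL query against `{TABLE}` that answers:\n"
        f"{question}\nReturn only the SQL."
    )
    sql = llm.predict(prompt, temperature=0).text
    return bq.query(sql).to_dataframe()

print(ask("What is the overall attrition rate?"))
```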
Here is a comprehensive end-to-end project where I analyze survey data to find the measurable company metrics most relevant to the survey, to use as proxy metrics for measurement throughout the year. The survey responses are also fed into Google's Gemini model for sentiment analysis and topic summarization, at scale.
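The per-response scoring step looks roughly like this. A hedged sketch: the model name, prompt, and `survey_comments` list are assumptions for illustration:

```python
# Sketch: classify sentiment and summarize topic for each survey comment.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder credential
model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model name

def analyze(comment: str) -> str:
    prompt = (
        "Classify the sentiment of this employee survey comment as "
        "POSITIVE, NEUTRAL, or NEGATIVE, then summarize its topic in 5 words.\n"
        f"Comment: {comment}"
    )
    return model.generate_content(prompt).text

# survey_comments is a hypothetical list of free-text responses.
results = [analyze(c) for c in survey_comments]
```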
This will be Part 1 of a tutorial on how to create a simple Flask web app, which will ultimately help a user create a playlist on their Spotify account containing the most popular songs from artists playing in their area in the upcoming months. Part 1 will set up a simple ETL data process on GCP, focusing on pulling data from the APIs of both Spotify and SeatGeek, combining the data, and then uploading/automating the process through GCP using App Engine, Cloud Scheduler, Cloud Storage, and Secret Manager.
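To give a feel for the extract step, here's a rough sketch of pulling upcoming concerts from SeatGeek and matching them to artists on Spotify. The city filter and credential names are placeholders; in the real pipeline the credentials come from Secret Manager:

```python
# Sketch of the extract step: local events from SeatGeek, top tracks from Spotify.
import requests
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(
    client_id="SPOTIFY_ID", client_secret="SPOTIFY_SECRET"))  # placeholders

events = requests.get(
    "https://api.seatgeek.com/2/events",
    params={"client_id": "SEATGEEK_ID",        # placeholder credential
            "venue.city": "Honolulu",          # assumed city filter
            "taxonomies.name": "concert"},
).json()["events"]

for event in events:
    artist = event["performers"][0]["name"]
    hits = sp.search(q=artist, type="artist", limit=1)["artists"]["items"]
    if hits:
        top = sp.artist_top_tracks(hits[0]["id"])["tracks"]
        print(artist, [t["name"] for t in top[:3]])
```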
An implementation of Student's Paired t-Test for Means from end to end. This test is the appropriate test for comparing the means of one group sampled twice (once before and once after an intervention) with small-ish to large sample sizes in an A/B Testing scenario.
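With scipy this is a one-liner once you have the paired samples. A quick sketch on synthetic data:

```python
# Paired t-test: the same 30 users measured before and after an intervention.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
before = rng.normal(10.0, 2.0, size=30)
after = before + rng.normal(0.5, 1.0, size=30)  # intervention adds ~0.5 on average

t_stat, p_value = stats.ttest_rel(after, before)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```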
An implementation of Student's Unpaired t-Test for Means from end to end. This test is the appropriate test for comparing the means between 2 independent but similar groups with small-ish to large sample sizes in an A/B Testing scenario.
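Again a one-liner in scipy, this time on two independent samples. A quick sketch on synthetic data (the default assumes equal variances, which is the Student's version; `equal_var=False` would give Welch's):

```python
# Unpaired (two-sample) Student's t-test on two independent groups.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
control = rng.normal(10.0, 2.0, size=40)
treatment = rng.normal(10.8, 2.0, size=40)

t_stat, p_value = stats.ttest_ind(treatment, control)  # equal_var=True by default
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```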
An implementation of the Z-Test for Proportions from end to end. This test is the appropriate test for comparing the proportion of binary data between 2 independent groups with large (and possibly different) sample sizes in an A/B Testing scenario.
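statsmodels makes this straightforward; a quick sketch with illustrative counts:

```python
# Two-sample z-test for proportions: conversions in two groups of different sizes.
from statsmodels.stats.proportion import proportions_ztest

successes = [230, 198]   # conversions in variant A and B (illustrative numbers)
nobs = [2000, 1500]      # large, different sample sizes

z_stat, p_value = proportions_ztest(count=successes, nobs=nobs)
print(f"z = {z_stat:.3f}, p = {p_value:.4f}")
```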
An implementation of Fisher's Exact Test for Proportions from end to end. This test is the appropriate test for comparing the proportion of categorical data between 2 independent groups with small sample sizes in an A/B Testing scenario. I'll also go over Barnard's and Boschloo's Exact tests which are both considered improvements to Fisher's test.
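All three exact tests live in scipy (Barnard's and Boschloo's require scipy >= 1.7). A quick sketch on an illustrative 2x2 table:

```python
# Exact tests on a small 2x2 contingency table.
from scipy.stats import fisher_exact, barnard_exact, boschloo_exact

# rows: group A / group B, columns: success / failure (illustrative counts)
table = [[7, 3],
         [2, 8]]

odds_ratio, p_fisher = fisher_exact(table)
print(f"Fisher   p = {p_fisher:.4f}")
print(f"Barnard  p = {barnard_exact(table).pvalue:.4f}")
print(f"Boschloo p = {boschloo_exact(table).pvalue:.4f}")
```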
An implementation of the Binomial Test for Proportions from end to end. This test is the appropriate exact test for comparing an observed proportion of binary data against a baseline proportion (for example, the rate seen in an independent control group of a different size) in an A/B Testing scenario.
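A quick sketch with scipy's exact binomial test; the counts and baseline rate are illustrative:

```python
# Exact binomial test: observed conversions vs. a baseline rate.
from scipy.stats import binomtest

k, n = 14, 80          # conversions and trials in the variant (illustrative)
p_baseline = 0.10      # e.g., the control group's known conversion rate

result = binomtest(k, n, p=p_baseline, alternative="greater")
print(f"p = {result.pvalue:.4f}")
```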
Here's a walkthrough of 4 different flavors of Bayesian regression with inference, each built around a separate case study or scenario using synthetic data. This might be interesting for someone who is familiar with the concept of regression and has always wondered what the fuss is with Bayesian statistics. You'll see that while it might require the use of pymc, a library for Bayesian computation, the structure is very similar to the Frequentist approach. You might even find that inference with Bayesian statistics is more flexible and more insightful.
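To make the "very similar structure" claim concrete, here's a minimal Bayesian linear regression in PyMC on synthetic data; the priors and sample sizes are illustrative:

```python
# Minimal Bayesian linear regression sketch in PyMC.
import numpy as np
import pymc as pm
import arviz as az

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 + 1.5 * x + rng.normal(0, 0.5, size=100)  # known true coefficients

with pm.Model():
    intercept = pm.Normal("intercept", mu=0, sigma=10)
    slope = pm.Normal("slope", mu=0, sigma=10)
    sigma = pm.HalfNormal("sigma", sigma=1)
    pm.Normal("y_obs", mu=intercept + slope * x, sigma=sigma, observed=y)
    idata = pm.sample(1000, tune=1000)  # draw from the posterior

print(az.summary(idata))  # credible intervals instead of point estimates
```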
For this product Data Science project I'll explore the use of Bayesian Inference in A/B testing using the PyMC3 library. Using synthetic data, the idea behind the project will be to test 4 new playlist algorithms against the current algorithm. The metrics will focus on user interaction during the first selected song: the skip rate and the average time it took a user to skip the song.
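For one variant, the skip-rate comparison boils down to two Beta-Binomial models and a posterior difference. A minimal sketch in PyMC3-era syntax with illustrative counts (the full project also models time-to-skip):

```python
# Bayesian A/B sketch: does the variant lower the skip rate?
import pymc3 as pm

skips_control, n_control = 450, 1000   # illustrative counts
skips_variant, n_variant = 400, 1000

with pm.Model():
    p_control = pm.Beta("p_control", alpha=1, beta=1)   # flat priors
    p_variant = pm.Beta("p_variant", alpha=1, beta=1)
    pm.Binomial("obs_c", n=n_control, p=p_control, observed=skips_control)
    pm.Binomial("obs_v", n=n_variant, p=p_variant, observed=skips_variant)
    uplift = pm.Deterministic("uplift", p_control - p_variant)
    trace = pm.sample(2000, tune=1000, return_inferencedata=True)

# Posterior probability that the variant actually lowers the skip rate.
print(float((trace.posterior["uplift"] > 0).mean()))
```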
In this project I'll attempt to forecast Hawai'i Median Home Prices with the prophet library, and explore some intermediate features while doing so. I'll take a look at seasonality, changepoints, growth modes, anomaly omission, and prior scales in order to find a plausibly accurate home price forecast. And while this would typically be fairly straightforward, we'll see that the pandemic has given us some volatility that needs to be accounted for in order to find a nice fitting model.
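For a flavor of the knobs involved, here's a sketch of a Prophet fit using a couple of the levers mentioned above. The specific values are illustrative, not tuned, and `df` stands in for a frame with Prophet's expected `ds`/`y` columns:

```python
# Sketch: Prophet fit with a looser trend to absorb pandemic-era volatility.
from prophet import Prophet

m = Prophet(
    changepoint_prior_scale=0.5,   # more flexible trend (illustrative value)
    seasonality_prior_scale=5.0,   # dampened seasonality (illustrative value)
    yearly_seasonality=True,
)
m.fit(df)  # df: columns ds (date) and y (median home price)

future = m.make_future_dataframe(periods=24, freq="MS")  # 24 months ahead
forecast = m.predict(future)
m.plot(forecast)
```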
This will be a very short project where I'll forecast CO2 emissions recorded on top of Mauna Loa on the Big Island of Hawai'i using the prophet library in Python. While it won't be the most complex trend, I mostly wanted to forecast this data having lived on the Big Island for a handful of years, where I even walked right up to the lava flow a few times... I couldn't pass up the opportunity. Plus prophet is just so easy to set up and use; even tuning it is fairly straightforward. I'll probably circle back and do a more complicated forecast later on, but for now let's holo holo!
Some projects from my grad school days. Also, back when Twitter was Twitter!
These are kind of outdated. I'm pretty sure this was in the pre-GPT-2 days, before LLMs were a thing, or at least around that time. But I'm fairly confident I was the first person to publish an article on how to access the Spotify API for podcast data, as they had just released that endpoint.
These are pretty old! There are probably many modeling mistakes, programming bad practices, etc. I wouldn't follow these too closely, but I'm keeping them here because I still think they're interesting and they show growth.