Projects

C5i Consulting

Current — Data Scientist II

Apply predictive analytics and statistical modeling to survey data to identify the key drivers of outcomes.

Relevant skills:

  • Python
  • Hypothesis testing
  • Regression analysis (OLS, ordinal)
  • Relative weights analysis (RWA)
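For readers unfamiliar with relative weights analysis, here is a minimal sketch of the Johnson-style calculation using only NumPy. It is an illustration, not the code used at C5i: the function name `relative_weights` and the synthetic "survey" data are assumptions made up for this example.

```python
import numpy as np

def relative_weights(X, y):
    """Johnson-style relative weights for predictors X (n x p) and outcome y (n,)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Correlations among predictors (r_xx) and between predictors and outcome (r_xy).
    corr = np.corrcoef(np.column_stack([X, y]), rowvar=False)
    r_xx, r_xy = corr[:-1, :-1], corr[:-1, -1]
    # The symmetric square root of r_xx links each predictor to an
    # orthogonal counterpart.
    eigvals, eigvecs = np.linalg.eigh(r_xx)
    lam = eigvecs @ np.diag(np.sqrt(eigvals)) @ eigvecs.T
    # Standardized coefficients of the outcome on the orthogonal variables.
    beta = np.linalg.solve(lam, r_xy)
    # Raw weights: each predictor's share of explained variance (sums to R^2).
    raw = (lam ** 2) @ (beta ** 2)
    return raw, raw / raw.sum()

# Synthetic data (hypothetical): three correlated "survey items" and an outcome.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 3))
X_demo[:, 1] += 0.6 * X_demo[:, 0]          # make items 0 and 1 correlated
y_demo = 1.0 * X_demo[:, 0] + 0.5 * X_demo[:, 1] + rng.normal(size=500)

raw_w, rel_w = relative_weights(X_demo, y_demo)
print("raw weights:", raw_w)
print("share of R^2:", rel_w)
```

The raw weights sum to the R² of the corresponding OLS model, which is what makes them usable as a driver ranking when predictors are correlated.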

OurTownHall

2024 — Machine Learning Engineer

Apply predictive analytics and state-of-the-art Natural Language Processing techniques to improve both employee (micro) and company-wide (macro) infrastructure.

Relevant skills:

  • PyTorch
  • HuggingFace
  • spaCy
  • LoRA Adapters
  • Google Cloud Platform
  • Web scraping

Activision Data Science

2020-2024 — Data Scientist, People Analytics

Apply predictive analytics and state-of-the-art Natural Language Processing techniques to improve both employee (micro) and company-wide (macro) infrastructure.

Commonly used skills:

  • LSTM
  • Word vectorization
  • Text Processing
  • Multivariate regression/forecasting techniques such as ARIMA

Pro Cycling Stats App

Current — iOS

Design web scrapers and fully connected APIs to collect and display pro cycling statistics and schedules across various levels of professional road and gravel cycling.

Frameworks & concepts:

  • Python (data ingestion and preprocessing)
  • Xcode
  • SwiftUI
  • UIKit

Lane Detection

2021

Apply computer vision and image processing techniques to detect lane lines in footage from a forward-facing dash cam.

Techniques:

  • OpenCV
  • Hough transforms for crude line detection
  • Image segmentation using masking and polygon stenciling
  • Thresholding
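The techniques above can be sketched end to end in plain NumPy. This is an illustrative approximation, not the project's code: the project lists OpenCV, which provides its own routines for these steps (for example `cv2.threshold`, `cv2.fillPoly`, and `cv2.HoughLinesP`). The function names `threshold`, `triangle_roi`, and `hough_accumulator`, the cutoff value, and the synthetic frame are all assumptions.

```python
import numpy as np

def threshold(gray, cutoff=200):
    """Keep only bright pixels (lane paint is assumed brighter than asphalt)."""
    return gray >= cutoff

def triangle_roi(shape, apex_row_frac=0.5):
    """Boolean stencil: a triangle with its apex mid-frame and base on the bottom row."""
    h, w = shape
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    apex_row = apex_row_frac * (h - 1)
    # Half-width grows linearly from 0 at the apex to w/2 at the bottom row.
    frac = np.clip((rows - apex_row) / max(h - 1 - apex_row, 1e-9), 0, 1)
    return (rows >= apex_row) & (np.abs(cols - (w - 1) / 2) <= frac * (w / 2))

def hough_accumulator(binary, n_theta=180):
    """Vote in (rho, theta) space, using rho = x*cos(theta) + y*sin(theta)."""
    h, w = binary.shape
    thetas = np.deg2rad(np.arange(-90.0, 90.0, 180.0 / n_theta))
    diag = int(np.ceil(np.hypot(h, w)))
    ys, xs = np.nonzero(binary)
    rhos = xs[:, None] * np.cos(thetas) + ys[:, None] * np.sin(thetas)
    rho_idx = np.round(rhos).astype(int) + diag      # shift so indices are >= 0
    theta_idx = np.broadcast_to(np.arange(thetas.size), rho_idx.shape)
    acc = np.zeros((2 * diag + 1, thetas.size), dtype=int)
    np.add.at(acc, (rho_idx, theta_idx), 1)          # one vote per pixel per angle
    return acc, thetas, diag

# Synthetic frame (hypothetical): one bright vertical line at x = 50.
frame = np.zeros((100, 100), dtype=np.uint8)
frame[:, 50] = 255
binary = threshold(frame) & triangle_roi(frame.shape)
acc, thetas, diag = hough_accumulator(binary)
peak_rho_idx, peak_theta_idx = np.unravel_index(acc.argmax(), acc.shape)
rho, theta = peak_rho_idx - diag, thetas[peak_theta_idx]
print(f"strongest line: rho={rho}, theta={np.rad2deg(theta):.0f} degrees")
```

Masking before the Hough step keeps bright pixels outside the road area (sky, signs, oncoming headlights) from casting votes.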

F-MNIST Image Recognition

2021

The F-MNIST (Fashion-MNIST) dataset provides 60,000 grayscale training images of clothing, each 28x28 pixels.


Challenge

Use multi-class classification techniques to accurately classify images of clothing.

Techniques:

  • Stochastic gradient descent
  • K-Nearest-Neighbors analysis
  • Principal component analysis
  • Error analysis using ROC curves, AUC, and confusion matrices
  • K-fold cross-validation
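As a rough illustration of how several of these techniques fit together, the sketch below chains PCA, a K-nearest-neighbors classifier, and K-fold cross-validation in plain NumPy. It is not the project's code and does not load F-MNIST: all function names are hypothetical, and the data are two synthetic Gaussian blobs standing in for flattened image vectors.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and the top principal directions."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    return (X - mean) @ components.T

def knn_predict(X_train, y_train, X_test, k=5):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    return np.array([np.bincount(votes).argmax() for votes in y_train[nearest]])

def kfold_accuracy(X, y, n_splits=5, n_components=2, k=5, seed=0):
    """Mean held-out accuracy; PCA is refit on each training fold to avoid leakage."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_splits)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        mean, comps = pca_fit(X[train_idx], n_components)
        Z_train = pca_transform(X[train_idx], mean, comps)
        Z_test = pca_transform(X[test_idx], mean, comps)
        preds = knn_predict(Z_train, y[train_idx], Z_test, k)
        scores.append((preds == y[test_idx]).mean())
    return float(np.mean(scores))

# Synthetic stand-in data (hypothetical): two well-separated classes in 10 dimensions.
rng = np.random.default_rng(1)
X_demo = np.vstack([rng.normal(0.0, 1.0, size=(100, 10)),
                    rng.normal(5.0, 1.0, size=(100, 10))])
y_demo = np.repeat([0, 1], 100)
cv_accuracy = kfold_accuracy(X_demo, y_demo)
print("mean 5-fold accuracy:", cv_accuracy)
```

Fitting PCA inside each fold, rather than once on the full dataset, keeps the held-out images from influencing the projection.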

Regression Analysis for California Housing Market

2021

A 24,000-row, 13-column dataset of housing parameters, including median house price, median size, number of bedrooms, and median income.
Each sample is a district (block) of 20 to 400 homes, and each parameter is a district-level aggregate such as a median or average.

Challenge

Clean and analyze the data. Identify emerging trends in housing and explain why they may or may not persist.

Techniques included:

  • Correlation matrix
  • K-Nearest-Neighbors analysis
  • Principal component analysis
  • Automating data-cleanup using custom transformation pipelines
  • K-fold cross-validation
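One way a custom cleanup pipeline of this kind might look is sketched below in plain NumPy. The project's actual pipeline is not shown in the source, so the class names (`MedianImputer`, `Standardizer`, `CleanupPipeline`), the choice of steps, and the tiny sample array are assumptions for illustration only.

```python
import numpy as np

class MedianImputer:
    """Replace missing values (NaN) with the column median learned in fit()."""
    def fit(self, X):
        self.medians_ = np.nanmedian(X, axis=0)
        return self

    def transform(self, X):
        X = X.copy()
        rows, cols = np.nonzero(np.isnan(X))
        X[rows, cols] = self.medians_[cols]
        return X

class Standardizer:
    """Scale each column to zero mean and unit variance."""
    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        self.std_ = X.std(axis=0)
        self.std_[self.std_ == 0] = 1.0   # leave constant columns unscaled
        return self

    def transform(self, X):
        return (X - self.mean_) / self.std_

class CleanupPipeline:
    """Apply a list of fit/transform steps in order."""
    def __init__(self, steps):
        self.steps = steps

    def fit_transform(self, X):
        for step in self.steps:
            X = step.fit(X).transform(X)
        return X

    def transform(self, X):
        for step in self.steps:
            X = step.transform(X)
        return X

# Tiny made-up sample with missing entries (not the housing data).
raw = np.array([[1.0, 10.0],
                [2.0, np.nan],
                [3.0, 30.0],
                [np.nan, 40.0]])
pipeline = CleanupPipeline([MedianImputer(), Standardizer()])
clean = pipeline.fit_transform(raw)
corr = np.corrcoef(clean, rowvar=False)   # correlation matrix of cleaned columns
print(clean)
print(corr)
```

Because each step stores what it learned in `fit`, the same pipeline can later be applied to unseen rows with `pipeline.transform(...)` without recomputing medians or scales.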