Demo projects

Previous projects

01
Employee Retention Predictive Model
Machine Learning, Classification

Built a classification pipeline to predict employee attrition, identifying the key behavioral and structural drivers of retention risk. Designed for HR teams to intervene proactively before talent loss occurs.

  • Engineered a full scikit-learn pipeline covering preprocessing, cross-validation, and model evaluation
  • Applied feature importance analysis to surface the top predictors of employee turnover
  • Evaluated performance using ROC AUC and precision/recall tradeoffs to tune the model for practical use
  • Documented findings in a structured report suitable for both technical and non-technical stakeholders
Python, scikit-learn, pandas, Matplotlib, AUC-ROC, Jupyter Notebook
02
Large-Scale Crime Data Analysis
Data Analysis, 1M+ Records

End-to-end analysis of the LA Crime Dataset (1M+ records), from raw ingestion through statistical analysis to formatted stakeholder reports. Demonstrates scalable data processing and clear visual communication of findings.

  • Processed and cleaned a 1M+ record dataset using Python and pandas, handling nulls, inconsistent formats, and outliers
  • Applied statistical analysis and visualization to surface actionable crime trends and geographic patterns
  • Automated the full data cleaning and reporting pipeline, generating Excel/CSV outputs for non-technical audiences
  • Focused on making findings operationally useful, not just analytically interesting
Python, pandas, NumPy, Seaborn, Matplotlib, Excel Reporting, Jupyter Notebook
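As a rough illustration of the cleaning steps listed above, the sketch below handles nulls, inconsistent formats, and outliers on a tiny inline sample. The column names, the sample values, and the age range used as an outlier rule are all assumptions for the example and are not taken from the LA Crime Dataset or the project's code.

```python
import pandas as pd

# Tiny inline sample; column names and values are hypothetical.
raw = pd.DataFrame({
    "date_occurred": ["2021-01-05", "2021-01-05", "2021-02-10", "not recorded", "2021-03-01"],
    "area_name": [" Central", "central ", "HOLLYWOOD", "Hollywood", "Hollywood"],
    "victim_age": [34, 34, 29, 41, 250],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Inconsistent formats: trim whitespace and normalize case.
    out["area_name"] = out["area_name"].str.strip().str.title()
    # Unparseable or missing dates become NaT, then those rows are dropped.
    out["date_occurred"] = pd.to_datetime(out["date_occurred"], errors="coerce")
    out = out.dropna(subset=["date_occurred"])
    # Outliers: keep ages within a plausible range (an assumed rule).
    out = out[out["victim_age"].between(0, 110)]
    # Rows that became identical after normalization are collapsed.
    return out.drop_duplicates().reset_index(drop=True)


cleaned = clean(raw)

# Simple per-area summary, exported as CSV text for a non-technical audience.
summary = cleaned.groupby("area_name").size().rename("incidents").reset_index()
csv_text = summary.to_csv(index=False)
print(csv_text)
```

Calling `to_csv` with a file path instead of no argument would write the same summary to disk. An Excel export via `DataFrame.to_excel` works similarly but needs an extra engine such as openpyxl installed.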
03
Image Classification System
Computer Vision, Transfer Learning

Developed a custom image classification model using the FastAI framework and transfer learning, achieving strong accuracy on a limited training dataset. Demonstrates the practical application of modern CV techniques without requiring large-scale data.

  • Leveraged transfer learning from pretrained models to achieve high performance with a small custom dataset
  • Trained, validated, and evaluated the model with FastAI's high-level API and learning rate finder
  • Demonstrated that strong CV results are achievable outside of large research environments
Python, FastAI, PyTorch, Transfer Learning, Computer Vision
04
ML Model Benchmarking Suite
Machine Learning, Benchmarking

A systematic comparison of multiple classification and regression models on large-scale datasets, focused on quantifying real performance differences across algorithms rather than picking a winner by intuition.

  • Built and compared Logistic Regression, Decision Trees, and Random Forest on datasets exceeding 1M records
  • Applied cross-validation and feature selection to produce reliable, generalizable results
  • Evaluated with multiple metrics (accuracy, precision, recall, F1, AUC), choosing metrics appropriate to each problem type
  • Compiled findings into clear technical reports with visualizations for model comparison
Python, scikit-learn, Cross-Validation, Feature Selection, pandas, Matplotlib
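One way such a comparison might be set up is sketched below. It is an assumption-laden example and not the suite itself: the dataset comes from `make_classification` and is far smaller than the 1M+ records mentioned above, and the hyperparameters are arbitrary. It runs the three named model families through the same 5-fold cross-validation and collects the same set of metrics for each.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier

# Small synthetic binary classification problem (a stand-in for real data).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5, random_state=0
)

# Hyperparameters here are illustrative defaults, not tuned values.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}
metrics = ["accuracy", "precision", "recall", "f1", "roc_auc"]

# Every model sees the same folds and is scored on the same metrics.
results = {}
for name, model in models.items():
    cv = cross_validate(model, X, y, cv=5, scoring=metrics)
    results[name] = {m: float(cv[f"test_{m}"].mean()) for m in metrics}

for name, scores in results.items():
    print(name, {m: round(s, 3) for m, s in scores.items()})
```

Scoring every model with one `cross_validate` call per model keeps the folds and metric definitions identical, so differences in the table reflect the algorithms and not the evaluation setup.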
05
Firewall Log Analysis Automation
Automation, Security Analytics

Built Python-based automation to transform raw, unstructured firewall log data into clean, structured security reports, reducing manual analyst time and improving turnaround for client security teams at Nuvodia.

  • Developed scripts to parse, clean, and normalize messy raw log data at scale
  • Automated the pipeline from raw input to formatted client-ready report with no manual steps
  • Reduced manual analysis time significantly, freeing engineers for higher-value investigations
  • Deployed in a production environment, where it is actively used by client security teams
Python, pandas, Automation, Log Parsing, Excel/CSV, Data Pipelines
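The parse, normalize, and report flow described above might look something like the following sketch. The `key=value` log format, the field names, and the helper names `parse_line` and `build_report` are hypothetical, since real firewall log formats vary by vendor and the production scripts are not shown here. The example uses only the standard library to stay self-contained, whereas the project itself lists pandas.

```python
import csv
import io
from collections import Counter

# Hypothetical key=value log lines; real firewall formats vary by vendor.
RAW_LOG = """\
2024-03-01T10:15:02Z action=DENY src=10.0.0.5 dst=192.168.1.10 dport=22
2024-03-01T10:15:07Z action=allow src=10.0.0.8 dst=192.168.1.20 dport=443
corrupted line without fields
2024-03-01T10:16:44Z action=deny src=10.0.0.5 dst=192.168.1.11 dport=3389
"""

REPORT_FIELDS = ["timestamp", "action", "src", "dst", "dport"]


def parse_line(line):
    """Return a dict for a well-formed line, or None for a malformed one."""
    parts = line.split()
    if len(parts) < 2 or not all("=" in p for p in parts[1:]):
        return None
    record = {"timestamp": parts[0]}
    record.update(p.split("=", 1) for p in parts[1:])
    # Normalize inconsistent casing so DENY/deny count as the same action.
    record["action"] = record.get("action", "").upper()
    return record


def build_report(raw_text):
    """Parse raw log text into records, CSV report text, and a deny summary."""
    records = [r for r in map(parse_line, raw_text.splitlines()) if r is not None]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=REPORT_FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    denied_by_source = Counter(
        r.get("src", "unknown") for r in records if r["action"] == "DENY"
    )
    return records, buffer.getvalue(), denied_by_source


records, report_csv, denied_by_source = build_report(RAW_LOG)
print(report_csv)
print(denied_by_source.most_common(5))
```

Malformed lines are skipped in this sketch. A production pipeline would more likely count or log them so that data-quality problems stay visible in the report.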