🛠️ Projects & Experiments

A collection of AI tools and experiments I've built - from hackathon wins to weekend hacks.

Turbopuffer DataSink Connector for Ray Data

AI Infrastructure | Member of Technical Staff

current

Built and shipped a production-grade vector database connector to Ray OSS (PR #58910), enabling streaming writes from Ray Data pipelines to Turbopuffer. Optimized memory by using sort+slice instead of dictionary accumulation, for zero additional allocation.
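As an illustration of the sort+slice idea, here is a minimal sketch in plain Python. It is not the connector's code: the `namespace` key, the `group_spans` helper, and the sample rows are all hypothetical. Rows are sorted by key once, and each group is then described by a `(start, stop)` span instead of being collected into a per-key dictionary of lists.

```python
from itertools import groupby
from operator import itemgetter


def group_spans(rows, key):
    """Sort rows by `key` and return (ordered_rows, spans).

    Each span is (key_value, start, stop): the half-open index range of
    one group inside the sorted list. No per-key dict of lists is built.
    """
    get_key = itemgetter(key)
    ordered = sorted(rows, key=get_key)  # stable sort keeps row order within a group
    spans = []
    start = 0
    for value, run in groupby(ordered, key=get_key):
        length = sum(1 for _ in run)  # count the run without materializing it
        spans.append((value, start, start + length))
        start += length
    return ordered, spans


# Hypothetical rows; "namespace" is an assumed grouping column.
rows = [
    {"namespace": "b", "id": 1},
    {"namespace": "a", "id": 2},
    {"namespace": "b", "id": 3},
]
ordered, spans = group_spans(rows, "namespace")
for value, start, stop in spans:
    print(value, [row["id"] for row in ordered[start:stop]])
```

Slicing a Python list copies references, so this sketch only shows the control flow. With a columnar format such as Arrow, slices of a sorted table are views over the same buffers, which is presumably where the allocation savings come from.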

Unblocked Notion's migration to Anyscale, contributing to roughly 6x contract growth ($40K to $250K). Implemented column-oriented batching for 10x write performance. Fixed a PyArrow hash-order bug that could otherwise have caused silent data corruption.
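Column-oriented batching can be sketched as follows, assuming it means sending one list per column rather than one object per row. The helper names, the `id` and `vector` columns, and the batch size are hypothetical, and no Turbopuffer or Ray API is called here.

```python
def rows_to_columns(rows, columns):
    """Transpose a list of row dicts into one list of values per column."""
    return {name: [row.get(name) for row in rows] for name in columns}


def column_batches(rows, columns, batch_size):
    """Yield column-oriented payloads of at most `batch_size` rows each."""
    for start in range(0, len(rows), batch_size):
        yield rows_to_columns(rows[start:start + batch_size], columns)


# Hypothetical rows; a real pipeline would receive batches from Ray Data.
rows = [
    {"id": 1, "vector": [0.1, 0.2]},
    {"id": 2, "vector": [0.3, 0.4]},
    {"id": 3, "vector": [0.5, 0.6]},
]
batches = list(column_batches(rows, ["id", "vector"], batch_size=2))
for batch in batches:
    print(batch)
```

A columnar payload repeats each column name once per batch instead of once per row, which is one plausible source of the write speedup.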

Ray Data
PyArrow
Turbopuffer
Python
AWS

Petabyte-Scale Robotics Data Pipeline

Robotics | Member of Technical Staff

current

Designed a Ray Data pipeline for autonomous systems that processes 3+ petabytes of sensor data. The pipeline goes directly from MCAP to tensors with on-the-fly H265 decoding, eliminating multi-day ETL bottlenecks.

Enabled heterogeneous compute architecture (CPU data processing, GPU training) at 512-node H100 scale. Reduced S3 traffic and intermediate artifacts by streaming directly from raw sensor data.

Ray Data
MCAP
H265
PyTorch
H100 GPUs
Kubernetes

DemoDrive - AI Video Editor for DevRel

DevTools | Founder & CEO

past

Built AI-powered video editor from scratch with AI agents as first-class citizens. Created 120+ automated videos for 5 pilot customers.

Reduced content creation time by 70%. Secured 1 paid pilot (Whiterabbit.ai, Series C) and 2 design partners (E2B.dev, FireworksAI).

Django
React
Claude 3.5 Sonnet
Gemini Flash
Remotion
Playwright
ffmpeg

AI House Tour Video Generator

Real Estate | Solo Developer

past

Won the Luma AI Hackathon as the only solo participant. Built a tool that generates cinematic house tour videos from Zillow listings using AI.

Demonstrated end-to-end AI video generation pipeline producing 100+ videos from real estate data.

Luma AI
Python
Zillow API

ProoferX - AI Documentation Validator

DevTools | Tech Lead

past

Won the Code Interpreter 2.0 Hackathon by building an AI tool that tests the code in technical guides to make sure it actually works.

Found incomplete/incorrect code examples in OpenAI, Vite, and E2B docs. Saves developer time by automatically catching outdated documentation.

CrewAI
E2B Sandbox
Fireworks AI
LangChain

LazyPMs - AI Release Notes Generator

DevTools | Tech Lead

past

Won LangChain Factory Hackathon by building an agentic system that automates writing and tailoring software release notes for different stakeholders using a multi-agent architecture.

Created a solution that transforms sparse release notes into rich documentation tailored for different audiences (CEO, developers, downstream teams) through coordinated AI agents.

LangGraph
LangChain
Fireworks AI
GitHub API

KinConnect - AI Hackathon Team Matcher

Events | Tech Lead

past

Won the MongoDB GenAI Hackathon with an AI-powered tool that matches hackathon participants based on their profiles, skills, and interests.

Won $2,000 in Fireworks AI credits. Created a scalable solution using hybrid search (vector + keyword) for optimal matching, with development costs under $1.
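One common way to combine a vector ranking with a keyword ranking is reciprocal rank fusion (RRF). The sketch below uses RRF as an assumption; the project description does not say which fusion method KinConnect used, and the participant names and the constant `k=60` are illustrative.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best match first.

    Each id scores 1 / (k + rank) for every list it appears in, so ids
    that rank well in both vector and keyword search rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by id for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a vector query and a keyword query.
vector_hits = ["alice", "bob", "carol"]
keyword_hits = ["bob", "dave", "alice"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

RRF needs only rank positions, so the two searches do not have to produce comparable scores.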

Fireworks AI
MongoDB Atlas
FastAPI
LangChain

AI-Driven Personal Budget Assistant

Personal Finance | AI Engineer

past

Built an AI-driven application that automates the categorization of personal financial transactions. It uses large language models (LLMs) to classify transactions from their descriptions, enriched with context from Google search results.

The initiative aims to reduce manual categorization errors, enhance user experience by minimizing the need for manual input, and improve financial management efficiency.

Python
OpenAI API
Groq
LangSmith

Insurance Claims Processing with LLM RAG

Insurance | Tech Lead

past

Led 7-person team building LLM RAG system for life insurance policy processing. Built OCR ingestion pipeline using AWS Textract with LangChain integration.

Achieved 82% accuracy, reduced actuarial workflow dependencies by 64%, saving ~$5M annually.

AWS
Kubernetes
LangChain
FastAPI
React
SageMaker

Telecom Customer Retention Enhancement

Telecommunications | Tech Lead

past

Initiated and led the development of machine learning models that increased the customer win-back rate by 11%.

Enhanced customer retention strategies and set a new benchmark for predictive analytics in the telecommunications sector.

AWS Databricks
Airflow
dbx

Mining Operations Optimization

Mining | ML Tech Lead

past

Managed a team of 9 data engineers and machine learning experts to implement cutting-edge ML solutions that enhanced core mining processes.

Reduced mining carbon footprint by 4%. Implemented MLOps best practices for model versioning and deployment, involving 18 models in production retrained monthly.

Azure Databricks
Dagster
Azure
Snowflake
dbt
pyGAM

CPG Supply Chain Optimization

Consumer Packaged Goods | ML Tech Lead

past

Designed and implemented an org-wide data quality solution that significantly improved the reliability of the data products behind business decisions.

Increased revenue by 18% within a single quarter through predictive modeling of customer fulfillment rates.

Snowflake
dbt
Azure
AzureML

Data-Driven Transformation at SnapTravel

Travel and Hospitality | Engineering Manager - Data

past

Led a team of 12 to overhaul SnapTravel's data platform, integrating advanced analytics that supported strategic business decisions.

Helped pivot the company from growth-focused strategies to profitability in just three months during the challenging first quarter of the COVID-19 pandemic.

Airflow
dbt
Snowflake
Looker
DynamoDB
Spark

Athletigen Data Intelligence Platform

Health and Fitness | Founding Engineer

past

Founded and led the development of a data intelligence platform that integrates various data sources and is used by thousands of users.

Python
Spark
AWS (Redshift, Lambda, EC2, S3)
R
d3.js
MongoDB