Work - Projects

Project Briefs.

Projects are for proving something. Each entry starts with a brief: the problem, the hypothesis, the goals, the outcome, the tools used, and the notes that matter after the work is done.

All Projects

tooling

Foundation

RAG Drift Monitor

Semantic Drift Watcher

A monitoring layer that compares live retrieval output against stored intent baselines so drift shows up before the model starts sounding wrong.

Python, sentence-transformers (all-MiniLM-L6-v2), ChromaDB
Complete Case Study →
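The comparison against a baseline can be sketched in a few lines. This is a minimal illustration, not the project's code: it assumes embeddings have already been computed elsewhere (for example by all-MiniLM-L6-v2), and the names `drift_score`, `has_drifted`, and the 0.2 threshold are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def drift_score(baseline: Sequence[Sequence[float]],
                live: Sequence[Sequence[float]]) -> float:
    """1 - cosine similarity between the baseline centroid and the live centroid."""
    return 1.0 - cosine(centroid(baseline), centroid(live))


def has_drifted(baseline, live, threshold: float = 0.2) -> bool:
    """Flag drift when the score exceeds a threshold (0.2 is an assumed value)."""
    return drift_score(baseline, live) > threshold
```

Comparing centroids is only one possible choice; a per-query comparison would catch drift that averages out across a batch.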

tooling

Foundation

Model Routing Middleware

Cost-Aware Prompt Router

A FastAPI middleware layer that inspects task shape and routes requests to the lowest-cost model that can still do the job well.

FastAPI, Model routing, Cost telemetry
Complete Case Study →
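The routing decision itself can be sketched without the web layer. Everything below is an assumption made for illustration: the tier names, prices, keyword heuristics, and the `route` function are hypothetical, and the FastAPI wiring is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ModelTier:
    name: str                  # hypothetical model label
    cost_per_1k_tokens: float  # hypothetical price
    max_difficulty: int        # hardest task level this tier handles well


TIERS: List[ModelTier] = [
    ModelTier("small-model", 0.10, 1),
    ModelTier("medium-model", 1.00, 2),
    ModelTier("large-model", 10.00, 3),
]


def estimate_difficulty(prompt: str) -> int:
    """Crude task-shape heuristic: keywords and length map to a level 1..3."""
    text = prompt.lower()
    if any(k in text for k in ("prove", "refactor", "multi-step", "debug")):
        return 3
    if len(prompt.split()) > 200 or any(k in text for k in ("summarize", "explain")):
        return 2
    return 1


def route(prompt: str, tiers: List[ModelTier] = TIERS) -> ModelTier:
    """Pick the cheapest tier whose capability covers the estimated difficulty."""
    difficulty = estimate_difficulty(prompt)
    capable = [t for t in tiers if t.max_difficulty >= difficulty]
    if not capable:
        # No tier is rated for this task: fall back to the most capable one.
        return max(tiers, key=lambda t: t.max_difficulty)
    return min(capable, key=lambda t: t.cost_per_1k_tokens)
```

In a middleware setting, `route` would run before the upstream call, and the chosen tier plus its cost would be logged as telemetry.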

experiment

Foundation

Recursive Agent Loop

Patch-and-Retry Sandbox

A contained patch-and-retry loop where an agent observes its own failures, proposes changes, and keeps iterating until the test passes or the budget runs out.

Sandboxed execution, Traceback parsing, Patch loop orchestration
Complete Case Study →
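The loop shape described above can be sketched as follows. This is an illustration under stated assumptions: `run_candidate`, `patch_and_retry`, and `toy_patcher` are hypothetical names, the "agent" is a hard-coded stand-in, and `exec()` is used only to keep the example short. It is not a real sandbox.

```python
import traceback
from typing import Callable, Optional, Tuple


def run_candidate(source: str, check: str) -> Optional[str]:
    """Run the source, then the check. Return None on success, else the traceback.

    NOTE: exec() provides no isolation; a real setup would run this in a
    separate process or container.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)
        exec(check, namespace)
        return None
    except Exception:
        return traceback.format_exc()


def patch_and_retry(source: str, check: str,
                    propose_patch: Callable[[str, str], str],
                    budget: int = 3) -> Tuple[str, int, bool]:
    """Iterate until the check passes or the attempt budget is spent."""
    for attempt in range(1, budget + 1):
        error = run_candidate(source, check)
        if error is None:
            return source, attempt, True
        if attempt < budget:
            # Feed the observed failure back to the patch proposer.
            source = propose_patch(source, error)
    return source, budget, False


def toy_patcher(source: str, error: str) -> str:
    """Stand-in for an agent: applies one hard-coded fix on assertion failures."""
    if "AssertionError" in error:
        return source.replace("a - b", "a + b")
    return source


buggy = "def add(a, b):\n    return a - b\n"
check = "assert add(2, 3) == 5"
final_source, attempts, passed = patch_and_retry(buggy, check, toy_patcher)
```

The budget counts test runs, so a patch is only proposed when there is still an attempt left to verify it.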

prototype

Frontier

Hybrid Retrieval Pipeline

BM25 + Dense + Reranking

A retrieval pipeline that blends BM25, dense vectors, and a cross-encoder reranker so exact-match precision and semantic recall can work together instead of competing.

BM25 (Okapi, implemented from scratch), sentence-transformers (all-MiniLM-L6-v2), Cross-encoder reranking (ms-marco-MiniLM-L-6-v2)
Complete Case Study →
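A minimal sketch of the lexical half and one way to blend it with a dense ranking. The brief does not say how the scores are combined, so reciprocal rank fusion is used here purely as an assumed stand-in. The whitespace tokenizer, the `k1`, `b`, and `k` defaults, and the function names are also assumptions; the dense ranking is taken as an input and the cross-encoder rerank step is omitted.

```python
import math
from collections import Counter
from typing import List, Sequence


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer (a simplification)."""
    return text.lower().split()


def bm25_scores(query: str, docs: Sequence[str],
                k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Okapi BM25 score of each document for the query."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Non-negative IDF variant (adds 1 inside the log).
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = 1 - b + b * len(doc) / avgdl
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


def ranking(scores: Sequence[float]) -> List[int]:
    """Document indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def reciprocal_rank_fusion(rankings: Sequence[Sequence[int]], k: int = 60) -> List[int]:
    """Fuse several rankings: each list contributes 1 / (k + rank) per document."""
    fused: Counter = Counter()
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=lambda d: fused[d], reverse=True)
```

In a full pipeline, the fused top results would then be passed to the cross-encoder for a final rerank.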

prototype

Frontier

PII Redaction Proxy

Outbound PII Filter

A proxy layer that runs a regex pass for structured PII and a BERT-NER pass for unstructured entities, combining both before any outbound model call.

Regex filtering (8 structured entity patterns), BERT NER (dslim/bert-base-NER, CoNLL-2003), Two-pass detection with span deduplication
Complete Case Study →
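The two-pass idea with span deduplication can be sketched like this. It is an illustration only: the two regex patterns are simplified examples rather than the project's eight, the NER spans are passed in as plain tuples standing in for model output, and `merge_spans` and `redact` are hypothetical names.

```python
import re
from typing import Iterable, List, Tuple

Span = Tuple[int, int, str]  # (start, end, label)

# Two simplified, illustrative patterns for structured PII.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def regex_spans(text: str) -> List[Span]:
    """First pass: structured entities found by regex."""
    return [(m.start(), m.end(), label)
            for label, pattern in PATTERNS.items()
            for m in pattern.finditer(text)]


def merge_spans(spans: Iterable[Span]) -> List[Span]:
    """Deduplicate overlapping spans, keeping the label of the earliest/longest."""
    merged: List[Span] = []
    for start, end, label in sorted(spans, key=lambda s: (s[0], -(s[1] - s[0]))):
        if merged and start < merged[-1][1]:
            prev_start, prev_end, prev_label = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), prev_label)
        else:
            merged.append((start, end, label))
    return merged


def redact(text: str, ner_spans: Iterable[Span] = ()) -> str:
    """Combine regex spans with NER spans and replace each with a [LABEL] tag."""
    spans = merge_spans(regex_spans(text) + list(ner_spans))
    pieces, cursor = [], 0
    for start, end, label in spans:
        pieces.append(text[cursor:start])
        pieces.append(f"[{label}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)
```

Merging before redaction means an entity caught by both passes is replaced once, not twice.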

live system

Live System

Steve — Voice Agent Sandbox

Public Runtime + Recap Pipeline

A public voice-agent project with signed access, recap lookup, transcript handling, rate limiting, and a private regression harness behind the scenes.

ElevenLabs, Astro, Cloudflare
Live Demo + Iteration Case Study →

experiment

Local AI

Gemma on Metal

Local Gemma 4 + LoRA on Apple Silicon

A practical test of Gemma 4 on a 16 GB M2 MacBook: run the small edge model locally with Ollama and Metal, then map out a LoRA tuning path for private document and audio extraction.

Gemma 4, Ollama, Apple Silicon
Research + Prototype Case Study →