Show HN: Thaw – Git branch for a running LLM (fork agents, skip prefill)


I built thaw because forking an LLM agent is absurdly wasteful today. When an agent explores N branches (RL rollouts, best-of-N, parallel coding attempts), each branch re-runs prefill over the same shared context, so you pay for the same prompt N times. thaw snapshots a live inference session (weights, KV cache, scheduler state, and the prefix-hash table) and hydrates N children that diverge from the fork point without re-prefilling. It's `git branch` for a running model.
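
To make the cost argument concrete, here is a toy Python sketch of the idea. It is not thaw's API: `ToySession`, `prefill`, and `fork` are made-up names for illustration, the "KV cache" is just a list, and the only thing it models is how many prompt tokens get processed with and without forking.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ToySession:
    """Hypothetical stand-in for a live inference session (not thaw's API)."""
    kv_cache: list = field(default_factory=list)  # one fake entry per processed token
    prefill_tokens: int = 0                       # tokens this session processed itself

    def prefill(self, tokens):
        # Processing a token appends a fake KV entry and costs one unit of work.
        for tok in tokens:
            self.kv_cache.append(("kv", tok))
            self.prefill_tokens += 1

    def fork(self, n):
        # Give each of n children its own copy of the parent's cache.
        # Children inherit the cache but have done no prefill work themselves.
        return [ToySession(kv_cache=copy.deepcopy(self.kv_cache)) for _ in range(n)]


def cost_without_fork(prompt, n):
    # Baseline: every branch re-runs prefill over the shared prompt.
    branches = [ToySession() for _ in range(n)]
    for branch in branches:
        branch.prefill(prompt)
    return sum(branch.prefill_tokens for branch in branches)


def cost_with_fork(prompt, n):
    # Fork: prefill once in the parent, then branch from its state.
    parent = ToySession()
    parent.prefill(prompt)
    children = parent.fork(n)
    return parent.prefill_tokens + sum(c.prefill_tokens for c in children)


if __name__ == "__main__":
    prompt = list(range(1000))            # a 1000-token shared context
    print(cost_without_fork(prompt, 8))   # 8000
    print(cost_with_fork(prompt, 8))      # 1000
```

In this toy model the baseline scales as N times the prompt length, while the forked version pays for the prompt once. The `deepcopy` only keeps the example simple; it says nothing about how a real snapshot would be stored or shared.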