Show HN: I reduced LLM inference GPU calls by 94% using semantic routing
Hacker News Show AI
•
Generative AI
AI Hardware
On any ubuntu curl -fsSL -o ~/proof.sh bash ~/proof.sh Comments URL: Points: 2 # Comments: 1