STT -> LLM -> TTS pipeline

r/LocalLLaMA
Generative AI Open Source AI

Hey guys, I’m trying to learn about how to better create a STT LLM TTS pipeline. My current setup is running a 3090 on Ubuntu. I use llama.cpp to run Qwen 3.6 27B Q4 with pi-agent for tool calling, and I just run everything in the terminal, I haven’t really bothered with chat style front ends. I’m trying to figure out how the actual pipeline goes when using 3 models to process information like that.