Magic incantation to get llama-bench to work with MTP?
r/LocalLLaMA
It does not like anything I have tried, including the options that work with llama-server. Is it not built to work with speculative decoding? submitted by /u/jdchmiel