Wow! My MLX vs llama.cpp benchmark hit #9 on r/LocalLLaMA today. Did not expect that.
Takeaway: benchmark actual scenarios, do not rely on just the tok/s counter in your UI. Ran into a caching bug specific to Qwen 3.5 (35B-A3B) on MLX. Effective tokens/s is what we experience
#MLX #LlamaCpp #Qwen