If you run models locally and are still fuzzy on how quantization actually works, this 50-min screencast is the one.
Grad-level lecture, no paywalls, no fluff. PTQ, calibration, bit-width — all of it.
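The core PTQ idea in a nutshell (my own minimal sketch, not from the lecture): pick a scale during calibration, round weights to a low bit-width, and multiply back to recover approximate floats.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    # Calibration step: choose a scale so the largest |weight| maps to 127.
    scale = float(np.abs(w).max()) / 127.0
    # Round-to-nearest and clip into the int8 range.
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover approximate floats; error per element is at most ~scale/2.
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.03, 2.0], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
```

Real schemes (per-channel scales, zero-points, GPTQ-style error correction) get fancier, but this is the shape of it.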
🔗 reddit.com/r/LocalLLaMA/s/MsRkMjohOv
#Quantization #LocalLLM #llmcpp