Task Vectors Match Gradient Descent: Theory Improves Model Merging
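A study shows that after one epoch of fine-tuning, the task vector equals −η∇L (the negative loss gradient scaled by the learning rate η), linking task arithmetic to gradient descent. Vision benchmarks confirm that the first-epoch gradient dominates the fine-tuning trajectory, which supports merging models after a single epoch. Read more: getnews.me/task-vectors-match-gradi... #taskarithmetic #modelmerging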
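A minimal sketch of the identity, not the study's code. It assumes full-batch gradient descent, so that one epoch is a single update, and uses a toy least-squares loss. All names here (`loss_grad`, `theta_pre`, `theta_ft`, `eta`) and the random data are hypothetical.

```python
import numpy as np

def loss_grad(theta, X, y):
    """Gradient of the loss 0.5 * mean((X @ theta - y) ** 2) w.r.t. theta."""
    return X.T @ (X @ theta - y) / len(y)

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))     # toy inputs (made-up data)
y = rng.normal(size=32)          # toy targets (made-up data)
theta_pre = rng.normal(size=4)   # stand-in for the pretrained weights
eta = 0.1                        # learning rate

# Assumption: full-batch gradient descent, so one epoch is one update.
grad = loss_grad(theta_pre, X, y)
theta_ft = theta_pre - eta * grad

# Task vector: fine-tuned weights minus pretrained weights.
task_vector = theta_ft - theta_pre

# Under the assumption above, the task vector is the scaled negative gradient.
print(np.allclose(task_vector, -eta * grad))
```

With mini-batch updates or more than one epoch, the equality would in general hold only approximately, which is where the "first-epoch gradient dominates" claim comes in.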