#ModelMerging

@malikeh97.bsky.social

3 months ago

Excited to be at @neuripsconf.bsky.social next week co-presenting our tutorial: "Model Merging: Theory, Practice, and Applications" 🔥

Proud to do this with my PhD advisor, Colin Raffel, our postdoc research fellow @mcicc.bsky.social, and an incredible panel of speakers 💙

#NeurIPS2025 #ModelMerging


GetNews.me

@getnews-me.bsky.social

5 months ago

SyMerge: single-layer adaptation for synergistic model merging

SyMerge introduces a single-layer adaptation technique for synergistic model merging. Read more: getnews.me/symerge-single-layer-ada... #symerge #modelmerging #ai


GetNews.me

@getnews-me.bsky.social

5 months ago

Optimizer Noise Shapes Model Merging Success in Neural Networks

Effective noise scale—combining learning rate, weight decay, batch size and augmentation—predicts model‑merging success, with a non‑monotonic optimum. Read more: getnews.me/optimizer-noise-shapes-m... #modelmerging #effectivenoisescale #optimizers


GetNews.me

@getnews-me.bsky.social

5 months ago

Chain of Merges: New Layer‑wise Method Improves Model Fusion

Chain of Merges (CoM) merges weights layer‑by‑layer, updates activation stats to curb covariate shift, and hits state‑of‑the‑art results on several benchmarks. Read more: getnews.me/chain-of-merges-new-laye... #modelmerging #chainofmerges


GetNews.me

@getnews-me.bsky.social

5 months ago

Task Vectors Match Gradient Descent: Theory Improves Model Merging

A study shows that one epoch of fine‑tuning yields a task vector equal to –η∇L, linking task arithmetic to gradient descent. Vision benchmarks confirm that the first‑epoch gradient dominates, which enables merging. Read more: getnews.me/task-vectors-match-gradi... #taskarithmetic #modelmerging
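
The identity in the post can be checked in one line. This is only a sketch under an assumption not stated in the post: the epoch is a single full-batch gradient-descent step from the pretrained weights, with learning rate η. The symbols θ₀ (pretrained weights), θ₁ (weights after the step) and τ (task vector) are notation introduced here for illustration.

```latex
\begin{align}
\theta_1 &= \theta_0 - \eta \nabla L(\theta_0) \\
\tau &= \theta_1 - \theta_0 = -\eta \nabla L(\theta_0)
\end{align}
```

With several mini-batch steps per epoch the equality would hold only approximately, because later gradients are evaluated at already-updated weights.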


GetNews.me

@getnews-me.bsky.social

5 months ago

New Scaling Laws Reveal Predictable Gains from Model Merging in LLMs

Researchers unveiled a compact power-law scaling rule linking base model size and the number of experts (k), with gains falling off at roughly 1/k. The paper was submitted in September 2025. Read more: getnews.me/new-scaling-laws-reveal-... #modelmerging #scalinglaws #llms


GetNews.me

@getnews-me.bsky.social

5 months ago

Model Merging Increases Vulnerability to Adversarial Transfer Attacks

A recent study finds model merging provides little protection against transfer attacks, with a relative success rate over 95% across eight merging methods. Read more: getnews.me/model-merging-increases-... #modelmerging #adversarial


GetNews.me

@getnews-me.bsky.social

5 months ago

Model Merging Enables Tunable Reasoning Performance in LLMs

Model merging blends two pretrained LLMs via weight arithmetic, creating new models with tunable reasoning performance; the study found Pareto improvements in which the merged model beat a parent on both accuracy and token usage. Read more: getnews.me/model-merging-enables-tu... #modelmerging #llms


GetNews.me

@getnews-me.bsky.social

5 months ago

Model Merging Boosts Domain‑Specific Ad‑hoc Retrieval Effectiveness

Linear weight merging of a retrieval model with a domain‑specific one improves search in medicine and Japanese text, matching LoRA fine‑tuning even with limited data. Read more: getnews.me/model-merging-boosts-dom... #modelmerging #adhocretrieval
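
The linear weight merging described in this post can be sketched as a per-parameter interpolation. This is a minimal illustration, not the study's code: it assumes both checkpoints share one architecture, uses plain Python lists as a hypothetical stand-in for real weight tensors, and the names `linear_merge`, `alpha`, `general_model` and `domain_model` are made up here.

```python
from typing import Dict, List

# Hypothetical stand-in for a model state dict: parameter name -> flat weights.
Weights = Dict[str, List[float]]


def linear_merge(general: Weights, domain: Weights, alpha: float) -> Weights:
    """Interpolate two checkpoints that share one architecture.

    alpha = 0.0 returns the general model, alpha = 1.0 the domain model.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if general.keys() != domain.keys():
        raise ValueError("checkpoints must have identical parameter names")

    merged: Weights = {}
    for name, g_values in general.items():
        d_values = domain[name]
        if len(g_values) != len(d_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Weighted average of the two models, element by element.
        merged[name] = [
            (1.0 - alpha) * g + alpha * d for g, d in zip(g_values, d_values)
        ]
    return merged


# Toy checkpoints with two parameters each (illustrative values only).
general_model = {"encoder.weight": [1.0, 2.0], "encoder.bias": [0.0]}
domain_model = {"encoder.weight": [3.0, 6.0], "encoder.bias": [1.0]}

merged_model = linear_merge(general_model, domain_model, alpha=0.25)
print(merged_model)
```

In practice `alpha` would be chosen on held-out domain data; no gradient updates are involved, only arithmetic on stored weights.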


GetNews.me

@getnews-me.bsky.social

5 months ago

Evaluation Pipeline Connects Model Merging Behavior and Internals

Researchers merged Qwen2.5 models, then evaluated them on the MMLU benchmark and with probes of morphology and syntax, finding stronger linguistic knowledge despite middling benchmark scores. Read more: getnews.me/evaluation-pipeline-conn... #modelmerging #mmlu #probing


GetNews.me

@getnews-me.bsky.social

5 months ago

Privacy-Preserving Evolutionary Merging Improves Language Models

PriME merges language models with evolutionary algorithms, boosting task performance by up to 45% on the LaMP benchmark while cutting membership‑inference risk. Read more: getnews.me/privacy-preserving-evolu... #privacy #modelmerging


GetNews.me

@getnews-me.bsky.social

5 months ago

New Method Superposes Task‑Specific Features for Model Merging

A model‑merging technique superposes task‑specific features with transformation matrices, avoiding fine‑tuning. Benchmarks in NLP and vision showed accuracy exceeding heuristic averaging. Read more: getnews.me/new-method-superposes-ta... #modelmerging #ml


Nicole Hennig

@nic221.bsky.social

6 months ago

Text Shot: Model merging is a technique for integrating the knowledge of multiple specialized AI models into a single, more capable model. Instead of fine-tuning, which refines a single pre-trained model using new data, merging combines the parameters of several models simultaneously. This process can consolidate a wealth of knowledge into one asset without requiring expensive, gradient-based training or access to the original training data. For enterprise teams, this offers several practical advantages over traditional fine-tuning. In comments to VentureBeat, the paper’s authors said model merging is a gradient-free process that only requires forward passes, making it computationally cheaper than fine-tuning, which involves costly gradient updates. Merging also sidesteps the need for carefully balanced training data and mitigates the risk of “catastrophic forgetting,” where a model loses its original capabilities after learning a new task. The technique is especially powerful when…

How Sakana AI’s new evolutionary algorithm builds powerful AI models without expensive retraining venturebeat.com/ai/how-sakana-ais-new-ev... #AI #ModelMerging
