#ModelMerging

Latest posts tagged with #ModelMerging on Bluesky

Excited to be at @neuripsconf.bsky.social next week co-presenting our tutorial: "Model Merging: Theory, Practice, and Applications" 🔥

Proud to do this with my PhD advisor, Colin Raffel, our postdoc research fellow @mcicc.bsky.social, and an incredible panel of speakers 💙

#NeurIPS2025 #ModelMerging

SyMerge: single-layer adaptation for synergistic model merging

SyMerge introduces a single-layer adaptation technique for synergistic model merging. Read more: getnews.me/symerge-single-layer-ada... #symerge #modelmerging #ai

Optimizer Noise Shapes Model Merging Success in Neural Networks

Effective noise scale—combining learning rate, weight decay, batch size and augmentation—predicts model‑merging success, with a non‑monotonic optimum. Read more: getnews.me/optimizer-noise-shapes-m... #modelmerging #effectivenoisescale #optimizers

Chain of Merges: New Layer‑wise Method Improves Model Fusion

Chain of Merges (CoM) merges weights layer‑by‑layer, updates activation stats to curb covariate shift, and hits state‑of‑the‑art results on several benchmarks. Read more: getnews.me/chain-of-merges-new-laye... #modelmerging #chainofmerges

Task Vectors Match Gradient Descent: Theory Improves Model Merging

Study shows a one‑epoch fine‑tune yields a task vector equal to −η∇L, linking task arithmetic to gradient descent. Vision benchmarks confirm the first‑epoch gradient dominates, supporting merging. Read more: getnews.me/task-vectors-match-gradi... #taskarithmetic #modelmerging
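The task-vector idea behind this result can be illustrated with a minimal sketch (not the paper's exact method): a task vector is the fine-tuned weights minus the base weights, and merging adds scaled task vectors back onto the base. Toy dicts of NumPy arrays stand in for real model state dicts; the names and values are hypothetical.

```python
import numpy as np

def task_vector(base, finetuned):
    # tau = theta_finetuned - theta_base, computed per parameter tensor
    return {k: finetuned[k] - base[k] for k in base}

def merge(base, vectors, alpha=1.0):
    # theta_merged = theta_base + alpha * sum_i(tau_i)
    merged = {k: v.copy() for k, v in base.items()}
    for tau in vectors:
        for k in merged:
            merged[k] = merged[k] + alpha * tau[k]
    return merged

# Toy "fine-tunes" of a one-tensor model, each specializing one coordinate.
base = {"w": np.zeros(3)}
ft_a = {"w": np.array([1.0, 0.0, 0.0])}   # hypothetical task-A fine-tune
ft_b = {"w": np.array([0.0, 2.0, 0.0])}   # hypothetical task-B fine-tune
merged = merge(base, [task_vector(base, ft_a), task_vector(base, ft_b)])
# merged["w"] -> array([1., 2., 0.])
```

Under the paper's claim, each tau above would approximate −η∇L for its task, which is why simply adding them can transfer both behaviors.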

New Scaling Laws Reveal Predictable Gains from Model Merging in LLMs

Researchers unveiled a compact power-law scaling rule linking base model size and number of experts (k), showing that the marginal gain per expert shrinks roughly as 1/k. Paper submitted September 2025. Read more: getnews.me/new-scaling-laws-reveal-... #modelmerging #scalinglaws #llms

Model Merging Increases Vulnerability to Adversarial Transfer Attacks

A recent study finds model merging provides little protection against transfer attacks, with a relative success rate over 95% across eight merging methods. Read more: getnews.me/model-merging-increases-... #modelmerging #adversarial

Model Merging Enables Tunable Reasoning Performance in LLMs

Model merging blends two pretrained LLMs via weight arithmetic; the study found Pareto improvements, with merged models beating a parent on both accuracy and token usage. Read more: getnews.me/model-merging-enables-tu... #modelmerging #llms

Model Merging Boosts Domain‑Specific Ad‑hoc Retrieval Effectiveness

Linear weight merging of a retrieval model with a domain‑specific one improves search in medicine and Japanese text, matching LoRA fine‑tuning even with limited data. Read more: getnews.me/model-merging-boosts-dom... #modelmerging #adhocretrieval

Evaluation Pipeline Connects Model Merging Behavior and Internals

Researchers merged Qwen2.5 models, then tested them on the MMLU benchmark and probed morphology and syntax, finding stronger linguistic knowledge despite middling benchmark scores. Read more: getnews.me/evaluation-pipeline-conn... #modelmerging #mmlu #probing

Privacy-Preserving Evolutionary Merging Improves Language Models

PriME merges language models with evolutionary algorithms, boosting task performance by up to 45% on the LaMP benchmark while cutting membership‑inference risk. Read more: getnews.me/privacy-preserving-evolu... #privacy #modelmerging

New Method Superposes Task‑Specific Features for Model Merging

A model‑merging technique superposes task‑specific features with transformation matrices, avoiding fine‑tuning. Benchmarks in NLP and vision showed accuracy exceeding heuristic averaging. Read more: getnews.me/new-method-superposes-ta... #modelmerging #ml

Text Shot: Model merging is a technique for integrating the knowledge of multiple specialized AI models into a single, more capable model. Instead of fine-tuning, which refines a single pre-trained model using new data, merging combines the parameters of several models simultaneously. This process can consolidate a wealth of knowledge into one asset without requiring expensive, gradient-based training or access to the original training data.

For enterprise teams, this offers several practical advantages over traditional fine-tuning. In comments to VentureBeat, the paper’s authors said model merging is a gradient-free process that only requires forward passes, making it computationally cheaper than fine-tuning, which involves costly gradient updates. Merging also sidesteps the need for carefully balanced training data and mitigates the risk of “catastrophic forgetting,” where a model loses its original capabilities after learning a new task. The technique is especially powerful when…
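The gradient-free combination described above can be sketched with the simplest merging baseline, uniform parameter averaging (a "model soup"-style merge); real systems such as Sakana's use evolutionary search over merge recipes rather than a plain mean, so this is only an illustration, with toy NumPy dicts standing in for model state dicts.

```python
import numpy as np

def average_merge(models):
    # Average each parameter tensor across models sharing one architecture.
    # No gradients, no training data: only the weights themselves are used.
    return {k: np.mean([m[k] for m in models], axis=0) for k in models[0]}

# Hypothetical specialists whose knowledge we want in one model.
specialist_math = {"w": np.array([2.0, 0.0])}
specialist_code = {"w": np.array([0.0, 4.0])}
soup = average_merge([specialist_math, specialist_code])
# soup["w"] -> array([1., 2.])
```

After merging, only forward passes are needed to evaluate the combined model, which is the computational advantage the authors describe.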


How Sakana AI’s new evolutionary algorithm builds powerful AI models without expensive retraining venturebeat.com/ai/how-sakana-ais-new-ev... #AI #ModelMerging

Combining Large Models Unlocks New Levels Of Performance In AI Research Researchers explored large-scale model merging, showing that combining large instruction-tuned models improves performance and generalization across various tasks, even surpassing multitask-trained mo...

🔥🤖📈Combining Large Models Unlocks New Levels Of Performance In AI Research www.azoai.com/news/2024101... #AIresearch #modelmerging #scalability #generalization #largemodels #expertmodels #zeroshot #instructiontuning #multitasktraining


Collective AI intelligence: evolutionary algorithms can evolve language models further

#KünstlicheIntelligenz #artificialintelligence #KI #AI #FoundationModels #Evolution #ModelMerging #Sprachmodelle #EvolutionäreAlgorithmen

kinews24.de/kollektive-k...
