
Yuan Yin

@yuanyinnn

AI Research Scientist at Valeo.ai | prev. Sorbonne U | https://yuan-yin.github.io | posts in en/fr/zh

115 Followers · 134 Following · 10 Posts · Joined 23.11.2024

Latest posts by Yuan Yin @yuanyinnn

Preview: GitHub - valeoai/peft-ipa

8/IPA is just a first step. We believe that a deeper understanding of the feature space is important for unlocking better model adaptation. 💭

To try it out and find more details 👇:

arXiv: arxiv.org/abs/2509.04398
Code: github.com/valeoai/peft...

02.12.2025 11:11 👍 2 🔁 0 💬 0 📌 0

7/Across benchmarks, IPA consistently outperforms standard LoRA and DoRA.
📊 Commonsense Reasoning: +1.5 points avg accuracy.
🖼️ VTAB-1k (Vision): +2.3 points avg accuracy.
It is also robust at very low ranks (e.g., r=8) where standard LoRA fails.

02.12.2025 11:11 👍 1 🔁 0 💬 1 📌 0
Post image

6/Does it work? Yes. But the real win is parameter efficiency.

Because the IPA projection captures the feature space, we can freeze it during finetuning.

On Llama-2/3 and Qwen-2.5, IPA matches or surpasses full LoRA/DoRA tuning at rank=32 with 50% fewer trainable params. 📉⚡️
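
Back-of-the-envelope view of where that saving comes from, as a sketch (our own illustration for a square layer; the function name and sizes are hypothetical, not from the paper):

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int, freeze_down_proj: bool) -> int:
    """Count trainable adapter parameters for one linear layer (illustrative arithmetic)."""
    a_params = rank * d_in      # down-projection A
    b_params = d_out * rank     # up-projection B
    return b_params if freeze_down_proj else a_params + b_params

# Example: a square 4096x4096 projection at rank 32.
print(lora_trainable_params(4096, 4096, 32, freeze_down_proj=False))  # 262144 (A and B both trained)
print(lora_trainable_params(4096, 4096, 32, freeze_down_proj=True))   # 131072 (A frozen: 50% fewer)
```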

02.12.2025 11:11 👍 0 🔁 0 💬 1 📌 0

5/Training autoencoders per layer with backprop is expensive. So we use a classic, efficient method: Incremental PCA.

✅ Forward-only (no backprop)
✅ Streaming (no huge memory overhead)
✅ Fast (approx 10 mins for pretraining)

It creates a robust starting point for the adapter.
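
A minimal sketch of what this forward-only initialization could look like with scikit-learn's IncrementalPCA (the function name, batch layout, and rank below are our assumptions for illustration, not the released peft-ipa code):

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

def fit_input_projection(activation_batches, rank=32):
    """Fit a rank-r basis of a layer's input activations by streaming batches through Incremental PCA."""
    ipca = IncrementalPCA(n_components=rank)
    for acts in activation_batches:   # acts: (num_tokens, hidden_dim) array from a forward pass;
        ipca.partial_fit(acts)        # each batch needs at least `rank` rows; no backprop, no full buffer
    # components_ has shape (rank, hidden_dim); use it to initialize the frozen down-projection.
    return ipca.components_.astype(np.float32)
```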

02.12.2025 11:11 👍 0 🔁 0 💬 1 📌 0
Post image

4/We asked 🔎: What if a random projection isn't the best choice?

We introduce IPA.

The intuition 💡: The input projection should be reconstructive.

We train the projection to preserve the maximum information from the inputs (like an autoencoder) before the adaptation begins.
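
In our own notation (an illustrative formulation, not lifted from the paper): stack a layer's input activations into a matrix X with one row per token, and look for a rank-r down-projection A whose projections can reconstruct the inputs,

```latex
\min_{A \in \mathbb{R}^{r \times d}} \; \left\lVert X - X A^{\top} A \right\rVert_F^2, \qquad r \ll d .
```

With orthonormal rows of A, the optimum is spanned by the top-r principal components of the (centered) activations, which is exactly the quantity the Incremental PCA step in post 5 estimates.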

02.12.2025 11:11 👍 0 🔁 0 💬 1 📌 0
Post image

3/We visualized this by training LoRAs on different tasks with the same init.

The heatmaps tell the story:
1️⃣ Matrix A (left) stays close to its random init
2️⃣ Matrix B (right) adapts to capture task-specific variance

Takeaway: LoRA essentially relies on a fixed, random projection.
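
One way to quantify that drift per layer (a hedged sketch with our own metric choice; the heatmaps in the post may use a different statistic):

```python
import torch
import torch.nn.functional as F

def adapter_drift(A_init, A_trained, B_trained):
    """How far each LoRA matrix moved during finetuning (illustrative diagnostic)."""
    # A starts random: cosine similarity close to 1 means it barely moved.
    cos_A = F.cosine_similarity(A_init.flatten(), A_trained.flatten(), dim=0).item()
    # B starts at zero, so measure how much mass it accumulated instead.
    norm_B = torch.linalg.norm(B_trained).item()
    return cos_A, norm_B
```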

02.12.2025 11:11 👍 0 🔁 0 💬 1 📌 0
Post image

2/Standard LoRA decomposes updates into two matrices: A (down-projection) and B (up-projection).

Typically at init, A is random and B is zero.

We found a major asymmetry: during training, A remains close to init, while B absorbs almost all the task-specific adaptation.
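
For context, a minimal PyTorch-style sketch of that decomposition (class name, rank, scaling, and init scale are illustrative assumptions, not the paper's or the PEFT library's code):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update B @ A (illustrative sketch)."""
    def __init__(self, base: nn.Linear, rank: int = 32, alpha: float = 32.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                      # pretrained weights stay frozen
        # A: small random init (down-projection); B: zeros (up-projection),
        # so the update B @ A is exactly zero when finetuning starts.
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T
```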

02.12.2025 11:11 👍 1 🔁 0 💬 1 📌 0

1/Serve your PEFT with a fresh IPA! 🍺
Finetuning large models is cheaper thanks to LoRA, but is its random init optimal? 🤔
Meet IPA: a feature-aware alternative to random projections
#NeurIPS2025 WS #CCFM Oral+Best Paper
Work w/
S. Venkataramanan @tuanhungvu.bsky.social @abursuc.bsky.social M. Cord
🧵

02.12.2025 11:11 👍 12 🔁 2 💬 1 📌 2

Personally, 2024 was a year marked by hope, perplexity, and, finally, happiness. It is now time to further consolidate my path in France. Cheers for 2025.

11.01.2025 23:13 👍 5 🔁 0 💬 0 📌 0

So my Twitter account is reaching a point where it is no longer pushing research-related content from the people that I follow. Only clickbait content or hype. Time to log out.

10.01.2025 21:29 👍 2 🔁 0 💬 0 📌 0