They probably save the spoilers until the very end.
I haven’t read the whole thing yet, but I believe we can take something from the data curation part.
This is number 1 on the MTEB Leaderboard, but its VRAM requirement may be higher than its peers’ since the base model is gemma-2-9b-it.
The Phi-4 research team called the human-generated dataset “organic.” Indeed, it must be 100% pesticide-free.
The approach resembles MCTS decoding in the sense that the token contributing most to task success becomes more likely to be chosen. However, their approach is more direct: the optimal path is learned through the token probabilities themselves, so no additional search computation is needed at inference time.
After finding pivotal tokens, they created a synthetic preference dataset that uses (context + accepted token) as positives and (context + rejected token) as negatives.
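A minimal sketch of how such a preference pair might be assembled; the function name and dict layout are my assumptions (a common DPO-style format), not Phi-4’s actual pipeline.

```python
# Hypothetical sketch: turning one pivotal token into a DPO preference pair.
# Field names ("prompt"/"chosen"/"rejected") are an assumed convention.

def make_preference_pair(context, accepted_token, rejected_token):
    """Pair a pivotal token's good/bad continuations for preference training.

    The 'chosen' side appends the token that raises the success rate;
    the 'rejected' side appends the token that lowers it.
    """
    return {
        "prompt": context,
        "chosen": context + accepted_token,
        "rejected": context + rejected_token,
    }

pair = make_preference_pair(
    context="The derivative of x^2 is ",
    accepted_token="2x",
    rejected_token="x",
)
print(pair["chosen"])    # The derivative of x^2 is 2x
print(pair["rejected"])  # The derivative of x^2 is x
```

The key design point, as I read it, is that the two sides differ by exactly one token, so the preference signal is concentrated on the pivotal decision rather than a whole response.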
Notably, they used Pivotal Token Search (PTS) in their post-training. In PTS, the model focuses on specific tokens that raise or lower the success rate by a large margin. Such a token is called a “pivotal token.”
According to their technical report, Phi-4 takes a different approach than Marco-o1 does: careful data curation and enhanced DPO (post-training).
arxiv.org/abs/2412.08905
They say Phi-4 14B surpasses Qwen2.5 14B on math tests by a remarkable margin. Is it another o1-like model?
I hadn’t properly understood Marco-o1’s MCTS. Normally it requires three models: 1) a generation model, 2) an evaluation-function model, and 3) a reward model, but they seem to substitute the log probabilities of model 1 for both 2 and 3.
If this simplified architecture can match the original MCTS’s performance, that would be quite groundbreaking.
I’ll verify this later.
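The substitution described above can be sketched in a few lines; the scoring formula here (average log-probability of a candidate rollout) is my assumption about the general idea, not Marco-o1’s exact implementation.

```python
import math

# Minimal sketch: using the generator's own token log-probabilities as the
# value/reward signal for comparing MCTS rollouts, instead of running
# separate evaluation-function and reward models.

def sequence_score(token_logprobs):
    """Average log-probability of a rollout; higher means the generator
    is more confident in this continuation."""
    return sum(token_logprobs) / len(token_logprobs)

# Two hypothetical rollouts from the same search node:
rollout_a = [math.log(0.9), math.log(0.8)]  # confident continuation
rollout_b = [math.log(0.3), math.log(0.2)]  # uncertain continuation

# The search would prefer expanding the higher-scoring branch.
assert sequence_score(rollout_a) > sequence_score(rollout_b)
```

If this stand-in really tracks answer quality well enough, it removes two model calls per node, which is where the practical appeal lies.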
Translation: a Japanese contributor published third-party code for an MCTS decoder for transformers.
This decoding method was used in the o1-like model Marco-o1.
github.com/Hajime-Y/rea...
The code for Marco-o1’s Monte Carlo tree search hasn’t been released yet, but someone has implemented it independently.
github.com/Hajime-Y/rea...
I think the absence of an instant translation tool (like Twitter has) will be a wall between cross-lingual user communities.
Maybe I should write all my posts in English. (I believe Japanese Kagglers would be fine with it.)
I’ll just post in English then. Even among Japanese users, Kagglers would probably tolerate English.
The lack of a translation tool quietly hurts (the language barrier).
For now there’s a sense of camaraderie among those of us who gathered early on a minor service.
Someone introduced me to something called the Kaggle Starter Pack.
go.bsky.app/VFijtNt
If a few people start posting super-useful content only on Bluesky, more people will probably feel like at least creating an account.
I can’t yet tell whether Bluesky will gain a decent following or gradually fade away.
But if the big names migrated not just their accounts but where they actually post, a mass migration could happen easily.