_ - \. (@crumb)

my internet got quick enough for long enough to watch a video on youtube in 720p and that was exciting for me today

10.03.2026 18:07 👍 1 🔁 0 💬 0 📌 0

this is just a factory game

09.03.2026 03:18 👍 3 🔁 0 💬 0 📌 0

i love that i get to be autistic about this

08.03.2026 23:11 👍 0 🔁 0 💬 0 📌 0

ahh im so excited im having so much fun

08.03.2026 22:16 👍 0 🔁 0 💬 1 📌 0

i said this an hour ago and the pipeline is already finished and working, and the first run of the non-reasoning discriminator sweep on the updated data it needs isnt even done

08.03.2026 21:34 👍 0 🔁 0 💬 0 📌 0

genuinely playing toys

08.03.2026 21:28 👍 0 🔁 0 💬 0 📌 0

im also hoping with the shared backbone between the generator and discriminator in reasoning mode, after enough RL, the generator will be able to re-use the discriminator's patterns of thought to self-reflect and iterate during reasoning

08.03.2026 20:41 👍 1 🔁 0 💬 0 📌 0

i'll also be running some "normal" benchmarks here with this for the next tech report

08.03.2026 20:35 👍 0 🔁 0 💬 0 📌 0

guys i love data engineering

08.03.2026 20:31 👍 1 🔁 0 💬 1 📌 1

i am bootstrapping "general simulator that knows when to spend what amount of compute to simulate the thing i ask" from these assistants and it is annoying but it's also really fun it's like playing toys

08.03.2026 20:31 👍 3 🔁 0 💬 1 📌 0

i am just training models so i can do data engineering at this point. i am just training the data

08.03.2026 20:29 👍 0 🔁 0 💬 0 📌 0

idea here is that eventually i'll be able to let it decide when to use a scratchpad autonomously and train that behavior with RL

08.03.2026 20:18 👍 0 🔁 0 💬 0 📌 0

also updated prompt revision

08.03.2026 20:17 👍 1 🔁 0 💬 2 📌 0

the stage after this (clmr-3) will be using a reasoning discriminator instead, i have a complex synthetic data pipeline in mind for the warm start that should be pretty baller

08.03.2026 20:16 👍 0 🔁 0 💬 1 📌 1

im beefing up the discriminator in my setup with some bidirectional transformer blocks (maybe 8? for 0.5B extra params?) so that i can have a value function baseline for the generator that isn't as powerful as it

08.03.2026 20:15 👍 0 🔁 0 💬 1 📌 0
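
A rough sanity check on the "maybe 8 blocks for 0.5B extra params" estimate above. This is only a back-of-the-envelope sketch: it assumes standard transformer blocks with a 4x MLP expansion, and the width `d_model = 2048` is a hypothetical value, not one stated in the thread. The function names are made up for illustration.

```python
def block_params(d_model: int, mlp_ratio: int = 4) -> int:
    """Approximate weight count of one standard transformer block.

    Counts only the large matrices: 4 * d^2 for the attention projections
    (Q, K, V, output) and 2 * mlp_ratio * d^2 for a two-matrix MLP.
    Biases, norms, and gated-MLP variants are ignored.
    """
    attention = 4 * d_model * d_model
    mlp = 2 * mlp_ratio * d_model * d_model
    return attention + mlp


def extra_params(n_blocks: int, d_model: int) -> int:
    """Total approximate weights added by n_blocks extra blocks."""
    return n_blocks * block_params(d_model)


# d_model = 2048 is a hypothetical width, not a number from the thread
print(extra_params(8, 2048) / 1e9)  # roughly 0.4 (billion)
```

Under these assumptions 8 blocks land a little under the 0.5B guess; a wider model or a gated MLP would push the figure up.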

yea shit rocks for finetuning

08.03.2026 02:45 👍 0 🔁 0 💬 0 📌 0

ok im having fun again

06.03.2026 05:43 👍 0 🔁 0 💬 0 📌 0

i hope i can just keep pushing this and get a general simulator. just a base model + test time compute seems like exactly what i would like to have right now

06.03.2026 05:35 👍 1 🔁 0 💬 0 📌 0

i am an allen ai fangirl

06.03.2026 05:14 👍 0 🔁 0 💬 0 📌 0

also cleaning up prompt w/ synth data -w-

05.03.2026 20:11 👍 1 🔁 0 💬 0 📌 1

im gonna attempt after i finish with the next clmr

04.03.2026 23:29 👍 0 🔁 0 💬 0 📌 0

you can just make that setup right now by programmatically generating dynamic systems to influence

04.03.2026 23:29 👍 0 🔁 0 💬 1 📌 0

you can set the meta reasoning interval to every like 8 tokens for a huge inflation of test time compute

04.03.2026 23:10 👍 0 🔁 0 💬 0 📌 0
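
To put a number on "huge inflation", here is a small arithmetic sketch. It assumes one reading of the post: a reasoning segment of fixed length is generated once every `interval` output tokens. The function names and the 64-token reasoning length are hypothetical, chosen only for illustration.

```python
import math


def total_tokens(n_output: int, interval: int, reasoning_len: int) -> int:
    """Tokens generated when a reasoning segment of reasoning_len tokens
    is inserted once per `interval` output tokens (assumed scheme)."""
    n_meta_steps = math.ceil(n_output / interval)
    return n_output + n_meta_steps * reasoning_len


def inflation(n_output: int, interval: int, reasoning_len: int) -> float:
    """Ratio of total generated tokens to visible output tokens."""
    return total_tokens(n_output, interval, reasoning_len) / n_output


# hypothetical numbers: 512 output tokens, 64 reasoning tokens per meta step
print(inflation(512, 8, 64))   # interval of 8 tokens
print(inflation(512, 64, 64))  # interval of 64 tokens, for comparison
```

With these made-up numbers, dropping the interval from 64 to 8 tokens takes the multiplier from 2x to 9x.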

what i'm doing now is a text simulator with reasoning, the text it is simulating can be a reasoning chain! i feel like an idiot for not realizing this sooner

04.03.2026 23:09 👍 0 🔁 0 💬 1 📌 1

i love coding, i won't delegate that to agents. i gain such detailed maps of my mind acting as translator between high dimensional concepts and a computer; in the discrepancies between those two spaces i learn more about me. with agents you deal with a much more noisy signal

04.03.2026 22:43 👍 0 🔁 0 💬 0 📌 0

oh fuck

04.03.2026 22:21 👍 2 🔁 0 💬 0 📌 0

rn this
bsky.app/profile/crum...

04.03.2026 01:42 👍 1 🔁 0 💬 0 📌 0

i dont like the reasoning model but if you have a corpus of your own reasoning data it's easy to just train the base model and then let it rip in your RL

04.03.2026 01:42 👍 0 🔁 0 💬 0 📌 0

this thing is fuckin speedy

04.03.2026 00:29 👍 4 🔁 1 💬 1 📌 0

im really happy there's a 2b version of the latest qwen.. im gonna do so much with that

03.03.2026 21:06 👍 2 🔁 0 💬 1 📌 3
