I thought this was an interesting graphic
They did it all without Jira... Amazing
The mean of a distribution is the point that minimizes the expected squared distance to points drawn from that distribution
I've never thought of mean as an argmin before, but it's a neat framing!
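A quick numerical check of the argmin framing, as a minimal sketch: it relies on the identity E[(X - c)^2] = Var(X) + (E[X] - c)^2, so the loss is smallest at c = E[X]. The Gaussian sample, the seed, and the function name `mean_squared_distance` are illustrative assumptions, not anything from the post.

```python
import random


def mean_squared_distance(c, xs):
    """Average squared difference between a candidate point c and the samples xs."""
    return sum((x - c) ** 2 for x in xs) / len(xs)


# Assumed example data: 10,000 draws from a Gaussian with mean 3 and std 2.
random.seed(0)
xs = [random.gauss(3.0, 2.0) for _ in range(10_000)]
sample_mean = sum(xs) / len(xs)

# Compare the sample mean against a few nearby candidate points.
candidates = [sample_mean + d for d in (-1.0, -0.1, 0.0, 0.1, 1.0)]
best = min(candidates, key=lambda c: mean_squared_distance(c, xs))

for c in candidates:
    print(f"c = {c:.4f}  loss = {mean_squared_distance(c, xs):.4f}")
print("best candidate is the sample mean:", best == sample_mean)
```

Moving the candidate away from the sample mean by d raises the loss by d squared, which is why the mean wins against every shifted candidate.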
The top-rated ICLR paper (relight) is amazing.
IC-Light 2 is also out on GitHub
github.com/lllyasviel/I...
It's based on the FLUX suite of models and has stunning results
good workflow
prompt r1-preview -> refine
copy all reasoning traces to claude -> prompt again
copy output and original prompt to o1-preview -> verify
This essentially solves every problem I've thrown at it, from linguistics to mathematics.
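The three steps above could be wired together roughly like this. It is only a sketch: each model is passed in as a plain prompt-to-text callable, and the names `reason_refine_verify`, `reasoner`, `refiner`, `verifier` and the prompt wording are all hypothetical. No real client library or API is assumed.

```python
from typing import Callable, Dict

# Hypothetical interface: a model is anything that maps a prompt string to text.
Model = Callable[[str], str]


def reason_refine_verify(prompt: str, reasoner: Model, refiner: Model,
                         verifier: Model) -> Dict[str, str]:
    """Chain three models: reason, refine using the traces, then verify."""
    # Step 1: the reasoning model (r1-preview in the post) produces its traces.
    traces = reasoner(prompt)

    # Step 2: a second model (claude in the post) gets the traces and is prompted again.
    refined = refiner(
        f"Problem:\n{prompt}\n\nReasoning traces:\n{traces}\n\n"
        "Use these traces to produce an improved answer."
    )

    # Step 3: a third model (o1-preview in the post) checks the output
    # against the original prompt.
    verdict = verifier(
        f"Original prompt:\n{prompt}\n\nProposed answer:\n{refined}\n\n"
        "Verify this answer and point out any errors."
    )
    return {"traces": traces, "refined": refined, "verdict": verdict}
```

Keeping the models as injected callables means the same chain can be tried with whatever chat interface or API wrapper is at hand, including manual copy and paste.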
Genmo has released LoRA training capabilities for their generative video model Mochi
github.com/genmoai/moch...
Trains quickly on a single 80GB GPU.
I am anxious to get my hands on r1 and grok 3.
I've heard some big moves are coming in the first two weeks of December from OpenAI, Anthropic, and Gemini, but I'm more excited about these other two.
They feel meaningfully orthogonal in both approach and group dynamics
Yeah, I think that's because Gemini Live uses Gemini Flash, which is a weaker underlying model
Gemini Live is essentially just as good as Advanced Voice from OpenAI. And no one is talking about either
This is awesome stuff
Hopefully I'll have a little 4 page thing up soon-ish, holiday project
Wasserstein Expectation Maximization! Using OT distance in the M step and then a convergence proof
I've been noodling on a math problem since 2018 or so. I think I finally cracked it after a couple hours with r1-lite
Cool new paper from NVIDIA about a hybrid state space + attention model that performs extremely well as a small model. Their 1.5B model even outperforms Llama 3.2 3B
arxiv: arxiv.org/abs/2411.13676
Great list!!
Inference Scaling Laws of DeepSeek-R1-Lite-Preview
Longer Reasoning, Better Performance. DeepSeek-R1-Lite-Preview shows steady score improvements on AIME as thought length increases.
Impressive Results of DeepSeek-R1-Lite-Preview Across Benchmarks!
DeepSeek-R1-Lite-Preview is DeepSeek's answer to o1.
o1-preview-level performance on AIME & MATH benchmarks.
Transparent thought process in real-time.
Open-source models & API coming soon!
Try it now at chat.deepseek.com
Fun probability fact: the probability that two randomly chosen integers are coprime is 6/π² ≈ 61%!
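A small Monte Carlo check of that figure, as a sketch. "Randomly drawn" is taken here to mean uniform over 1..N for a large N, with the 61% being the limit as N grows. The function name `coprime_fraction`, the sample size, the upper bound, and the seed are illustrative assumptions.

```python
import math
import random


def coprime_fraction(n_pairs, upper, seed=0):
    """Estimate the share of pairs (a, b), drawn uniformly from 1..upper, with gcd 1."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_pairs):
        a = rng.randint(1, upper)
        b = rng.randint(1, upper)
        if math.gcd(a, b) == 1:
            hits += 1
    return hits / n_pairs


estimate = coprime_fraction(200_000, 10**6)
limit = 6 / math.pi ** 2  # the limiting density of coprime pairs

print(f"simulated: {estimate:.4f}")
print(f"6 / pi^2:  {limit:.4f}")
```

With 200,000 pairs the sampling error is on the order of 0.001, so the simulated fraction should land close to the limiting value.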
I have nothing to say. Just enjoy this validation loss curve for a moment
Where are my AI friends at?
Are there turnkey machine shops?
Just pay $ and get an automated, garage-sized workshop?
When deep learning startups exit:
Marble floors in Monaco glass
Wrist so frozen, yeah it's built to last
Future vision through a tinted mask
Private hangars where I count my stash
Every move calculated like math
Pull up in that Phantom, tinted glass
Stack them queries deep with this KV cash
my favorite phrase to hear when interviewing scientists
"and this is the point where I would ask claude ..."
Hello world