Ready for #NeurIPS2025!
Hit me up to chat about AI-driven discovery/optimization and evolutionary coding agents
Excited to be in San Diego for #NeurIPS2025 next week to present our vision where models evolve continuously alongside the problems they solve 🧬
If you're into AI-driven discovery/optimization, self-improving agents, and open-endedness, let's connect!
We believe CAKE is just a slice of a bigger future where models evolve continuously alongside the problems they solve 🧬
Looking forward to presenting this work in San Diego this December!
📄 Paper: alphaxiv.org/abs/2509.179...
💻 Code: github.com/richardcsuwa...
Beyond BO, CAKE is a universal framework for adaptive kernel design that can be easily extended to other kernel-based methods, including:
- Support vector machines
- Kernel PCA
- Metric learning
Wherever kernels encode assumptions, CAKE can help them learn from context!
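For instance, a kernel produced by such an evolution could be dropped straight into kernel PCA. A minimal NumPy sketch, where the composed RBF-times-periodic kernel is just an illustrative example of the kind of expression an evolutionary search might produce (not one actually evolved by CAKE):

```python
import numpy as np

def composed_kernel(X, Y, ls=1.0, period=3.0):
    # Example composed kernel: RBF * periodic, an expression of the kind
    # an evolutionary search over kernel structures might produce.
    d = np.linalg.norm(X[:, None] - Y[None, :], axis=-1)
    rbf = np.exp(-0.5 * (d / ls) ** 2)
    per = np.exp(-2.0 * np.sin(np.pi * d / period) ** 2)
    return rbf * per

def kernel_pca(X, kernel, n_components=2):
    K = kernel(X, X)
    # Center the kernel matrix in feature space.
    n = K.shape[0]
    one = np.ones((n, n)) / n
    Kc = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]
    # Project onto the leading kernel principal components.
    return Kc @ (vecs[:, idx] / np.sqrt(np.maximum(vals[idx], 1e-12)))

X = np.random.default_rng(0).normal(size=(20, 3))
Z = kernel_pca(X, composed_kernel)
print(Z.shape)  # (20, 2)
```

Swapping `composed_kernel` for any other evolved expression is a one-line change, which is the point: the kernel method itself stays untouched.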
Our analysis also revealed that LLM-guided evolution consistently improves population fitness, significantly outperforming random recombination and traditional genetic algorithms
CAKE also excelled in the multi-objective setting:
- Achieved highest overall score and hypervolume for photonic chip design
- Demonstrated a tenfold speedup in finding high-quality solutions
On 60 HPOBench tasks, CAKE demonstrated superior performance:
- Consistently achieved the highest average test accuracy across all ML models
- Showed rapid early progress, achieving 67.5% of total improvement within 25% of the budget
1️⃣ How well the kernel explains the observed data (as measured by model fit)
2️⃣ How promising the kernel's proposed next query point is (as measured by acquisition value)
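A toy sketch of combining these two criteria; the scores and the rank-sum combination here are purely illustrative, not the paper's exact BAKER rule:

```python
def select_kernel(kernels, bic, acq):
    """Pick the kernel that jointly ranks best on model fit (BIC, lower is
    better) and acquisition value (higher is better). The rank-sum
    combination is an illustrative stand-in for BAKER's actual ranking."""
    by_bic = sorted(kernels, key=lambda k: bic[k])
    by_acq = sorted(kernels, key=lambda k: -acq[k])
    return min(kernels, key=lambda k: by_bic.index(k) + by_acq.index(k))

# Hypothetical scores for a small kernel pool.
kernels = ["RBF", "Matern", "RBF + Periodic"]
bic = {"RBF": 120.5, "Matern": 118.2, "RBF + Periodic": 119.0}  # model fit
acq = {"RBF": 0.3, "Matern": 0.2, "RBF + Periodic": 0.6}        # acquisition

print(select_kernel(kernels, bic, acq))  # "RBF + Periodic"
```

Here the composite kernel wins because it is near-best on both criteria, even though it tops neither list alone.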
🤔 If we have a pool of kernels, which kernel should guide the next query?
We propose BIC-Acquisition Kernel Ranking (BAKER) 👨‍🍳 to select the best kernel at each step by jointly optimizing two criteria:
CAKE works via an evolutionary process:
1️⃣ Initialize a population of base kernels
2️⃣ Score each kernel using a fitness function
3️⃣ Evolve kernels via LLM-driven crossover and mutation to generate new candidates
4️⃣ Select top-performing kernels for the next generation
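A structural sketch of those four steps, with a stand-in for the LLM operator; the fitness function and crossover rule here are placeholders, not the actual CAKE implementation:

```python
import random

# 1) Toy population of base kernels, represented as symbolic strings.
BASE_KERNELS = ["RBF", "Matern", "Linear", "Periodic"]

def fitness(kernel_expr):
    # Placeholder fitness: prefers shorter expressions, with a random tiebreak.
    # In CAKE this would measure how well a GP with this kernel fits the data.
    return -len(kernel_expr) + random.random()

def llm_propose(parent_a, parent_b):
    # Stand-in for the LLM-driven crossover/mutation step: here we just
    # combine two parent expressions with a random operator.
    op = random.choice(["+", "*"])
    return f"({parent_a} {op} {parent_b})"

def evolve(population, generations=3, pop_size=4):
    for _ in range(generations):
        # 2) Score each kernel with the fitness function.
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[:2]
        # 3) Generate new candidates via crossover/mutation.
        children = [llm_propose(*random.sample(parents, 2)) for _ in range(pop_size)]
        # 4) Keep the top performers for the next generation.
        population = sorted(parents + children, key=fitness, reverse=True)[:pop_size]
    return population

random.seed(0)
print(evolve(list(BASE_KERNELS)))  # four surviving kernel expressions
```

The real system replaces `llm_propose` with an LLM prompted on the kernels and observed data, which is what lets the recombination be context-aware rather than random.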
Rather than committing to a fixed kernel, CAKE uses LLMs as intelligent genetic operators to dynamically evolve the kernel as more data is observed during the optimization process
🤔 How do we design kernels that adapt to the observed data, especially when evaluations are expensive?
Our solution: Context-Aware Kernel Evolution (CAKE) 🍰
The efficiency of BO depends critically on the choice of the GP kernel, which encodes structural assumptions about the underlying objective
⚠️ A poor kernel choice can lead to biased exploration, slow convergence, and suboptimal solutions!
Fresh out of the oven: CAKE is accepted at #NeurIPS2025! 🎉
TL;DR: We introduce Context-Aware Kernel Evolution (CAKE) 🍰, an adaptive kernel design method that leverages LLMs as genetic operators to dynamically evolve Gaussian process (GP) kernels during Bayesian optimization (BO)
This shortcut works, until we need breakthroughs. From robotics to drug discovery to aligning LLMs, real progress demands intelligent exploration.
I wrote a blog post on why we need to re-center exploration in AI 👇
richardcsuwandi.github.io/blog/2025/ex...
We're training AI on everything that we know, but what about things that we don't know?
At #ICML2025, the "Exploration in AI Today (EXAIT)" Workshop sparked a crucial conversation: as AI systems grow more powerful, they're relying less on genuine exploration and more on curated human data.
I wrote a blog post diving into the world of open-ended AI, exploring how embracing open-endedness might help us break the limits of today's AI systems 👇
richardcsuwandi.github.io/blog/2025/op...
From inventing new musical genres to imagining life beyond our universe, we continuously push the boundaries of what's possible.
What if AI could be as endlessly creative as humans or even nature itself?
Most AI systems today follow the same predictable pattern: they're built for specific tasks and optimized for objectives rather than exploration.
Meanwhile, humans are an open-ended species, driven by curiosity and constantly questioning the unknown.
They found that if an AI agent can tackle complex, long-horizon tasks, it must have learned an internal world model, and we can even extract it just by observing the agent's behavior.
I wrote a blog post unpacking this groundbreaking paper and what it means for the future of AGI 👇
2 years ago, Ilya Sutskever made a bold prediction that large neural networks are learning world models through text 👇
Recently, a new paper by Google DeepMind provided compelling support for this idea.
AI that can improve itself: A deep dive into self-improving AI and the Darwin-Gödel Machine.
richardcsuwandi.github.io/blog/2025/dgm/
Excellent blog post by Richard Suwandi reviewing the Darwin Gödel Machine (DGM) and its future implications.
A deep dive into self-improving AI and the Darwin-Gödel Machine
https://richardcsuwandi.github.io/blog/2025/dgm/
https://news.ycombinator.com/item?id=44174856
But what if AI could learn and improve its own capabilities without human intervention? I wrote a blog post to explore this concept further and examine what it could mean for the future of AI 👇
richardcsuwandi.github.io/blog/2025/dgm/
This is the Achilles' heel of modern AI: like a car, no matter how well the engine is tuned and how skilled the driver is, it cannot change its body structure or engine type to adapt to a new track on its own.
Most AI systems today are stuck in a "cage" designed by humans.
They rely on fixed architectures crafted by engineers and lack the ability to evolve autonomously over time.
Feel free to check out the full paper here: ieeexplore.ieee.org/abstract/doc...
or on arXiv: arxiv.org/abs/2309.08201
We further present theoretical convergence guarantees for the learning framework, along with extensive experiments showcasing the superior prediction performance and efficiency of our proposed methods.
Our proposed kernel significantly reduces the number of hyper-parameters for optimization while maintaining good approximation capabilities, and our distributed learning framework improves training efficiency and data privacy through parallelization and collaborative learning.