arXiv: arxiv.org/abs/2603.107...
This paper showed up in my Paperzilla "Fast diffusion via flows/consistency" feed: paperzilla.ai/digest/ddce3...
Overall architecture of AlphaFlowTSE. Given a mixture waveform y and an enrollment utterance e, we compute complex STFT features and form the mixture feature Y and enrollment feature E (real/imaginary concatenation). During training, the backbone takes the current state feature zt; during inference we initialize z0 = Y. The enrollment feature is concatenated as a temporal prefix, yielding [E ∥ zt] (or [E ∥ z0] at inference), which is fed to the UDiT backbone. The backbone is conditioned via AdaLN on the absolute time t and the interval length r − t (with r = 1 at inference), and predicts the mean velocity for finite-interval transport, denoted uθ(t, r, [E ∥ zt]). One-step inference (NFE = 1) produces an estimated complex STFT Ŝ = (ŜRe, ŜIm), which is converted to the target waveform ŝ by iSTFT. The dashed module is an optional mixing-ratio predictor used only in the background-to-target ablation to predict the start coordinate.
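The one-step inference path described in the caption can be sketched in a few lines of NumPy. Everything here is a hypothetical stand-in (toy shapes, a random linear map instead of the actual UDiT backbone); only the control flow — start at z0 = Y, concatenate the enrollment prefix, take one Euler step with the predicted mean velocity — follows the figure:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy shapes: 2*F rows stack the real/imag STFT bins, T frames.
F, T_mix, T_enr = 4, 6, 3
Y = rng.standard_normal((2 * F, T_mix))   # mixture feature (z0 at inference)
E = rng.standard_normal((2 * F, T_enr))   # enrollment feature (temporal prefix)

def u_theta(t, r, x):
    """Stand-in for the UDiT backbone's mean-velocity output.

    The real model is conditioned on t and the interval length r - t via
    AdaLN; this dummy just applies a fixed random linear map so the
    sketch is runnable."""
    W = 0.1 * rng.standard_normal((x.shape[0], x.shape[0]))
    return W @ x

# One-step inference (NFE = 1): transport z0 = Y over the interval [t, r] = [0, 1].
t, r = 0.0, 1.0
z0 = Y
u = u_theta(t, r, np.concatenate([E, z0], axis=1))   # backbone sees [E || z0]
S_hat = z0 + (r - t) * u[:, T_enr:]                  # drop enrollment frames, one Euler step
# S_hat stacks the estimated (Re, Im) STFT; an iSTFT would give the waveform.
print(S_hat.shape)  # -> (8, 6)
```

The enrollment prefix is simply sliced off before the update, since the velocity is only applied to the mixture frames.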
Imagine a noisy group call where 3 people talk at once.
This paper builds a model that can focus on a single speaker (using a short voice sample) and extract that voice.
This cleanup results in better, faster audio transcription.
Summary and full paper 👇
#AudioML #SpeechToText
This paper showed up in my Paperzilla #openclaw feed: paperzilla.ai/digest/95413...
Paradigm Shift from GUI-Based Operating Systems to AgentOS with Multi-Agent Orchestration and Natural Language Interface.
Another fun paper showed up in my feed. A theoretical proposal for an OS for AI agents.
No more GUI, enter the NLDUI: the natural language-driven interface.
Details below 👇
And the docs are here: docs.paperzilla.ai/guides/feed-...
News article with all the details: paperzilla.ai/news/better-...
Screenshot of the Paperzilla paper feed workflow
Better research paper recommendations start with explicit user feedback.
So that is what we are building with Paperzilla.
Today, we launch the first phase of that. Read more in the article below.
arXiv: arxiv.org/abs/2602.242...
Source code: github.com/Axiomatic-AI...
This paper showed up in my Paperzilla #AI4Math feed:
paperzilla.ai/digest/0fbbf...
Minimal agent for theorem proving. The design focuses on three main aspects: iterative proof refinement, a memory system, and access to tools. A proposer agent writes Lean code to prove a given theorem. Then, a compiler verifies whether the proposed proof works. If so, it is double-checked by a reviewer agent to prevent any form of cheating. If the code does not compile, or the reviewer objects, the feedback is sent to the memory module, and the cycle starts over with the proposer refining its previous proof. In addition, the proposer can be given access to tools such as library search or web search, and may call them a fixed number of times before submitting its proposal.
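The propose → compile → review → memory loop above can be sketched as plain Python control flow. All four functions here are hypothetical stubs standing in for the real LLM agents and the Lean compiler; only the loop structure follows the paper's design:

```python
def propose(theorem, memory):
    """Stub for the proposer LLM: writes Lean code, guided by prior feedback."""
    return f"-- attempt {len(memory)}\ntheorem stub : True := trivial"

def compiles(lean_code):
    """Stub for the Lean compiler check (always succeeds here)."""
    return True, ""

def reviewer_accepts(theorem, lean_code):
    """Stub for the reviewer LLM: rejects obvious cheating, e.g. `sorry`."""
    return "sorry" not in lean_code

def prove(theorem, max_rounds=5):
    memory = []  # feedback accumulated from failed rounds
    for _ in range(max_rounds):
        proof = propose(theorem, memory)
        ok, error = compiles(proof)
        if ok and reviewer_accepts(theorem, proof):
            return proof  # compiled AND passed review
        memory.append(error or "reviewer rejected the proof")
    return None  # gave up after max_rounds refinements

print(prove("example : True") is not None)  # -> True
```

The key design point is that failure feedback (compiler errors or reviewer objections) is what drives the next refinement, rather than blind resampling.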
As Karpathy's autoresearch takes the spotlight, here's a comparable lightweight agent setup that punches above its weight via feedback loops and readily available LLMs:
ax-prover, a minimal agent for automated theorem proving.
Source code and paper 👇
Resources:
arXiv: arxiv.org/abs/2603.04454
Github repo: github.com/mmajurski/lm...
This paper showed up in my Paperzilla "RAG, Retrieval, and Semantic search" feed: paperzilla.ai/digest/d530c...
When RAG systems surface relevant information, LM performance can be enhanced by rewriting the initial query with retrieved context that, without providing the answer, supplies relevant background knowledge and direction.
New paper shows that rewriting a question into a clearer one — using retrieved, answer-free context — before the final LLM call meaningfully improves accuracy.
In practice, that means you can offload the rewrite step to a smaller, cheaper model and get better results at lower cost.
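The two-call pattern described above fits in a short sketch. `retrieve` and `llm` are hypothetical stand-ins for a real retriever and chat model; the point is the control flow — the cheap model rewrites the question using answer-free context, and only the clearer question reaches the final (larger) model:

```python
def retrieve(query, k=3):
    """Stub retriever: returns background passages, not the answer itself."""
    return ["Background passage about the topic."]

def llm(prompt):
    """Stub chat model: echoes the last line so the pipeline is runnable."""
    return "rewritten: " + prompt.splitlines()[-1]

def answer_with_rewrite(question):
    context = "\n".join(retrieve(question))
    # Step 1: a small, cheap model rewrites the question using the
    # retrieved background, explicitly told NOT to answer it.
    rewrite_prompt = (
        "Using this background (do NOT answer, just clarify the question):\n"
        f"{context}\n"
        f"Rewrite for clarity: {question}"
    )
    clearer_question = llm(rewrite_prompt)
    # Step 2: the final (larger) model answers the clearer question.
    return llm(f"Answer this question: {clearer_question}")

print(answer_with_rewrite("What causes the effect mentioned in the report?"))
```

Swapping in a real small model for step 1 and a large model for step 2 is where the cost savings come from.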
Paper + resources 👇
PS: let me know which preprint/open-access sources you are interested in, and I can add them.
Please have a look at the alternative to Google Scholar alerts I've built. It significantly reduces the number of papers you need to evaluate. Can be consumed as email digests, an RSS feed, or via API/MCP. paperzilla.ai
Paper: arxiv.org/abs/2603.014...
Repo: github.com/leonardodali...
Demo: huggingface.co/spaces/AI4Re...
Paperzilla found this paper for me, here's the summary: paperzilla.ai/digest/9cee0...
13 domain experts (PhDs, professors, industry researchers) rated SciDER 4.85/5 for "helpfulness" in reducing their workload and achieving data-grounded accuracy.
Also: An astrophysicist used SciDER to analyze the Kepler Exoplanet Dataset and achieved 98% F1 score on exoplanet detection.
Read on 👇
SciDER: Scientific Data-centric End-to-end Researcher. Flowchart of expert-based and LLM-based research lifecycle.
This project (paper, demo, and open-source repo) seems like a promising step toward having an AI scientist on your team.
The framework, SciDER, actually does science. Data → experiments → results. End-to-end.
Read on 👇
#ai4science
Just give it a try: paperzilla.ai and let me know if it is useful for you. If not, let me know as well :)
In Paperzilla, it starts with the user describing their research topic. That results in a project covering categories from multiple sources (e.g. both ChemRxiv and arXiv). The output is a feed that can be consumed by OpenClaw (or another agent), which then works with a narrow, relevant context.
paperzilla.ai/news/chemrxi... cc @openclaw-x.bsky.social
ChemRxiv coverage and a Paperzilla skill for OpenClaw
The weekly Paperzilla improvements are here: ChemRxiv coverage in beta and a Paperzilla skill for OpenClaw
Worlds are colliding!
Link to full news item in comment 👇
arXiv: arxiv.org/abs/2602.233...
Paperzilla found this paper for me and created this summary: paperzilla.ai/digest/df041...
They found that AI agents using simple keyword search tools can answer questions nearly as well as complex, expensive vector databases. So: cheaper, easier to maintain, and nearly as good.
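For a sense of how simple "simple keyword search" is, here is a plain BM25 scorer in stdlib Python — the kind of cheap lexical ranking the paper pits against vector databases. The toy corpus and parameters are illustrative, not from the paper:

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each doc against the query with classic BM25 over
    whitespace tokens. No embeddings, no index server required."""
    toks = [d.lower().split() for d in docs]
    N = len(docs)
    avgdl = sum(len(t) for t in toks) / N          # average doc length
    df = Counter(w for t in toks for w in set(t))  # document frequency
    scores = []
    for t in toks:
        tf = Counter(t)
        s = 0.0
        for w in query.lower().split():
            if w not in tf:
                continue
            idf = math.log(1 + (N - df[w] + 0.5) / (df[w] + 0.5))
            s += idf * tf[w] * (k1 + 1) / (
                tf[w] + k1 * (1 - b + b * len(t) / avgdl))
        scores.append(s)
    return scores

docs = ["the cat sat on the mat", "dogs chase cats", "quantum field theory"]
print(bm25_scores("cat mat", docs))  # highest score for the first doc
```

An agent can call a function like this in a loop — reformulating its query based on what came back — which is where much of the "agentic" gain over one-shot retrieval comes from.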
Paper & summary in the comments 👇
Comparison between RAG (red) and agent-based (blue) pipelines for document QnA
A while ago I found a paper showing that BM25 search sometimes beats RAG. Here is another paper, by @awscloud.bsky.social, showing that agentic keyword search is also often the better choice.
Read on 👇
Scholar-Skill: 21 Skills Organized by Research Stage
Interesting new paper: a scientist at @stonybrooku.bsky.social created scholar-skill, a 21-skill plugin for Claude Code, and used it to automate the social-science research pipeline. Pretty cool! Read the paper below 👇