For the past four years, I’ve been working on a topic that’s both fascinating and challenging to explain. In this post, I’ve tried to present The Operator Way — a paradigm for understanding dynamical processes — in plain, approachable terms.
pietronvll.github.io/the-operator...
09.01.2025 17:19
👍 7
🔁 0
💬 1
📌 1
By the time I finished working on this paper, I had more research questions than when I started. I take this fertility of ideas as a very good sign 😃. If you’re in Vancouver, consider checking it out. I’ll be at the West Ballroom A-D from 16:30 to 19:30, poster #6907
12.12.2024 17:19
👍 2
🔁 0
💬 0
📌 0
To flesh out this core idea, we developed a neat theoretical foundation that combines conditional mean embeddings and policy mirror descent. This foundation ultimately yields sample complexity results, highlighting the interplay between exploration and exploitation.
12.12.2024 17:19
👍 2
🔁 0
💬 1
📌 0
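For readers unfamiliar with policy mirror descent: a minimal tabular sketch of the KL-regularized update pi_{k+1}(a|s) ∝ pi_k(a|s) · exp(eta · Q_k(s, a)) is below. This is standard textbook PMD, not the paper's actual algorithm (which works through conditional mean embeddings); all names and numbers here are illustrative.

```python
import numpy as np

def pmd_step(pi, Q, eta):
    """One KL-regularized policy mirror descent step (tabular sketch).

    Implements pi_{k+1}(a|s) proportional to pi_k(a|s) * exp(eta * Q(s, a)).
    """
    logits = np.log(pi) + eta * Q
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    new_pi = np.exp(logits)
    return new_pi / new_pi.sum(axis=1, keepdims=True)

# Toy example: 3 states, 2 actions, uniform initial policy.
pi = np.full((3, 2), 0.5)
Q = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.5, 0.5]])
pi = pmd_step(pi, Q, eta=1.0)
# Each row still sums to 1; mass shifts toward the higher-Q action.
```

The multiplicative form is what makes the update "mirror" descent: it is a gradient step in the dual (log-probability) space, so the iterate stays a valid distribution by construction.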
The return is a (conditional) expected value, and we realized that there are now mature ML tools to model such expected values directly, avoiding the need to solve intermediate, more difficult problems.
12.12.2024 17:19
👍 2
🔁 0
💬 1
📌 0
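To illustrate "modeling the expected value directly" in the simplest possible setting: below is a hedged sketch where noisy Monte Carlo returns G are regressed on states s by least squares, estimating E[G | s] without first fitting a transition model or any other intermediate object. The data and the linear form E[G | s] = 2s + 1 are made up for the example and are not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: states s and noisy Monte Carlo returns G,
# with true conditional expectation E[G | s] = 2*s + 1.
n = 500
s = rng.uniform(-1.0, 1.0, size=(n, 1))
G = 2.0 * s[:, 0] + 1.0 + rng.normal(scale=0.1, size=n)

# Regress G on [s, 1] directly: this estimates E[G | s] itself,
# with no intermediate dynamics or density estimation step.
X = np.hstack([s, np.ones((n, 1))])
w, *_ = np.linalg.lstsq(X, G, rcond=None)

print(w)  # close to the true coefficients (2.0, 1.0)
```

The same "regress the target quantity directly" pattern scales up by swapping least squares for any modern conditional-expectation model.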
So, what’s all this fuss about? Reinforcement learning, in essence, is an optimization problem: we want to maximize returns.
12.12.2024 17:19
👍 2
🔁 0
💬 1
📌 0
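In standard notation (not taken verbatim from the paper), the optimization problem the post refers to is maximizing the expected discounted return over policies:

```latex
\max_{\pi} \; J(\pi)
  = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right],
  \qquad \gamma \in [0, 1),
```

where the expectation is over trajectories generated by following policy $\pi$.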
In his book “The Nature of Statistical Learning Theory”, V. Vapnik wrote:
“When solving a given problem, try to avoid solving a more general problem as an intermediate step”
12.12.2024 17:19
👍 8
🔁 3
💬 1
📌 0
Come check it out!!
10.12.2024 02:42
👍 2
🔁 1
💬 0
📌 0