#Reinforcementlearning

Latest posts tagged with #Reinforcementlearning on Bluesky

Posts tagged #Reinforcementlearning

These aren’t AI firms, they’re defense contractors. We can’t let them hide behind their models

From Gaza to Iran, the pattern is the same: precision weapons, chosen blindness, and dead children. The cost of failing to regulate AI warfare is already too high

AI warfare's cost is high with precision weapons, chosen blindness, and civilian casualties. The 'fog procedure' exemplifies this dangerous trend.
www.theguardian.com/us-news/ng-interactive/2...
#AI #AIethics #MachineLearning #ReinforcementLearning #AIS...

1 0 0 0

✨Two single author papers accepted to ICLR 2026!✨

Truly excited to present these results at #ICLR2026 !
@iclr-conf.bsky.social #ICLR26 #ReinforcementLearning

0 1 0 0

🚀 Google discovered:

AI agents learn to COOPERATE on their own when trained against diverse and unpredictable opponents!

#AI #GoogleAI #MultiAgent #ReinforcementLearning #LLM #AISystems

1 0 0 0


Google’s new research shows AI agents can team up and outsmart unpredictable opponents using standard RL and decentralized training. Curious how GRPO drives cooperative strategies? Dive in! #AIAgents #ReinforcementLearning #MultiAgentLearning

🔗 aidailypost.com/news/google-...

0 0 0 0
16 Open-Source RL Libraries, One Shared GPU Bottleneck

A Hugging Face survey of 16 open-source reinforcement learning libraries finds the entire ecosystem has converged on async disaggregated training to fix a single brutal bottleneck: GPU idle time during long rollouts.

awesomeagents.ai/news/huggingface-async-r...

#HuggingFace #ReinforcementLearning #OpenSource

1 0 0 0

I discovered this thought-provoking paper about RoboPocket - a new way to boost robot learning with real-time feedback from your phone. No fancy gear needed! See link below. #robotics #reinforcementlearning #humantech
https://arxiv.org/abs/2603.05504

0 0 0 0

🚀 Check out "The AI That Learned to Play with Itself" — researchers let a neural network play a game against copies of itself! 🤖💥 It discovered strategies humans hadn’t thought of! Talk about self-improvement! 🔄 #AI #ReinforcementLearning #MindBlown

3 0 0 0

winbuzzer.com/2026/03/05/d...

New Databricks KARL RAG Agent Promises 33% Cost Reduction vs. Claude Opus 4.6

#AI #Databricks #DatabricksKARL #Anthropic #Claude #GenerativeAI #MachineLearning #AIAgents #EnterpriseAI #RAG #KARL #ReinforcementLearning

0 0 0 0
OpenAI VP Joins Anthropic After Pentagon Deal Backlash

OpenAI VP Max Schwarzer has joined Anthropic, citing trusted colleagues and shared values, hours after backlash over OpenAI's Pentagon military AI deal.

winbuzzer.com/2026/03/06/o...

OpenAI's Post Training Lead Max Schwarzer Joins Anthropic After Pentagon Deal Backlash

#AI #ChatGPT #Anthropic #Claude #OpenAI #MaxSchwarzer #Pentagon #ReinforcementLearning

0 0 0 0

Richard S. #KünstlicheIntelligenz #LernenausErfahrung #ReinforcementLearning #RichardSutton #Sprachmodelle
wahnsinnwissen.de/?p=1124

0 0 0 0
OpenClaw-RL Lets You Train a Personal AI Agent Just by Talking to It

Gen-Verse's new open-source framework uses asynchronous reinforcement learning to personalize LLMs through natural conversation - no labeling, no datasets, just feedback.

awesomeagents.ai/news/openclaw-rl-persona...

#Openclaw #ReinforcementLearning #OpenSource

2 0 2 0

✨Two single author papers accepted to ICLR 2026!✨

Truly excited to present these results at #ICLR2026 !

@iclr-conf.bsky.social #ICLR26 #DeepRL #ICLR #ReinforcementLearning

0 0 0 0

ADD uses diffusion + regret guidance to close the RL generalization gap.

An Environment Critic + CVaR makes the signal differentiable.

Result: 85% solved in Minigrid (+18% over SOTA).

ARC is ongoing — join us next time.

#MachineLearning #ReinforcementLearning #NeurIPS #ASU

0 0 0 0
Mini: Rock, Paper, Scissors

Rock, Paper, Scissors, Shoot! This 5-second two-player game for settling disputes began in ancient China and quickly spread throughout the world. Some researchers have also tried to use game theory to understand decisions in this game and were surprised by the results, but we weren't! You'll see why.

Join our supporters' club: www.patreon.com/wwdwwpodcast

Links and References:
- https://www.annarahmanan.com/the-history-of-rock-paper-scissors-game
- https://www.playworks.org/game-library/ro-sham-bo-or-rock-paper-scissors/
- https://www.tandfonline.com/doi/abs/10.1080/00107514.2015.1026556

📣 New Podcast! "Mini: Rock, Paper, Scissors" on @Spreaker #ancientchina #cyclicalcompetition #dei #gametheory #learningtheory #operantlearning #psychology #reinforcementlearning #rockpaperscissors #roshambo #rps #science #shoushling #skepticism #whywedowhatwedo #wwdwwdpodcast

0 0 0 0

More improvements to my AI locomotion. This time I trained it on randomly bumpy terrain, with random variation in the robot's weight, etc. The next step is testing it on the real robot!

#robot #machinelearning #reinforcementlearning

1 0 0 0

Spent the weekend trying to learn about #ReinforcementLearning by training an agent to play Xs & Os / Tic-Tac-Toe.
An unexpected side effect is that by playing dozens of games of Xs & Os over a 48-hour period, I have Stockholm syndromed myself into believing that it is the greatest game of all time

1 0 1 1

Boundary and handshake between Philosophy of Science, on one hand, and Science and Engineering (Geometric Manifold Rectification), on the other hand: Testing Bridge360 Metatheory Model v20.4 Handshake Version

agericomontecillodevilla.substack.com/p/boundary-a...

#ReinforcementLearning

0 0 0 0
Building a Production-Ready Reinforcement Learning System for Smart Energy Management in Sustainable

A production-ready reinforcement learning system for smart energy management, optimizing building energy consumption while maintaining occupant comfort. #reinforcementlearning

0 0 0 0

Advisory to developers to cut RL time and reduce “megadata” dependence: Embedding Bridge360 Metatheory Model

#ReinforcementLearning
#MachineLearning

agericomontecillodevilla.substack.com/p/advisory-t...

0 0 0 0

Finally got my RL-trained policy working in sim! This was trained using behavior cloning (from a manually constructed policy) followed by PPO. This video shows me using a gamepad to control the robot. The neural net is run using Rust's ndarray library.

#rust #ndarray #robots #reinforcementlearning

0 0 1 0
When robots learn to choose: how artificial intelligence improves navigation and operational performance - Digitalmente

Robots are steadily moving out of laboratories and into everyday spaces: factories, hospitals, warehouses, smart cities. To operate in these complex and changing environments, not ...

www.digitalmente.cloud/2026/02/17/r...

#RoboticaIntelligente, #IntelligenzaArtificiale, #DeepLearning, #ReinforcementLearning, #RobotAutonomi, #AIResearch, #Automazione

0 0 0 0

winbuzzer.com/2026/02/13/m...

MiniMax M2.5: Open-Source AI "Matches" Claude Opus at 1/20th Cost

#AI #MiniMax #MiniMaxM25 #OpenSourceAI #ChinaAI #MixtureOfExperts #MachineLearning #AIModels #ReinforcementLearning

1 1 0 0

#Term: #ReinforcementLearning (#Rl)

"Reinforcement Learning (RL) is a #MachineLearning method where an agent learns optimal behavior through trial-and-error interactions with an environment, aiming to maximize a cumulative #Reward signal over time." - Reinforcement Le...

https://with.ga/qvxm5

1 0 0 0
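
To make the quoted definition concrete, here is a minimal sketch of trial-and-error learning using tabular Q-learning. The five-cell corridor environment, the `step` function, and all hyperparameter values are hypothetical choices made for illustration. They do not come from the linked glossary entry.

```python
import random

N_STATES = 5      # corridor cells 0..4; cell 4 is the goal (terminal)
ACTIONS = (0, 1)  # 0 = step left, 1 = step right


def step(state, action):
    """Hypothetical toy environment: reward 1.0 only on reaching the goal."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train_q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by interacting with the corridor (assumed settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            best = max(q[state])
            greedy = [a for a in ACTIONS if q[state][a] == best]
            # Trial and error: explore with probability epsilon,
            # otherwise act greedily (ties broken at random).
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = rng.choice(greedy)
            nxt, reward, done = step(state, action)
            # Q-learning target: immediate reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action in each non-terminal cell."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train_q_learning()))
```

The agent is never told that the goal is on the right. It infers that from the reward signal, which is the "cumulative reward" part of the definition: each update pulls an action's value toward the immediate reward plus the discounted value of what follows.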

🧵[10/11]

If you're working on RL, MINTO is a simple modification that can make your training faster and more stable.

📄 Paper: arxiv.org/pdf/2510.02590
💻 Code: github.com/AhmedMagdyHe...
🌐 Website: minto.ahmedhendawy.de

🤝 Happy to discuss!

#ReinforcementLearning #ICLR2026 #DeepLearning

2 0 1 0
Elements of Reinforcement learning

Dropped Intro to Reinforcement Learning, a @pluralsight.bsky.social course on the fundamentals of the ML technique used to train LLMs, autonomous systems, etc. My master’s thesis was on ML, so this one feels special. Link in comments. #reinforcementlearning #machinelearning #llm #ai #softwaredev

0 0 1 0

Electric Atlas is insane 😳🦿
80–90 kg of pure control, powered by reinforcement learning at Boston Dynamics.
Gymnastics research → real factory deployment with Hyundai & DeepMind this year.
Future is moving.

#GroupifyAI #AI #Robotics #BostonDynamics #ReinforcementLearning #DeepMind #Automation

0 0 0 0

In this #InfoQ article, Hina Gandhi explores a #ReinforcementLearning (RL) approach built on #ApacheSpark, enabling distributed computing systems to autonomously learn optimal configurations.

📰 Read now: bit.ly/4thGGAf

#AI #bigdata #database #AIagents #InfoQ

1 1 0 0
Reinforcement Learning on Non-Euclidean Spaces: Swarms, Spheres, and Hyperbolic RL

Learn about stochastic policies using Bingham, spherical Cauchy, and hyperbolic latent representations. #reinforcementlearning

0 0 0 0