
Alasdair Paren

@alasdair-p

ML researcher @ University of Oxford

32
Followers
44
Following
2
Posts
20.11.2024
Joined

Latest posts by Alasdair Paren @alasdair-p

www.scientificamerican.com/article/hack...

New article by Deni Bechard at Scientific American covering our work on hijacking multimodal computer agents, published on arXiv earlier this year. A massive effort by Lukas Aichberger, supported by myself, Yarin Gal, Philip Torr (FREng, FRS) & Adel Bibi

04.09.2025 15:32 ๐Ÿ‘ 1 ๐Ÿ” 1 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0
eurips.cc A NeurIPS-endorsed conference in Europe held in Copenhagen, Denmark

NeurIPS is endorsing EurIPS, an independently-organized meeting which will offer researchers an opportunity to additionally present NeurIPS work in Europe concurrently with NeurIPS.

Read more in our blog post and on the EurIPS website:
blog.neurips.cc/2025/07/16/n...
eurips.cc

16.07.2025 22:05 ๐Ÿ‘ 124 ๐Ÿ” 38 ๐Ÿ’ฌ 2 ๐Ÿ“Œ 3

Excited to share our paper: "Chain-of-Thought Is Not Explainability"! We unpack a critical misconception in AI: models explaining their steps (CoT) aren't necessarily revealing their true reasoning. Spoiler: the transparency can be an illusion. (1/9) ๐Ÿงต

01.07.2025 15:41 ๐Ÿ‘ 83 ๐Ÿ” 31 ๐Ÿ’ฌ 2 ๐Ÿ“Œ 5
AI is becoming dangerous. Are we ready? (YouTube video by Sabine Hossenfelder)

Not every day you see a paper you worked on featured by a YouTube channel you've watched before :) youtu.be/KY7_ufxh_Rk?...

10.06.2025 17:52 ๐Ÿ‘ 2 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0
Shh, don't say that! Domain Certification in LLMs: a novel framework providing provable adversarial defenses for LLM safety.

Read more: cemde.github.io/Domain-Certi...

Thanks to my amazing collaborators:
- @alasdair-p.bsky.social, Preetham Arvind, @maximek3.bsky.social, Tom Rainforth, @philiptorr.bsky.social, @adelbibi.bsky.social at @ox.ac.uk
- Bernard Ghanem at KAUST
- Thomas Lukasiewicz at @tuwien.at.

(7/7)

04.04.2025 20:11 ๐Ÿ‘ 3 ๐Ÿ” 2 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0

โš ๏ธ Beware: Your AI assistant could be hijacked just by encountering a malicious image online!

Our latest research exposes critical security risks in AI assistants. An attacker can hijack them by simply posting an image on social media and waiting for it to be captured. [1/6] ๐Ÿงต

18.03.2025 18:25 ๐Ÿ‘ 8 ๐Ÿ” 8 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 3
Do we NEED International Collaboration for Safe AGI? Insights from Top AI Pioneers | IIA Davos 2025 (YouTube video by Imagination in Action)

A few weeks ago in Davos, Demis Hassabis highlighted the need to develop a "CERN for AGI" to ensure that advances at the frontier remain safe. I fully agree with him: we need this kind of international cooperation. youtu.be/U7t02Q6zfdc?...

19.02.2025 18:10 ๐Ÿ‘ 27 ๐Ÿ” 3 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0
Shh, don't say that! Domain Certification in LLMs: Foundation language models, such as Llama, are often deployed in constrained environments. For instance, a customer support bot may utilize a large language model (LLM) as its backbone due to the...

The amazing collaborators: Preetham Arvind, @alasdair-p.bsky.social, Maxime Kayser, Tom Rainforth, Thomas Lukasiewicz, Philip Torr, Adel Bibi.

A @oxfordtvg.bsky.social production.

(6/6)

Link to paper:
openreview.net/forum?id=brD...

14.12.2024 01:18 ๐Ÿ‘ 3 ๐Ÿ” 1 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0