New post! Solar storms are damaging and expensive, are a tail risk for catastrophic harm, and can be averted straightforwardly and cheaply (only we haven't done so).
www.lesswrong.com/posts/ghq9Ew...
The race to develop frontier AI is accelerating faster than safeguards can keep up, posing major risks to democracy and our societies.
I’m proud to add my voice to the growing movement of experts and organizations calling for a safer, more intentional path forward with AI.
humanstatement.org
Montréal AI safety, ethics, and governance newsletter, March 2026 edition
- Intl. AI Safety Report: risk mgmt still voluntary
- 5 Montréal AI safety events this month
- CIFAR puts $1M toward alignment research
- Local papers on interpretability & hallucinations
aisafetymontreal.org/newsletter/2...
Within the next year we will have superforecaster-level AI. Its predictions would spread through news, policy, planning, and markets. But LLMs are highly correlated, so their shared biases and correlated failures, such as systematic overconfidence, would propagate further into our collective epistemics.
Commentary: Anyone Else Have Those Weird Dreams Where Sobbing Future Generations Beg You To Change Course?
Ran the Qwen 3.5 MoE family (3B–17B active params) on 155 recent prediction questions from ForecastBench. None are well calibrated: overconfident when predicting near 100%, with many predictions clustered around 50% (hedging / low sharpness).
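For anyone who wants to run a similar check: a minimal calibration sketch with made-up numbers (not the actual ForecastBench results) — bin the predictions and compare each bin's mean predicted probability to its observed hit rate.

```python
# Toy calibration check for probabilistic forecasts (hypothetical data).
# A well-calibrated forecaster has mean prediction ≈ hit rate in every bin;
# overconfidence shows up as mean prediction > hit rate in the top bins.
preds    = [0.95, 0.97, 0.5, 0.5, 0.55, 0.9, 0.1, 0.45, 0.6, 0.99]
outcomes = [1,    0,    1,   0,   1,    1,   0,   0,    1,   1]

def calibration_bins(preds, outcomes, n_bins=5):
    """Group (prediction, outcome) pairs into equal-width probability bins."""
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(preds, outcomes):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, o))
    report = []
    for b in bins:
        if b:
            mean_p = sum(p for p, _ in b) / len(b)
            hit    = sum(o for _, o in b) / len(b)
            report.append((round(mean_p, 2), round(hit, 2), len(b)))
    return report  # (mean prediction, hit rate, count) per non-empty bin

print(calibration_bins(preds, outcomes))
```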
In 2006, DARPA ran a research program (HI-MEMS) on implanting electrodes into insects during metamorphosis, so that the developing tissue would integrate them, allowing remote control of their locomotion.
Another approach, possibly cleaner, is t-of-n threshold cryptography, where the PDS is one of the n shareholders but can never meet the threshold alone. Whenever a user wants to write to the PDS, their device co-signs.
FROST does this, and was standardized in 2024 as RFC 9591.
An active user might make hundreds of signed commits to a PDS in a session (posts, replies, likes, follows, etc.).
Self-hosting a PDS is inconvenient and unreliable relative to using specialized hosting services.
A path forward may be *short-lived delegated signing keys*, with the user owning the root keys.
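To make the t-of-n property concrete, here is a toy sketch using Shamir secret sharing over a prime field. Note the hedge: FROST (RFC 9591) is a threshold *signing* protocol that never reconstructs the key anywhere; this demo only illustrates why one shareholder (e.g. the PDS alone) can't reach the threshold.

```python
# Toy t-of-n demo via Shamir secret sharing (illustration only; FROST
# does threshold signing without ever reconstructing the secret).
import random

P = 2**127 - 1  # prime field modulus

def make_shares(secret, t, n):
    """Split `secret` into n shares; any t of them reconstruct it."""
    # degree t-1 polynomial with constant term = secret
    coeffs = [secret] + [random.randrange(1, P) for _ in range(t - 1)]
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x=0 recovers the constant term."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

secret = 123456789
shares = make_shares(secret, t=2, n=3)
assert reconstruct(shares[:2]) == secret   # any 2 of 3 shares suffice
assert reconstruct([shares[0]]) != secret  # one shareholder alone fails
```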
The AI public benefit corporations do have humanity in their stated duty. Unfortunately, what they actually target is "what is tolerable under American law".
All the other AI companies are traditional corporations, which structurally do not even target the public benefit.
The UN Secretary-General's High-level Advisory Body on Artificial Intelligence, established in 2023 with members from 33 countries, released its final report, "Governing AI for Humanity", in September 2024.
Its first recommendation was the creation of this Scientific Panel.
We need international red lines to prevent unacceptable AI risks.
Ban AI for lethal autonomous weapons, mass surveillance, nuclear command & control, bioweapon assistance, unsupervised control of critical infrastructure, disinformation, CSAM, social scoring, and recursive self-improvement R&D.
Out of curiosity, I asked Claude Opus about contemporary techniques vs this problem space. It created this web app comparing different methods, claude.ai/public/artif..., which you may find interesting.
early physics of the mind fire
The infamous METR graph is going vertical.
Current trends suggested ~8–9h time horizons, but instead we're seeing ~14.5h time horizons!
Based on this, I would project time horizons of ~2–3.5 workweeks by the end of the year (!!). That could have significant implications for the economy.
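For the curious, the projection above can be reproduced with a back-of-the-envelope exponential extrapolation. The ~3-month doubling time below is my assumption for illustration, not METR's own fit:

```python
# Rough extrapolation sketch (assumed doubling time, not METR's fit):
# if time horizons double every ~3 months, project forward from ~14.5h.
def horizon(h0_hours, months_ahead, doubling_months=3.0):
    """Exponential growth: h0 * 2^(months / doubling_time)."""
    return h0_hours * 2 ** (months_ahead / doubling_months)

h = horizon(14.5, 9)  # ~9 months from March to end of year
print(f"{h:.0f}h ≈ {h / 40:.1f} workweeks")  # 40h = one workweek
```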
Guaranteed Safe AI Seminars, March 2026:
Benchmarks for AI-assisted Formal Verification
By Theodore Ehrenborg, AI Safety researcher at the Beneficial AI Foundation and PIBBSS
Thursday, March 12, 1PM EST
luma.com/nk8ce7so
Montréal AI safety event, Tuesday March 3rd, 7 PM:
When Is a Human Actually “Overseeing” an AI System?
By @shalalehrismani.bsky.social postdoc at McGill+Mila, working on system safety, HCI, and the societal impact of AI, and executive director of the Open Roboethics Institute.
luma.com/7kugvplz
Montréal AI safety event, Tuesday Feb 24, 7 PM:
Rights Balancing: How the Future Rights of AI Workers will also Protect Human Rights
By Jonathan Simon, assistant professor of Philosophy at UdeM, and
Heather Alexander, human rights lawyer. Co-founders of @futureofcit.bsky.social.
luma.com/hcrp5nmu
In the web browser: code.claude.com/docs/en/chrome
For devs: platform.claude.com/docs/en/agen...
TIL macOS SSH has post-quantum key-exchange support but doesn't use it by default. Instead, it prefers `ecdh-sha2-nistp256`, which is not PQ.
So: use SSH, or WireGuard with a PSK, at least until github.com/tailscale/ta... is done.
OpenSSH 9.0+ has defaulted to post-quantum key exchange since 2022. WireGuard supports PQ via PSK, but it's off by default. Tailscale has the control plane to support this but doesn't. Probably most WireGuard traffic is subject to harvest-now, decrypt-later.
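If your client supports it but doesn't prefer it, you can opt in explicitly. A sketch of the `~/.ssh/config` change, using OpenSSH's hybrid PQ algorithm name (the server must offer it too; check your client with `ssh -Q kex | grep sntrup` first):

```
# ~/.ssh/config — prefer the hybrid post-quantum KEX when available,
# falling back to curve25519 for servers that don't offer it
Host *
    KexAlgorithms sntrup761x25519-sha512@openssh.com,curve25519-sha256
```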
What do you think Bluesky's roadmap is toward having Community Notes integrated officially and successfully?
At LawZero, we're rethinking the building blocks of frontier AI to create an intelligent machine that is both highly capable and safe-by-design. We’re excited to share our first blog post outlining some of the objectives and core components of our Scientist AI project. 🧵
(1/4)
The International AI Safety Report 2026 was launched today. Led by @yoshuabengio.bsky.social, the report offers the most comprehensive evidence-based assessment to date of AI capabilities, emerging risks, and safety measures.
Montréal AI safety event, Tuesday Feb 17, 7 PM:
What hackers talk about when they talk about AI: Early-stage diffusion of a cybercrime innovation
Talk by Benoît Dupont, Chair in Cyber-resilience, Human-Centric Cybersecurity Partnership director, and Criminology prof at UdeM.
luma.com/gifbf18i
For the Montréal AI safety community: let's meet on some Fridays for coworking + bouldering, starting this week.
Pour la communauté montréalaise de la sûreté de l'IA : retrouvons-nous certains vendredis pour du coworking et de l’escalade de bloc, commençant cette semaine.
luma.com/8nztwanh
Lancement de l'infolettre Montréal AI safety, ethics, governance. Mensuelle, sur les événements, opportunités, politique et recherche.
Launching the Montréal AI safety, ethics, governance newsletter. Monthly on events, opportunities, policy, research.
newsletter.aisafetymontreal.org/fevrier-2026/
Meanwhile, Europe has a feasibility study: ascend-horizon.eu/data-centres...