New paper on discretion in AI “alignment” — check out @maartenbuyl.bsky.social’s thread below!
9/n Full paper here: 🔗 arxiv.org/abs/2502.10441. Huge thanks to my amazing team of co-authors: @hadikh.bsky.social, @lucasmpaes.bsky.social, @claudiomv.bsky.social, @caiocvm.bsky.social, and @fcalmon.bsky.social. Work done at @harvard.edu
AI is built to “be helpful” or “avoid harm”, but which principles should it prioritize, and when? We call this alignment discretion. As Asimov's stories show, balancing such principles for AI behavior is tricky. In fact, we find that AI has its own set of priorities. (comic by @xkcd.com) 🧵👇
The standard practice in differential privacy of reporting a single ε at a small, fixed δ is extremely lossy for interpreting the level of privacy protection. For many real-world algorithms (e.g., DP-SGD), we can do much better!
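A minimal sketch (not the paper's method) of why a single (ε, δ) point is lossy: for the Gaussian mechanism, the exact privacy profile δ(ε) has a closed form (Balle & Wang, 2018), and reporting only the ε achieved at, say, δ = 10⁻⁵ throws away the rest of that curve. The noise level and sensitivity below are purely illustrative assumptions.

```python
# Toy illustration: full privacy profile delta(eps) of a Gaussian mechanism,
# versus the single (eps, delta) point usually reported.
from math import exp
from scipy.stats import norm

def gaussian_mechanism_delta(eps: float, sigma: float, sensitivity: float = 1.0) -> float:
    """Exact delta(eps) for the Gaussian mechanism (Balle & Wang, 2018), eps >= 0."""
    a = sensitivity / (2 * sigma)
    b = eps * sigma / sensitivity
    return norm.cdf(a - b) - exp(eps) * norm.cdf(-a - b)

sigma = 4.0  # hypothetical noise multiplier, chosen only for illustration
# Sweep eps to trace the whole trade-off curve, not just one point on it.
for eps in [0.25, 0.5, 1.0, 2.0, 4.0]:
    print(f"eps = {eps:4.2f} -> delta = {gaussian_mechanism_delta(eps, sigma):.2e}")
```

The same mechanism satisfies many (ε, δ) pairs at once; quoting one of them compresses the curve into a single point, which is the interpretability loss the post refers to.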
We show how in the #NeurIPS2024 paper:
arxiv.org/abs/2407.02191
Short summary👇
This is joint work with Felipe Gomez, Georgios Kaissis, @fcalmon.bsky.social, and @carmelatroncoso.bsky.social
Happy to chat about it online, and in 🇨🇦+🇺🇸 over the next two weeks:
- At the #NeurIPS2024 Friday Dec. 13 evening poster session.
- I'll also present it in more detail on Tuesday, Dec. 17 at Harvard.