AI Watch

Welcome! This is a website to track people and organizations working on AI safety. See the code repository for the source code and data of this website.

This website is developed by Issa Rice with data contributions from Sebastian Sanchez, Amana Rice, and Vipul Naik, and has been partially funded by Vipul Naik and Mati Roy (who in July 2023 paid for the time Issa had spent answering people’s questions about AI Watch up until that point).

If you like (or want to like) this website and have money: the current funder mostly funds only data updates for existing organizations and the addition of data for some new effective altruist organizations. As a result, the site is not getting any new features or design improvements. If you want to bring this site to the next level, contact Issa. What you get: site improvements, recognition in the site credits. What the site needs: money.

If you have time and want experience building websites: this website is looking for contributors. If you want to help out, contact Issa. What you get: little or no pay (this could change if the site gets funding; see previous paragraph), recognition in the site credits, privilege of working with me, knowledge of the basics of web development (MySQL, PHP, Git). What the site needs: data collection/entry and website code improvements.

Last updated on 2023-09-07; see here for a full list of recent changes.

Agenda name | Associated people | Associated organizations
Iterated amplification | Paul Christiano, Buck Shlegeris, Dario Amodei | OpenAI
Embedded agency | Eliezer Yudkowsky, Scott Garrabrant, Abram Demski | Machine Intelligence Research Institute
Comprehensive AI services | Eric Drexler | Future of Humanity Institute
Ambitious value learning | Stuart Armstrong | Future of Humanity Institute
Factored cognition | Andreas Stuhlmüller | Ought
Recursive reward modeling | Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, Shane Legg | Google DeepMind
Debate | Paul Christiano | OpenAI
Interpretability | Christopher Olah |
Inverse reinforcement learning | |
Preference learning | |
Cooperative inverse reinforcement learning | |
Imitation learning | |
Alignment for advanced machine learning systems | Jessica Taylor, Eliezer Yudkowsky, Patrick LaVictoire, Andrew Critch | Machine Intelligence Research Institute
Learning-theoretic AI alignment | Vanessa Kosoy |
Counterfactual reasoning | Jacob Steinhardt |

Positions grouped by person

Showing 0 people with positions.

Name | Number of organizations | List of organizations

Positions grouped by organization

Showing 6 organizations.

Organization | Number of people | List of people
GoodAI | 34 | Karolína H., Ryan Camilleri, Jose Solorzano, Alex Angelini, Sarka Krejcova, Stephanie Wendler, Reham Bukhari, Viktorie Knezkova, Steffen Eichler, Petr Šimánek, Isabeau Premont-Schwarz, Petr Šrámek, Michal Dvořák, Christine Lee, Will Millership, Lucie Krestova, Marek Havrda, Jan Feyereisl, Olga Afanasjeva, Martin Poliak, Marek Rosa, Simon Andersson, Přemek Paška, Jaroslav Vitku, Wendelin Boehmer, Šimon Šicko, Shantesh Patil, Petr Hlubuček, Nicholas Guttenberg, Lucia Šicková, Joseph Davidson, Jan Štafa, Filip Hauptfleisch, Dominik Čech
Machine Intelligence Research Institute | 6 | Carson Jones, Kurt Brown, Aaron Silverbook, Jesse Galef, Elizabeth Morningstar, Erica Edelman
Berkeley Existential Risk Initiative | 1 | Colleen Gleason
EthicsNet | 1 | Aleksandra Orchowska
Foundational Research Institute | 1 | Max Daniel
Global Catastrophic Risk Institute | 1 | Robert de Neufville