AI Watch

Welcome! This website tracks people and organizations in the AI safety, alignment, and AI existential risk communities. A position or organization appearing on AI Watch does not indicate an assessment that it is actually making AI safer or that it is good for the world in any way. A listing is mostly a sociological indication that the position or organization is associated with these communities, and that it claims to be working on AI safety or alignment. (There are some plans to eventually introduce such assessments on AI Watch, but for now there are none.) See the code repository for the source code and data of this website.

This website is developed by Issa Rice with data contributions from Sebastian Sanchez, Amana Rice, and Vipul Naik, and has been partially funded by Vipul Naik and Mati Roy (who in July 2023 paid for the time Issa had spent answering people’s questions about AI Watch up until that point).

Last updated on 2024-11-02; see here for a full list of recent changes.

Agendas

Agenda name	Associated people	Associated organizations
Iterated amplification	Paul Christiano, Buck Shlegeris, Dario Amodei	OpenAI
Embedded agency	Eliezer Yudkowsky, Scott Garrabrant, Abram Demski	Machine Intelligence Research Institute
Comprehensive AI services	Eric Drexler	Future of Humanity Institute
Ambitious value learning	Stuart Armstrong	Future of Humanity Institute
Factored cognition	Andreas Stuhlmüller	Ought
Recursive reward modeling	Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, Shane Legg	Google DeepMind
Debate	Paul Christiano	OpenAI
Interpretability	Christopher Olah
Inverse reinforcement learning
Preference learning
Cooperative inverse reinforcement learning
Imitation learning
Alignment for advanced machine learning systems	Jessica Taylor, Eliezer Yudkowsky, Patrick LaVictoire, Andrew Critch	Machine Intelligence Research Institute
Learning-theoretic AI alignment	Vanessa Kosoy
Counterfactual reasoning	Jacob Steinhardt

Positions grouped by person

Showing 0 people with positions.

Name	Number of organizations	List of organizations

Positions grouped by organization

Showing 5 organizations.

Organization	Number of people	List of people
AI Impacts	13	Daniel Kokotajlo, Asya Bergal, Ronja Lutz, Richard Korzekwa, Tegan McCaslin, Paul Christiano, Jimmy Rintjema, Ben Hoffman, Justis Mills, Connor Flexman, Finan Adamson, John Salvatier, Stephanie Zolayvar
Machine Intelligence Research Institute	5	Dávid Natingga, Sebastian Nickel, Ben Hoskin, Frank Adamek, Alyssa Vance
Foundational Research Institute	3	Brian Tomasik, Kaj Sotala, Lukas Gloor
EthicsNet	1	Anish Mohammed
Future of Humanity Institute	1	Anders Sandberg
