Information for Jan Leike

Basic information

Item | Value
Facebook username | 100009882604264
Intelligent Agent Foundations Forum username | 160
Agendas | Recursive reward modeling

List of positions (5 positions)

Organization | Title | Start date | End date | AI safety relation | Subject | Employment type | Source | Notes
Australian National University | | | | | | | [1] |
Google DeepMind | | | | | | | [2], [3] |
Future of Humanity Institute | Research Associate | 2017-11-24 | | | | | [4], [5] |
Machine Intelligence Research Institute | Research Advisor | 2017-03-01 | 2018-05-01 | position | | | [6], [7] |
Machine Intelligence Research Institute | Spotlighted Advisor | 2018-09-01 | 2018-09-02 | position | | | [8], [9] |

Products (0 products)

Name | Creation date | Description

Organization documents (0 documents)

Title | Publication date | Author | Publisher | Affected organizations | Affected people | Document scope | Cause area | Notes

Documents (2 documents)

Title | Publication date | Author | Publisher | Affected organizations | Affected people | Affected agendas | Notes
New safety research agenda: scalable agent alignment via reward modeling | 2018-11-20 | Victoria Krakovna | LessWrong | Google DeepMind | Jan Leike | Recursive reward modeling, iterated amplification | Blog post on LessWrong announcing the recursive reward modeling agenda. Some comments in the discussion thread clarify various aspects of the agenda, including its relation to Paul Christiano’s iterated amplification agenda, whether the DeepMind safety team is thinking about the problem of whether the human user is a safe agent, and more details about alternating quantifiers in the analogy to complexity theory. Jan Leike is listed as an affected person for this document because he is the lead author and is mentioned in the blog post, and also because he responds to several questions raised in the comments.
Scalable agent alignment via reward modeling: a research direction | 2018-11-19 | Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, Shane Legg | arXiv | Google DeepMind | | Recursive reward modeling, Imitation learning, inverse reinforcement learning, Cooperative inverse reinforcement learning, myopic reinforcement learning, iterated amplification, debate | This paper introduces the (recursive) reward modeling agenda, discussing its basic outline, challenges, and ways to overcome those challenges. The paper also discusses alternative agendas and their relation to reward modeling.

Similar people

Showing at most 20 people who are most similar in terms of which organizations they have worked at.

Person | Number of organizations in common | List of organizations in common
Nick Bostrom | 3 | Google DeepMind, Future of Humanity Institute, Machine Intelligence Research Institute
Tom Everitt | 2 | Australian National University, Google DeepMind
Victoria Krakovna | 2 | Google DeepMind, Machine Intelligence Research Institute
Ryan Carey | 2 | Future of Humanity Institute, Machine Intelligence Research Institute
Paul Christiano | 2 | Future of Humanity Institute, Machine Intelligence Research Institute
Robin Hanson | 2 | Future of Humanity Institute, Machine Intelligence Research Institute
Katja Grace | 2 | Future of Humanity Institute, Machine Intelligence Research Institute
Carl Shulman | 2 | Future of Humanity Institute, Machine Intelligence Research Institute
Daniel Dewey | 2 | Future of Humanity Institute, Machine Intelligence Research Institute
Stuart Armstrong | 2 | Future of Humanity Institute, Machine Intelligence Research Institute
Jarryd Martin | 1 | Australian National University
Marcus Hutter | 1 | Australian National University
Elliot Catt | 1 | Australian National University
Alan Hájek | 1 | Australian National University
Gary Lea | 1 | Australian National University
Chris Maddison | 1 | Google DeepMind
Laurent Orseau | 1 | Google DeepMind
Demis Hassabis | 1 | Google DeepMind
Diane Coyle | 1 | Google DeepMind
Edward W. Felten | 1 | Google DeepMind