Information for Miljan Martic

Basic information

Item Value
Agendas Recursive reward modeling

List of positions (2 positions)

Organization Title Start date End date AI safety relation Subject Employment type Source Notes
Google DeepMind Research Engineer 2017-03-01 2020-10-31 AGI organization [1], [2], [3]
Google DeepMind Senior Research Engineer 2020-10-01 AGI organization [1], [2], [3]

Products (0 products)

Name Creation date Description

Organization documents (0 documents)

Title Publication date Author Publisher Affected organizations Affected people Document scope Cause area Notes

Documents (1 document)

Title: Scalable agent alignment via reward modeling: a research direction
Publication date: 2018-11-19
Author: Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, Shane Legg
Publisher: arXiv
Affected organizations: Google DeepMind
Affected people:
Affected agendas: Recursive reward modeling, Imitation learning, inverse reinforcement learning, Cooperative inverse reinforcement learning, myopic reinforcement learning, iterated amplification, debate
Notes: This paper introduces the (recursive) reward modeling agenda, discussing its basic outline, challenges, and ways to overcome those challenges. The paper also discusses alternative agendas and their relation to reward modeling.

Similar people

Showing at most 20 people who are most similar in terms of which organizations they have worked at.

Person Number of organizations in common List of organizations in common
Nick Bostrom 1 Google DeepMind
Sean Legassick 1 Google DeepMind
Vishal Maini 1 Google DeepMind
Pedro A. Ortega 1 Google DeepMind
Andrew Lefrancq 1 Google DeepMind
Tom Everitt 1 Google DeepMind
Shane Legg 1 Google DeepMind
Mustafa Suleyman 1 Google DeepMind
Verity Harding 1 Google DeepMind
Jeffrey D. Sachs 1 Google DeepMind
Chris Maddison 1 Google DeepMind
James Manyika 1 Google DeepMind
Christiana Figueres 1 Google DeepMind
Edward W. Felten 1 Google DeepMind
Diane Coyle 1 Google DeepMind
Victoria Krakovna 1 Google DeepMind
Demis Hassabis 1 Google DeepMind
Laurent Orseau 1 Google DeepMind
Jan Leike 1 Google DeepMind
Thore Graepel 1 Google DeepMind