Information for Vishal

Table of contents

Basic information
List of positions
Products
Organization documents
Documents
Similar people

Basic information

Item	Value
Donations List Website (data still preliminary)
Agendas	Recursive reward modeling

List of positions (0 positions)

Organization	Title	Start date	End date	AI safety relation	Subject	Employment type	Source	Notes

Products (0 products)

Name	Creation date	Description

Organization documents (0 documents)

Title	Publication date	Author	Publisher	Affected organizations	Affected people	Document scope	Cause area	Notes

Documents (1 document)

Title	Publication date	Author	Publisher	Affected organizations	Affected people	Affected agendas	Notes
Scalable agent alignment via reward modeling: a research direction	2018-11-19	Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, Shane Legg	arXiv	Google DeepMind		Recursive reward modeling, Imitation learning, Inverse reinforcement learning, Cooperative inverse reinforcement learning, Myopic reinforcement learning, Iterated amplification, Debate	This paper introduces the (recursive) reward modeling agenda, discussing its basic outline, challenges, and ways to overcome those challenges. The paper also discusses alternative agendas and their relation to reward modeling.

Similar people

Showing at most 20 people who are most similar in terms of which organizations they have worked at.

Person	Number of organizations in common	List of organizations in common