Ryan is interested in building theory for safe AI design. Recently, his work has focused on analyzing agents' incentives using causal frameworks. Previously, he analyzed alternatives to utility maximization, such as value learning and quantilization. Before that, he worked as a medical doctor, and he holds degrees in medicine and bioinformatics from Monash University and Imperial College London.
- The Incentives that Shape Behaviour. (Ryan Carey, Eric Langlois, Tom Everitt, and Shane Legg; SafeAI@AAAI)
- How useful is quantilization for mitigating specification-gaming? (Ryan Carey; SafeML ICLR 2019 Workshop)
- (When) Is Truth-telling Favored in AI Debate? (Vojtech Kovarik and Ryan Carey; SafeAI@AAAI)
- Predicting Human Deliberative Judgments with Machine Learning. (O. Evans, A. Stuhlmüller, C. Cundy, R. Carey, Z. Kenton, T. McGrath, and A. Schreiber, 2018)
- Incorrigibility in the CIRL Framework. (Ryan Carey; Proceedings of AI, Ethics and Society)