Specification Gaming: How AI Can Turn Your Wishes Against You

Rational Animations

When we specify goals for AIs, we must ensure that our specifications truly capture what we want. Otherwise, AI systems will behave differently from how we intend. This can be catastrophic in high-stakes situations and at high levels of AI capability. If you watched our video "The Hidden Complexity of Wishes", you'll recognize these problems as the same kind of failure.

If you’d like to skill up on AI Safety, we highly recommend the AI Safety Fundamentals courses by BlueDot Impact at https://aisafetyfundamentals.com

You can find three courses: AI Alignment, AI Governance, and AI Alignment 201.

You can follow AI Alignment and AI Governance even without a technical background in AI. AI Alignment 201, by contrast, presupposes that you have taken the AI Alignment course first and have knowledge equivalent to university-level courses on deep learning and reinforcement learning.

The courses consist of a selection of readings curated by experts in AI safety. They are available to all, so you can simply read them if you can’t formally enroll in the courses.

If you want to participate in the courses instead of just going through the readings by yourself, BlueDot Impact runs live courses which you can apply to. The courses are remote and free of charge. They consist of a few hours of effort per week to go through the readings, plus a weekly call with a facilitator and a group of people learning from the same material. At the end of each course, you can complete a personal project, which may help you kickstart your career in AI Safety.

BlueDot Impact receives more applications than it can accept, so if you’d still like to follow the courses alongside other people, you can go to the study-buddy channel in the AI Alignment Slack. You can join by clicking on the first entry on https://aisafety.community

You could also join Rational Animations’ Discord server at discord.gg/rationalanimations, and see if anyone is up to be your partner in learning.

#ai #aisafety #alignment

▀▀▀▀▀▀▀▀▀SOURCES & READINGS▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

9 Examples of Specification Gaming by @RobertMilesAI:    • 9 Examples of Specification Gaming  

Specification gaming: the flip side of AI ingenuity by Victoria Krakovna, Jonathan Uesato, Vladimir Mikulik et al. (2020): https://www.deepmind.com/blog/specifi...

Learning from Human Preferences by Paul Christiano, Alex Ray and Dario Amodei (2017): https://openai.com/blog/deepreinforc...

Learning to Summarize with Human Feedback by Jeffrey Wu, Nisan Stiennon, Daniel Ziegler et al. (2020): https://openai.com/blog/learningtos...

What failure looks like by Paul Christiano (2019): https://www.alignmentforum.org/posts/...

The alignment problem from a deep learning perspective by Richard Ngo, Soeren Mindermann and Lawrence Chan (2022): https://arxiv.org/abs/2209.00626

The Hidden Complexity of Wishes:    • The Hidden Complexity of Wishes  

▀▀▀▀▀▀▀▀▀PATREON, MEMBERSHIP, KO-FI▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

Patreon:   / rationalanimations  

Merch: https://crowdmade.com/collections/rat...

Channel membership:    / @rationalanimations  

Ko-fi, for one-time and recurring donations: https://ko-fi.com/rationalanimations

▀▀▀▀▀▀▀▀▀SOCIAL & DISCORD▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

Discord:   / discord  

Reddit:   / rationalanimations  

Twitter:   / rationalanimat1  

▀▀▀▀▀▀▀▀▀PATRONS & MEMBERS▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
Alcher Black
RMR
Kristin Lindquist
Nathan Metzger
Monadologist
Glenn Tarigan
NMS
James Babcock
Colin Ricardo
Long Hoang
Tor Barstad
Gayman Crothers
Stuart Alldritt
Chris Painter
Juan Benet
Falcon Scientist
Jeff
Christian Loomis
Tomarty
Edward Yu
Ahmed Elsayyad
Chad M Jones
Emmanuel Fredenrich
Honyopenyoko
Neal Strobl
bparro
Danealor
Craig Falls
Vincent Weisser
Alex Hall
Ivan Bachcin
joe39504589
Klemen Slavic
Scott Alexander
noggieB
Dawson
John Slape
Gabriel Ledung
Jeroen De Dauw
Craig Ludington
Jacob Van Buren
Superslowmojoe
Michael Zimmermann
Nathan Fish
Bleys Goodson
Ducky
Bryan Egan
Matt Parlmer
Tim Duffy
rictic
marverati
Luke Freeman
Dan Wahl
leonid andrushchenko
Rey Carroll
William Clelland
ronvil
AWyattLife
codeadict
Lazy Scholar
Torstein Haldorsen
Supreme Reader
Michał Zieliński

▀▀▀▀▀▀▀CREDITS▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
Writer: :3
Producer: :3

Line Producer and production manager:
Kristy Steffens

Animation director: Hannah Levingstone

Quality Assurance Lead:
Lara Robinowitz

Animation:
Michela Biancini
Owen Peurois
Zack Gilbert
Jordan Gilbert
Keith Kavanagh
Ira Klages
Colors Giraldo
Renan Kogut

Background Art:
Hané Harnett
Zoe MartinParkinson
Hannah Levingstone

Compositing:
Renan Kogut
Patrick O'Callaghan
Ira Klages

Voices:
Robert Miles - Narrator

VO Editing:
Tony Di Piazza

Sound Design and Music:
Johnny Knittle
