AI could be our biggest existential threat this century. If you enjoyed this video, here are some places to find out more about these ideas:
Human Compatible US: https://amzn.to/3Pdi0qS UK: https://amzn.to/463vawM by Stuart ‘You can’t fetch the coffee if you’re dead’ Russell
The Alignment Problem: Machine Learning and Human Values US: https://amzn.to/3N7cLpV UK: https://amzn.to/45YMZgq by Brian Christian
@eightythousandhours’s problem profile on ‘Preventing an AI-related catastrophe’: https://80000hours.org/problemprofil...
@RobertMilesAI’s channel: / robertmilesai
Read the worst cat/sat/mat-based short story ever written here: https://andrewsteele.co.uk/blog/2023/...
Amazon links are affiliates and I will receive a small payment if you choose to purchase through them. Thanks!
Chapters
00:00 Introduction
02:04 How does ChatGPT work?
06:48 Problem 0: AI misuse
08:01 Problem 1: AI is an alien mind
11:18 Problem 2: Defining goals is hard
17:05 Problem 3: ‘Instrumental convergence’
19:17 Problem 4: Exponential progress
22:32 What can we do?
Sources and further reading
On AI being an alien mind, I really enjoyed this video from @kylehill on a hilarious flaw in DeepMind’s Go-playing AI, which handily beat world champion Go players…but, knowing this flaw, was easily beaten by an amateur • ChatGPT's HUGE Problem
This is a Twitter thread from me digesting 2022 results in AI in (then-)real-time, and speculating about whether these capabilities indicate that AI could ‘do science’ / 1511722732257480711
Introduction
ChatGPT’s user growth https://www.reuters.com/technology/ch...
Try the hilariously bad 2020 text-to-image generator X-LXMERT here: https://visionexplorer.allenai.org/t...
Run Stable Diffusion locally using its web UI: https://github.com/AUTOMATIC1111/stab...
‘Sony World Photography Award 2023: Winner refuses award after revealing AI creation’ – BBC News https://www.bbc.com/news/entertainmen...
How does ChatGPT work?
An absolutely humungous list of papers about LLMs https://github.com/Hannibal046/Awesom...
GPT and other LLMs don’t usually work at the word level; they normally work on ‘tokens’, many of which are words, but not all of which are. You can get a sense for the difference by trying out OpenAI’s Tokenizer here: https://platform.openai.com/tokenizer
Emergent abilities of large language models https://openreview.net/pdf?id=yzkSU5zdwD
ChatGPT playing chess https://www.lesswrong.com/posts/xyjhF...
Problem 1: AI is an alien mind
Paper on using psychedelic specs to fool facial recognition AI https://users.ece.cmu.edu/~lbauer/pap...
‘Psychedelic toasters fool image recognition tech’ – BBC News https://www.bbc.com/news/technology4...
Thread on how little we know about how ChatGPT works—including an absolutely baffling algorithm it uses internally to add numbers together! / 1663534255249453056
Problem 2: Defining your goals
More about OpenAI’s CoastRunners-smashing reinforcement learning algorithm https://openai.com/research/faultyre...
Astrophysicist Grant Tremblay correcting Bard on Twitter / 1623091683603918849
Problem 3: Instrumental convergence
Great video with Rob Miles about how hard it is to build an off switch for an AI • AI "Stop Button" Problem Computerphile
Problem 4: Exponential progress
Article on how ChatGPT can help with code (and its limitations) https://www.nature.com/articles/d4158...
GPT-4 cost over $100m to train https://www.wired.com/story/openaice...
What can we do?
AI governance is a huge field, and a good overview of resources can be found at https://80000hours.org/problemprofil... (link should take you straight to the ‘AI governance and strategy’ heading)
Errata
I should probably have said GPT-4 ‘may’ have 1 trillion parameters, because this hasn’t actually been made public. In the absence of a definitive source, this comment thread discusses the issue: • How AI could destroy the world by acc...
Credits
Milla Jovovich image CC BY-SA Georges Biard https://upload.wikimedia.org/wikipedi...
And finally…
Follow me on Twitter / statto
Follow me on Instagram / andrewjsteele
Like my page on Facebook / drandrewsteele
Follow me on Mastodon https://mas.to/@statto
Read my book, Ageless: The new science of getting older without getting old https://ageless.link/