An argument for Existential Catastrophe arising from instances of AGI that become Rogue AI
Argument
- P1: Artificial Superintelligence can be achieved
	- If we continue on this path, there is a substantial probability that within the next centuries AI capabilities reach ASI levels.
- P2: Agentic, goal-directed AI will arise
	- If goal-directed superhuman AI is built, its output will be as bad as an empty universe by human lights
	- Superhuman AI, even if it is an Oracle Machine, can be wrapped in an agent harness to make it agentic
- P3: Default Misalignment: If most goal-directed ASI have bad goals, the future will be very bad
- P4: Alignment is difficult. Checking whether an AI is aligned is hard because we don't know until deployment
- P5: Catastrophic Capability: AI agent with misaligned goals → catastrophic outcome for humanity
- C: We face a non-trivial risk of Existential Catastrophe from AI development
Counter Arguments
- AGI Is Not Goal Directed
- Goal-directed systems may not always be bad
- Small differences in utility functions may not be catastrophic
- Human and AI values may be similar
- Maybe value isn't fragile
- Human success isn't from individual intelligence
- AI agents may not be radically superior to combinations of humans and non-agentic machines
- Intelligence may not be an overwhelming advantage
- Unclear whether many goals incentivize taking over the universe
- Speed of intelligence growth is ambiguous
- Argument proves too much about corporations
- If AI agents are misaligned, we should be able to detect this before it becomes uncontrollable