Objective we specify does not match what we want. We as humans fail to specify a goal
Examples
- Optimize for engagement → Optimizes for outrage rather than wellbeing
Objective we specify does not match what we want. We as humans fail to specify a goal