The idea that AI systems will be misaligned by default, motivated by the Orthogonality Thesis.
For
Against
- Maybe the Orthogonality Thesis is false
- Instrumental convergence may push an AI's values in our favor
- How does an AI get money? It has to produce value for someone
- How does an AI get people to help? It has to be trustworthy