Instead of directly specifying what we want, we train models through proxies:
- Reward functions
- Training data
- Human feedback
- Rules or guidelines
Misalignment happens when:
- The proxy does not fully capture the real goal
- The system optimizes the proxy itself rather than the intent behind it (see the sketch below)
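A minimal toy sketch of the proxy/intent gap (hypothetical names and rewards, not from any real training setup): the intended goal is to collect only apples, but the proxy reward just counts items in the basket, so the policy that scores highest on the proxy is not the one that best satisfies the intent.

```python
FIELD = ["apple", "rock", "apple", "rock", "rock", "apple"]

def proxy_reward(basket):
    """Proxy: total number of items collected (easy to measure)."""
    return len(basket)

def intended_reward(basket):
    """Intent: apples collected, with a penalty for junk."""
    return sum(1 if item == "apple" else -1 for item in basket)

def greedy_proxy_policy(field):
    """Pick up everything: every item increases the proxy reward."""
    return list(field)

def careful_policy(field):
    """Pick up only apples: matches the intent."""
    return [item for item in field if item == "apple"]

for name, policy in [("greedy", greedy_proxy_policy), ("careful", careful_policy)]:
    basket = policy(FIELD)
    print(f"{name:8s} proxy={proxy_reward(basket)}  intended={intended_reward(basket)}")

# Output:
#   greedy   proxy=6  intended=0
#   careful  proxy=3  intended=3
# Optimizing the proxy ranks the policies in the opposite order from the intent.
```

The same structure applies to the other proxies listed above: training data, human feedback, and written rules are all measurable stand-ins, and a system that optimizes the stand-in can diverge from what we actually wanted.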