Search
❯
Mar 03, 20261 min read
A case wherein a AI trained to play games Reward Hacked its reward function and went into loops.