Search
❯
Feb 23, 20261 min read
A case wherein a AI trained to play games Reward Hacked its reward function and went into loops.