A research paper in which a model rewrote its own reward function.