A phenomena that occurs in multi-agent environments with Scalable Oversight where a model that has a evaluator/reward model Prompt Injects the reward model if it knows it is a AI.