An AI that has:

  • A constitution that defines acceptable and unacceptable behavior
  • A model that critiques and revises its own outputs using these rules

Limitation: not as good as RLHF (reinforcement learning from human feedback) when it comes to edge cases.
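The critique-and-revise idea can be sketched roughly as follows. This is a minimal illustration, not an actual implementation: the rules in `CONSTITUTION`, the prompt wording, the `critique_and_revise` helper, and the `toy_model` stand-in are all assumptions made up for the example. A real system would pass in a call to an actual language model instead of `toy_model`.

```python
from typing import Callable, List

# Hypothetical constitution: plain-language rules invented for this sketch.
CONSTITUTION: List[str] = [
    "Do not give instructions that could cause physical harm.",
    "Do not insult the user.",
]


def critique_and_revise(
    prompt: str,
    model: Callable[[str], str],
    rules: List[str],
) -> str:
    """Draft a response, then critique and revise it once per rule."""
    response = model(prompt)  # initial draft
    for rule in rules:
        # Ask the model to judge its own output against one rule.
        critique = model(
            f"Rule: {rule}\nResponse: {response}\n"
            "Explain whether the response violates the rule."
        )
        # Ask the model to rewrite the output in light of that critique.
        response = model(
            f"Rule: {rule}\nResponse: {response}\nCritique: {critique}\n"
            "Rewrite the response so that it follows the rule."
        )
    return response


def toy_model(text: str) -> str:
    """Stand-in for a real language model so the sketch runs offline."""
    if text.endswith("Rewrite the response so that it follows the rule."):
        return "revised"
    if text.startswith("Rule:"):
        return "critique"
    return "draft"


if __name__ == "__main__":
    print(critique_and_revise("Say hello", toy_model, CONSTITUTION))
```

The model is passed in as a plain callable so the loop itself stays independent of any particular model API. Each rule costs two extra model calls (one critique, one revision) on top of the initial draft.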