🍈 Zettelkasten

An AI that has:

A constitution that defines acceptable and unacceptable behavior
A model that critiques and revises its own outputs using these rules
Caveat: not as good as RLHF when it comes to edge cases

Process

Create a policy
Given a prompt, the AI answers and is then re-prompted with a constitutional question randomly selected from the constitution
The initial prompt and the final answer are then used to fine-tune a judge model
Use the judge model for RLAIF
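
The re-prompting step above can be sketched roughly as follows. This is only an illustration under assumptions: `generate` stands in for whatever model call is used, and the function name, the revision prompt wording, and the returned (prompt, final answer) pair are all hypothetical choices, not a documented API.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical one-entry constitution; a real one would hold many principles.
CONSTITUTION: List[str] = [
    "Please choose the response that is most supportive and encouraging "
    "of life, liberty, and personal security",
]


def critique_and_revise(
    prompt: str,
    generate: Callable[[str], str],  # assumed stand-in for the model call
    constitution: List[str],
    rng: random.Random,
) -> Tuple[str, str]:
    """Return (initial prompt, final answer) after one revision pass."""
    # 1. The model answers the original prompt.
    draft = generate(prompt)
    # 2. One constitutional question is picked at random.
    principle = rng.choice(constitution)
    # 3. The model is re-prompted with its own draft and that principle.
    revision_prompt = (
        f"Prompt: {prompt}\n"
        f"Draft answer: {draft}\n"
        f"Principle: {principle}\n"
        "Critique the draft against the principle, then rewrite it."
    )
    final = generate(revision_prompt)
    # 4. Only the initial prompt and the final answer are kept for fine-tuning.
    return prompt, final
```

Passing the random generator in explicitly keeps the principle selection reproducible when building a dataset.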

Example Constitution

Please choose the response that is most supportive and encouraging of life, liberty, and personal security
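
A principle like this one could be turned into a comparison question for the judge model roughly as below. This is a minimal sketch with assumed names: `build_judge_prompt`, `to_preference`, the A/B answer format, and the `chosen`/`rejected` keys are all hypothetical, not part of any specific library.

```python
from typing import Dict

PRINCIPLE = (
    "Please choose the response that is most supportive and encouraging "
    "of life, liberty, and personal security"
)


def build_judge_prompt(
    principle: str, user_prompt: str, response_a: str, response_b: str
) -> str:
    """Format one principle and two candidate responses as an A/B question."""
    return (
        f"{principle}\n\n"
        f"Prompt: {user_prompt}\n"
        f"(A) {response_a}\n"
        f"(B) {response_b}\n"
        "Answer with A or B."
    )


def to_preference(judge_answer: str, response_a: str, response_b: str) -> Dict[str, str]:
    """Turn the judge's A/B answer into a chosen/rejected pair for RLAIF."""
    choice = judge_answer.strip().upper()[:1]
    if choice not in ("A", "B"):
        raise ValueError(f"unexpected judge answer: {judge_answer!r}")
    if choice == "A":
        return {"chosen": response_a, "rejected": response_b}
    return {"chosen": response_b, "rejected": response_a}
```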