We analyze two things:

  • Capabilities (upper bounds on what the model can do)
  • Propensities (the model's behavioral tendencies)

Capability Evaluations

Propensity Evaluations