The interpretability of a machine learning model’s inner workings. It often relies on AI architectures with a high level of transparency (intrinsically interpretable models). This is important so that we can:
- Perform causal interventions to modify models
- View our current models and understand them safely
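
As a rough illustration of what a causal intervention can look like, the sketch below zeroes out one hidden unit of a tiny hand-built network and measures how much the output changes. Everything here is an assumption made for illustration: the weights `W1` and `W2`, the `forward` function, and its `ablate_unit` parameter are hypothetical and not taken from any particular library or model.

```python
import numpy as np

# Hypothetical toy network: 3 inputs -> 2 hidden units (ReLU) -> 1 output.
# The weights are made up so the arithmetic is easy to follow by hand.
W1 = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [1.0, -1.0]])
W2 = np.array([[2.0],
               [0.5]])

def forward(x, ablate_unit=None):
    """Run the toy network, optionally zeroing one hidden unit (the intervention)."""
    hidden = np.maximum(x @ W1, 0.0)  # ReLU activations
    if ablate_unit is not None:
        hidden = hidden.copy()
        hidden[..., ablate_unit] = 0.0  # intervene on a single internal unit
    return hidden @ W2

x = np.array([1.0, 2.0, 0.5])
baseline = forward(x)

# Causal effect of each hidden unit: how far the output moves when it is removed.
effects = {
    unit: (baseline - forward(x, ablate_unit=unit)).item()
    for unit in range(W1.shape[1])
}
print(effects)
```

A larger effect suggests the output depends more heavily on that unit for this input. In a real model the same idea would be applied to learned activations rather than hand-picked weights.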