Model Transparency

Mar 10, 2026 · 1 min read

machine_learning

The degree to which a machine learning model’s inner workings can be interpreted. Transparency is often achieved by using architectures that are interpretable by design (Intrinsically Interpretable Models). It is important so that we can:

Perform Causal Intervention to modify models
Inspect our current models and understand them safely
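
As a minimal sketch of what a causal intervention can look like, the Python snippet below zeroes out (ablates) one hidden unit of a toy network and measures how the output changes. The network, its weights, and the names `forward` and `ablation_effects` are hypothetical and chosen only for illustration; interventions on real models are typically run on a trained model's activations with dedicated tooling.

```python
# Toy two-layer network with hand-picked weights (all values are illustrative).
W1 = [[1.0, -1.0],   # input weights for hidden unit 0
      [0.5, 0.5]]    # input weights for hidden unit 1
W2 = [2.0, -3.0]     # output weights, one per hidden unit


def relu(x):
    return max(0.0, x)


def forward(x, ablate=None):
    """Run the network; if `ablate` is a hidden-unit index, force that unit to 0."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    if ablate is not None:
        hidden[ablate] = 0.0  # the intervention: overwrite one activation
    return sum(w * h for w, h in zip(W2, hidden))


def ablation_effects(x):
    """Change in output caused by zeroing each hidden unit in turn."""
    baseline = forward(x)
    return [forward(x, ablate=i) - baseline for i in range(len(W1))]


x = [2.0, 1.0]
print(forward(x))           # baseline output
print(ablation_effects(x))  # per-unit effect of the intervention on the output
```

A unit whose ablation shifts the output strongly is a candidate for playing a causal role in the model's behavior on that input, which is the kind of evidence an intervention provides beyond simply observing activations.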

Camps

Basic Science Interpretability
Pragmatic Interpretability

Methods

Mechanistic Interpretability (Mech Interp)
Neurointerpretability
Top-Down Interpretability
