An approach to Interpretability (Representation Engineering, RepE) that captures emergent properties at the level of representations, as opposed to individual neurons or Circuits. Adheres to the Reverse Engineer Complexity Pitfall. Closer to GOFAI.
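To make "representations rather than neurons" concrete, here is a minimal sketch of one simple way to read a concept off a population of activations: take the difference between the mean activation on examples that express the concept and the mean on examples that do not. This is an illustrative assumption, not necessarily the method the note has in mind. The data is synthetic, and the names `reading_vector`, `concept_score`, and `true_direction` are hypothetical.

```python
import numpy as np


def reading_vector(pos_acts: np.ndarray, neg_acts: np.ndarray) -> np.ndarray:
    """Return the unit-norm difference between the two class means.

    Both inputs have shape (n_examples, hidden_dim).
    """
    diff = pos_acts.mean(axis=0) - neg_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)


def concept_score(acts: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Project each activation vector onto the concept direction."""
    return acts @ direction


# Synthetic stand-in for hidden states (assumption: a real setup would
# collect these from a model layer on contrasting prompts).
rng = np.random.default_rng(0)
dim = 16
true_direction = np.zeros(dim)
true_direction[3] = 1.0  # the planted "concept" lives along one axis here

pos = rng.normal(0.0, 0.1, size=(200, dim)) + 2.0 * true_direction
neg = rng.normal(0.0, 0.1, size=(200, dim)) - 2.0 * true_direction

v = reading_vector(pos, neg)
pos_scores = concept_score(pos, v)
neg_scores = concept_score(neg, v)
```

The direction `v` is a property of the whole activation space, so it can be read without attributing the concept to any single unit, which is the contrast with neuron-level or Circuit-level analysis.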