Explanations
implications
This neuron detects discussion of significant implications or consequences for understanding a topic.
Negative Logits
Token       Logit
↵           -0.07
укра        -0.06
employer    -0.06
еного       -0.06
Indexed     -0.06
Benz        -0.06
enemy       -0.06
©           -0.06
němu        -0.06
atching     -0.06
Positive Logits
Token            Logit
implications     0.09
phiên            0.07
ramifications    0.07
minster          0.07
disruption       0.07
$result          0.07
fork             0.07
hopeful          0.06
Citadel          0.06
tentative        0.06
Activations Density: 0.027%