Explanations
NATO/ATO
The neuron strongly activates on occurrences of the token “NATO” (and related references to the alliance).
Negative Logits
()); ↵ ↵    -0.07
')[         -0.07
DataRow     -0.06
wła         -0.06
iquer       -0.06
arson       -0.06
/values     -0.06
]): ↵       -0.06
esses       -0.06
قم          -0.06
Positive Logits
NATO                    0.12
placeholder             0.07
navigationController    0.07
antt                    0.07
Katie                   0.07
.navigationController   0.06
Bravo                   0.06
Conservative            0.06
Caesar                  0.06
*pi                     0.06
Activations Density 0.001%
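
The density figure can be read as the share of sampled tokens on which the neuron fires. The page does not define it, so the sketch below assumes "fraction of tokens with activation above a threshold"; the function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical and only illustrate the arithmetic.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition: the dashboard does not say how it computes density.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    firing = sum(1 for a in activations if a > threshold)
    return firing / len(activations)


# Toy data (not from the dashboard): one firing token among 100,000.
sample = [0.0] * 99_999 + [3.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.001% corresponds to roughly one firing token per 100,000, which fits a neuron tied to a single rare token such as "NATO".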