Explanations
This neuron detects mentions of the state or government (e.g., words like “state” and “government”).
Negative Logits
Riy       -0.07
遍        -0.06
навер     -0.06
尖        -0.06
_my       -0.06
cavern    -0.06
โต        -0.06
المع      -0.06
ungeon    -0.06
_OPTS     -0.05
Positive Logits
Slim       0.07
(Date      0.07
State      0.07
.platform  0.07
Estado     0.07
Shall      0.06
?>:</      0.06
σκ         0.06
webcam     0.06
государ    0.06
Activations Density 0.009%