Explanations
This neuron activates on words expressing a statistical non-finding, especially “no” (or “not”) indicating no significant difference or effect.
Negative Logits
justices    -0.07
/km         -0.07
contador    -0.07
legis       -0.06
Sciences    -0.06
(ra         -0.06
ات          -0.06
Chron       -0.06
explicit    -0.06
ct          -0.06
Positive Logits
нак         0.07
.Allow      0.06
еи          0.06
inferior    0.06
ики         0.06
setC        0.06
sey         0.06
_OC         0.06
asla        0.06
网络        0.06
Activations Density 0.030%
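
As a rough illustration of how a density figure like this could be computed, the sketch below assumes that density means the percentage of sampled tokens on which the neuron's activation is above zero. That definition, the function name activation_density, and the sample data are all assumptions made for illustration; the page does not state how its value was derived.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumed definition: the dashboard does not document its own formula.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Made-up data: 3 active tokens out of 10,000 sampled tokens.
sample = [0.0] * 9997 + [1.2, 0.4, 2.5]
print(f"{activation_density(sample):.3f}%")  # prints "0.030%"
```

Under this reading, a density of 0.030% would correspond to the neuron firing on roughly 3 of every 10,000 tokens, which fits a narrowly selective neuron.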