Explanations
The neuron activates on the negative adverb “not.”
Negative Logits
tadır          -0.07
いの           -0.07
would          -0.07
(inputStream   -0.07
.have          -0.07
plá            -0.07
Страна         -0.07
FilterWhere    -0.07
upakan         -0.07
CHEDULE        -0.07
Positive Logits
(not           0.06
지나           0.06
Forgotten      0.06
ыџN            0.06
759            0.06
graf           0.06
Sniper         0.06
服             0.06
خط             0.06
nun            0.06
Activations Density 0.004%