Explanations
The neuron fires mainly on non-English tokens.
Negative Logits
bw            -0.08
ад            -0.07
.constructor  -0.07
ungeons       -0.06
тим           -0.06
ať            -0.06
-ce           -0.06
subdiv        -0.06
flam          -0.06
rnd           -0.06
Positive Logits
acer      0.07
nano      0.07
firm      0.07
says      0.07
fotograf  0.06
paylaş    0.06
-spec     0.06
Says      0.06
Cert      0.06
Curl      0.06
Activations Density: 0.294%
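
A density figure like the one above could be computed roughly as follows. This is a minimal sketch, assuming that "activations density" means the fraction of sampled tokens on which the neuron's activation is above zero; the page does not state its definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (active tokens) / (all sampled tokens).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 3 active tokens out of 1000 sampled tokens.
sample = [0.0] * 997 + [0.4, 1.2, 0.05]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.294% would mean the neuron is active on roughly 3 of every 1000 sampled tokens.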