INDEX
Explanations
The neuron activates on Spanish-language tokens, especially when they appear in a question.
Negative Logits
(Arrays        -0.07
cave           -0.07
ADDING         -0.07
(it            -0.06
helper         -0.06
Herr           -0.06
addon          -0.06
_clean         -0.06
Tutor          -0.06
الولايات       -0.06
Positive Logits
.datas         0.06
jButton        0.06
shampoo        0.06
исключ         0.06
?><            0.06
ejac           0.06
'', ↵          0.06
。',↵          0.06
_PM            0.06
.BorderSize    0.06
Activations Density 0.169%
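
A minimal sketch of how a density figure like the one above could be computed, assuming "Activations Density" means the fraction of sampled tokens on which the neuron's activation is nonzero (the page does not define it). The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical; the sample is synthetic and chosen only so the result matches the reported percentage.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (# tokens where the neuron fires) / (# tokens sampled).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Synthetic illustration (not real data): 169 firing tokens out of 100,000.
sample = [0.0] * 99_831 + [0.5] * 169
density = activation_density(sample)
print(f"Activations Density {density:.3%}")  # prints "Activations Density 0.169%"
```

Under this reading, a density of 0.169% means the neuron is sparse: it fires on roughly 17 of every 10,000 tokens.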