Explanations
The neuron detects Cyrillic-script text (i.e. fragments of Russian words).
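To compare this explanation against a list of tokens, one option is to flag which tokens actually contain Cyrillic characters. The sketch below is only an illustration of that check, not a description of how the neuron computes its activation. The helper name `has_cyrillic` is hypothetical, and the sample list is a subset of the tokens shown on this page.

```python
import unicodedata


def has_cyrillic(token: str) -> bool:
    """Return True if any character in the token has a Unicode name containing CYRILLIC."""
    # unicodedata.name raises ValueError for unnamed characters unless a default is given.
    return any("CYRILLIC" in unicodedata.name(ch, "") for ch in token)


# Sample tokens taken from the logit lists on this page (illustrative subset).
tokens = ["trois", "бес", "honour", "гро", "附", "зап", "Зап", "West"]

# Keep only the tokens written at least partly in Cyrillic script.
cyrillic_tokens = [t for t in tokens if has_cyrillic(t)]
print(cyrillic_tokens)
```

A check like this only looks at the script of each token string. It says nothing about the contexts in which the neuron fires, so it is best read as a sanity check on the explanation rather than a test of it.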
Negative Logits
Token        Logit
trois        -0.07
бес          -0.07
honour       -0.07
productId    -0.07
passe        -0.06
Img          -0.06
гро          -0.06
tostring     -0.06
fec          -0.06
附           -0.06
Positive Logits
Token        Logit
ATS          0.08
зап          0.07
widely       0.07
Зап          0.06
EMP          0.06
West         0.06
.sel         0.06
.launch      0.06
Ws           0.06
ISTR         0.06
Activations Density: 0.009%