Explanations
foreign languages
This neuron activates on non-English text segments—especially tokens with diacritics or Cyrillic characters.
Negative Logits
Coefficient      -0.07
็ด               -0.07
ーツ             -0.07
Traits           -0.07
CPPUNIT          -0.06
SubMenu          -0.06
pretext          -0.06
итет             -0.06
whistleblower    -0.06
즌               -0.06
Positive Logits
/*!↵             0.07
некоторых        0.06
skl              0.06
tidak            0.06
結果             0.06
metodo           0.06
ceux             0.06
-[               0.06
[${              0.06
*******          0.06
Activations Density 0.069%