Explanations
non-English characters
This neuron activates on non-ASCII accented characters (i.e. letters with diacritical marks).
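As a rough illustration of what "letters with diacritical marks" means in practice, the sketch below flags text that contains a combining mark after Unicode canonical decomposition (NFD). The helper name `has_diacritic` is hypothetical and not part of the dashboard or of any tooling behind it. This is only one plausible reading of the explanation, not the neuron's actual activation rule.

```python
import unicodedata

def has_diacritic(text: str) -> bool:
    """Return True if any character in text carries a combining mark.

    Assumption: "diacritic" is approximated as "has a combining mark
    after NFD decomposition". Letters such as "ø" or "ł" have no
    canonical decomposition, so this check does not catch them.
    """
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns a nonzero class for combining marks.
    return any(unicodedata.combining(ch) != 0 for ch in decomposed)
```

Under this check, "gré" counts as accented, while "ятся" and "쪽" do not: Cyrillic and Hangul text is non-ASCII, but these strings carry no combining marks.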
Negative Logits
Pent      -0.09
fart      -0.07
prefab    -0.07
приєм     -0.07
cort      -0.07
Get       -0.07
ueling    -0.06
DT        -0.06
filtr     -0.06
059       -0.06
Positive Logits
Musk      0.07
�         0.07
se        0.06
Monster   0.06
ятся      0.06
_logging  0.06
쪽        0.06
gré       0.06
chure     0.06
Draws     0.06
Activations Density 0.004%