Explanations
trailing
This neuron activates on the word “trailing.”
Negative Logits
Token          Logit
gắng           -0.07
442            -0.07
وة             -0.07
onian          -0.07
convergence    -0.07
_characters    -0.06
dě             -0.06
beat           -0.06
exempt         -0.06
Russians       -0.06
Positive Logits
Token          Logit
trailing       0.08
Blizzard       0.07
Tri            0.06
(Pointer       0.06
_bl            0.06
توضی           0.06
aydın          0.06
printk         0.06
μαζί           0.06
piel           0.06
Activations Density: 0.002%