INDEX
Explanations
offering
This neuron fires on longer tokens (especially multi-syllable words), essentially acting as a “long-word” detector.
Negative Logits
lots        -0.08
.Line       -0.06
Xt          -0.06
Bed         -0.06
اینترنتی    -0.06
Coc         -0.06
Hor         -0.06
nhỏ         -0.06
Svg         -0.06
-helper     -0.06
Positive Logits
광              0.06
monds           0.06
residues        0.06
投              0.06
_placeholder    0.06
indices         0.06
quelle          0.05
кам             0.05
重              0.05
vše             0.05
Activations Density 0.214%
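The density figure can be read as the share of tokens on which the neuron is active. The page does not state the exact definition it uses, so the sketch below assumes "active" means an activation strictly greater than zero. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and only illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is
    strictly greater than `threshold` (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations over 8 tokens; 2 of them are non-zero.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.4, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")  # formatted as a percentage, like the figure above
```

Under this reading, a density of 0.214% corresponds to roughly 2 active tokens per 1,000 tokens sampled.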