INDEX
Explanations
This neuron isn’t detecting any particular pattern in these snippets; it remains inactive across all tokens.
Negative Logits
finest    -0.07
گفت    -0.06
groot    -0.06
REG    -0.06
.edit    -0.06
.sprites    -0.06
tourist    -0.06
cath    -0.06
iences    -0.05
القد    -0.05
Positive Logits
/manual    0.08
ío    0.07
نفسه    0.07
yaz    0.06
()["    0.06
autos    0.06
nalez    0.06
Exercises    0.06
LOB    0.06
ůvod    0.06
Activations Density 0.000%