INDEX
Explanations
This neuron does not activate on any tokens—it isn’t detecting any pattern.
Negative Logits
azo          -0.07
302          -0.07
31           -0.06
ROLLER       -0.06
28           -0.06
-building    -0.06
improper     -0.06
25           -0.06
travel       -0.06
29           -0.06
Positive Logits
馆           0.06
[--          0.06
filtro       0.06
户           0.06
////↵        0.06
wcs          0.06
Creed        0.06
야           0.06
Capcom       0.06
.sat         0.06
Activations Density: 0.016%