Explanations
This neuron does not reliably activate on any token; it is essentially inactive and does not detect any feature.
Negative Logits
Oops      -0.07
lifting   -0.06
kittens   -0.06
kud       -0.06
sup       -0.06
fld       -0.06
wink      -0.06
ifting    -0.06
Putting   -0.06
,on       -0.06
Positive Logits
emacs     0.06
,status   0.06
_ACL      0.06
ネル      0.06
scopic    0.06
ması      0.06
.Math     0.06
pygame    0.06
تش        0.06
param     0.06
Activations Density 0.002%
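
For a sense of scale, the density figure can be read as the fraction of tokens on which the neuron fires. The sketch below assumes that definition (activation strictly above a threshold of zero); the function name, the threshold, and the sample data are hypothetical and are not taken from the dashboard's own pipeline.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a non-zero
    activation; the dashboard's exact definition is not stated in the source.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 2 active tokens out of 100,000.
acts = [0.0] * 99_998 + [0.7, 1.3]
density = activation_density(acts)
print(f"{density:.3%}")
```

Under this reading, a density of 0.002% corresponds to roughly 2 active tokens per 100,000, which is consistent with describing the neuron as essentially inactive.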