Explanations
dialogue
This neuron detects numbered placeholder tokens (e.g. “NAME_1”, “NAME_2”) used to label speakers or items.
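As a rough illustration of the kind of token described above, the following is a minimal sketch that finds numbered placeholders in a piece of text. The regular expression and the names `PLACEHOLDER_RE` and `find_placeholders` are assumptions made for this example; they are not taken from the neuron's data and only approximate what the neuron responds to.

```python
import re

# Hypothetical pattern: an uppercase label, an underscore, then an integer
# index (e.g. NAME_1, NAME_2). This is an assumption about the token shape,
# not a description of how the neuron itself computes its activation.
PLACEHOLDER_RE = re.compile(r"\b[A-Z]+_\d+\b")


def find_placeholders(text: str) -> list[str]:
    """Return numbered placeholder tokens in order of appearance."""
    return PLACEHOLDER_RE.findall(text)


sample = "NAME_1: Hi, how are you?\nNAME_2: Fine, thanks."
print(find_placeholders(sample))
```

A pattern like this could be used to pre-select text where the neuron would be expected to fire, which can then be compared against its recorded activations.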
Negative Logits
Token       Logit
IRC         -0.07
Authority   -0.07
EventBus    -0.07
nextState   -0.06
unusually   -0.06
応          -0.06
建立        -0.06
arlo        -0.06
.On         -0.06
pasado      -0.06
Positive Logits
Token       Logit
dwarf       0.06
Сан         0.06
medicine    0.06
diversion   0.06
=random     0.06
ırak        0.06
OLID        0.06
depressive  0.06
good        0.06
ΙΟ          0.06
Activations Density 0.005%