Explanations
This neuron detects speaker-label tokens (role prefixes) in a chat transcript, such as “GPT:” and “NAME_2:”.
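As a rough illustration of what "speaker-label tokens" means here, the sketch below pulls role prefixes out of a transcript with a regular expression. The pattern, the function name `find_speaker_labels`, and the sample transcript are all hypothetical and chosen for illustration only; this is not how the neuron works, and the labels it responds to may differ from what this pattern matches.

```python
import re
from typing import List

# Assumed shape of a role prefix: an identifier at the start of a line,
# immediately followed by a colon (e.g. "GPT:" or "NAME_2:").
SPEAKER_LABEL = re.compile(r"^(?P<role>[A-Za-z][A-Za-z0-9_]*):", re.MULTILINE)


def find_speaker_labels(transcript: str) -> List[str]:
    """Return the role names that open each turn, in order of appearance."""
    return [match.group("role") for match in SPEAKER_LABEL.finditer(transcript)]


# Made-up transcript, not taken from the neuron's actual activations.
example = (
    "NAME_1: Hello there.\n"
    "GPT: Hi, how can I help?\n"
    "NAME_2: What time is it?"
)
labels = find_speaker_labels(example)
print(labels)
```

A line-anchored pattern is used so that colons in the middle of a turn are not mistaken for role prefixes.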
Negative Logits
독         -0.06
.Popup     -0.06
urry       -0.06
�          -0.06
inhal      -0.06
ENCH       -0.06
itemid     -0.06
itm        -0.05
survey     -0.05
unsafe     -0.05
Positive Logits
Ticaret        0.07
yytype         0.07
()"↵           0.07
Initialize     0.07
APON           0.07
│              0.06
becue          0.06
yaklaşık       0.06
.contentSize   0.06
##↵            0.06
Activations Density: 0.002%