INDEX
Explanations
dialogue
This neuron detects occurrences of the second-person pronoun ("you"), especially in constructions that frame rhetorical or admonishing questions.
Negative Logits
ToolTip      -0.07
Integral     -0.07
entrance     -0.07
项           -0.06
roups        -0.06
align        -0.06
})↵↵         -0.06
signifies    -0.06
Modal        -0.06
terrific     -0.06
Positive Logits
adını        0.06
soukrom      0.06
.ipv         0.06
leet         0.06
ิม           0.06
çı           0.06
_adv         0.06
Kaf          0.06
_favorite    0.06
orca         0.06
Activations Density: 0.040%