Explanations
News articles
This neuron activates on inclusive, first-person plural framing: phrases like “let’s” and “we,” along with similar collective calls or commentary in the text.
Negative Logits
724         -0.07
preamble    -0.06
表          -0.06
rims        -0.06
.fade       -0.06
blade       -0.06
pill        -0.06
.bc         -0.05
$c          -0.05
totals      -0.05
Positive Logits
Sit         0.07
Hin         0.07
vieux       0.07
Recogn      0.07
riz         0.07
πη          0.06
Ana         0.06
اهل         0.06
орон        0.06
.Adam       0.06
Activation Density: 0.160%
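
The page does not say how the density figure is computed. One common reading, assumed here, is the fraction of sampled tokens on which the neuron's activation exceeds a threshold (zero by default). The sketch below illustrates that reading only; the function name `activation_density` and the sample data are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which
    the neuron fires at all. The source page does not confirm this.
    """
    if not activations:
        return 0.0
    firing = sum(1 for a in activations if a > threshold)
    return firing / len(activations)


# Hypothetical sample: the neuron fires on 1 of 625 tokens.
sample = [0.0] * 624 + [1.3]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.160%"
```

Under this reading, a density of 0.160% would mean the neuron is active on roughly 1 in every 625 tokens, which is consistent with a sparse feature that responds to a narrow phrasing pattern.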