INDEX
Explanations
This neuron activates on the conjunction “but” in its contrastive or adversative uses.
Negative Logits
cds        -0.06
UNIQUE     -0.06
.dirname   -0.06
�除        -0.06
PLUS       -0.06
.life      -0.06
caller     -0.06
All        -0.06
undaki     -0.06
їх         -0.06
Positive Logits
cur          0.06
istar        0.06
issippi      0.06
thuốc        0.06
/books       0.06
Coffee       0.06
istic        0.06
%",↵         0.06
Literature   0.06
riculum      0.06
Activations Density 0.002%