Explanations
This neuron activates on contrastive discourse markers—especially the adversative connector “however.”
Negative Logits
цо          -0.07
的心        -0.06
.Post       -0.06
Meet        -0.06
Density     -0.06
ih          -0.06
_VARIABLE   -0.06
Ged         -0.06
cai         -0.06
Bethesda    -0.06
Positive Logits
však        0.07
_unix       0.06
lius        0.06
λικ         0.06
zelf        0.06
.navCtrl    0.06
Alic        0.06
ysql        0.06
usted       0.06
udden       0.06
Activations Density 0.025%