Explanations
The neuron activates on contrastive or adversative discourse cues, such as “but,” “however,” and “also,” that signal a shift or qualification in the narrative.
Negative Logits
ffield       -0.07
whiteColor   -0.06
eldo         -0.06
err          -0.06
_NT          -0.06
_SHADOW      -0.06
mensaje      -0.06
izontal      -0.06
Eph          -0.06
provided     -0.06
Positive Logits
but          0.08
BUT          0.07
ROOM         0.06
minimizing   0.06
Hungarian    0.06
Room         0.06
encing       0.06
situations   0.06
Species      0.06
@FindBy      0.06
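For readers unfamiliar with logit lists like the ones above, the sketch below shows one common way such rankings are produced: the neuron's output vector is dotted with each token's unembedding vector, and the tokens with the largest and smallest results are listed. This is an assumption about how the lists were computed, not something the page states. The function names, the two-dimensional vectors, and the toy vocabulary are all hypothetical and chosen only for illustration.

```python
def logit_effects(w_out, unembed):
    """Dot the neuron's output vector with each token's unembedding vector.

    w_out:   list of floats (hypothetical neuron output direction)
    unembed: dict mapping token -> list of floats (hypothetical unembedding rows)
    """
    return {
        token: sum(w * u for w, u in zip(w_out, vec))
        for token, vec in unembed.items()
    }


def top_and_bottom(effects, k):
    """Return the k most promoted and k most suppressed tokens."""
    ranked = sorted(effects.items(), key=lambda kv: kv[1], reverse=True)
    positive = ranked[:k]
    negative = ranked[-k:][::-1]  # most negative first
    return positive, negative


# Toy values only; these are NOT the model's real weights.
w_out = [1.0, 0.0]
unembed = {
    "but": [0.08, 0.5],
    "ROOM": [0.06, -0.2],
    "err": [-0.06, 0.3],
    "ffield": [-0.07, 0.1],
}

effects = logit_effects(w_out, unembed)
positive, negative = top_and_bottom(effects, k=2)
print("positive:", [tok for tok, _ in positive])
print("negative:", [tok for tok, _ in negative])
```

With real weights the vocabulary would contain tens of thousands of tokens, which is why small-magnitude entries such as "ROOM" or "@FindBy" can appear alongside the semantically related "but".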
Activations Density 0.119%
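As a rough illustration of what a density figure like this might measure, the sketch below computes the fraction of tokens on which a neuron's activation exceeds a threshold. Treating "active" as "activation greater than zero" is an assumption, as are the function name `activation_density` and the toy activation list; the page itself does not define the metric.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    `threshold=0.0` is an assumed definition of "active".
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data: 10,000 token positions, 12 of which activate the neuron.
toy_activations = [0.0] * 9988 + [1.5] * 12
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

A value near 0.1% means the neuron is silent on the vast majority of tokens, which fits a feature tied to a small set of discourse markers.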