Explanations
This neuron fires on sentence- or clause-level discourse markers and transitional words (e.g. “However,” “But,” “Upon,” “Since,” “Then,” “A”) that typically appear at the start of a new sentence or clause.
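As a rough illustration of the pattern described above, the following sketch flags words from the quoted example list when they open a sentence or clause. It is only a textual proxy under stated assumptions: the `MARKERS` set, the function name `clause_initial_markers`, and the punctuation-based boundary rule are all hypothetical choices made for this example, and nothing here reproduces how the neuron itself computes its activation.

```python
import re

# Markers quoted in the explanation; illustrative only, not an exhaustive list.
MARKERS = {"However", "But", "Upon", "Since", "Then", "A"}

def clause_initial_markers(text):
    """Return (char_offset, word) pairs for listed markers that open a sentence or clause.

    Assumption: a word is clause-initial if it starts the text or directly
    follows one of . ! ? ; : and a single whitespace character.
    """
    hits = []
    for m in re.finditer(r"(?:^|(?<=[.!?;:]\s))([A-Za-z]+)", text):
        word = m.group(1)
        if word in MARKERS:
            hits.append((m.start(1), word))
    return hits

sample = "It rained. However, we left. Then it stopped; but we stayed."
hits = clause_initial_markers(sample)
```

The match is case-sensitive, so the lowercase "but" after the semicolon in the sample is not flagged. Comparing positions found this way against the neuron's top-activating tokens would be one way to check how well the explanation fits.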
Negative Logits
probably        -0.07
psychologists   -0.07
افزایش          -0.07
embroidered     -0.07
itions          -0.07
practically     -0.06
Mandarin        -0.06
durability      -0.06
-mails          -0.06
misuse          -0.06
Positive Logits
lex             0.07
_Osc            0.07
.raw            0.06
теч             0.06
bron            0.06
razione         0.06
seus            0.06
enção           0.06
้น              0.06
avere           0.06
Activations Density 0.072%