INDEX
Explanations
Comparisons and contrasts
This neuron detects discourse-level transitional or contrastive adverbs and cue words (e.g., “instead,” “contrary,” “however”) that signal shifts or contrasts in the text.
Negative Logits
ecessary     -0.07
assel        -0.07
abe          -0.07
owing        -0.06
搜           -0.06
âm           -0.06
ollower      -0.06
.resize      -0.06
ेब           -0.06
overtime     -0.06
Positive Logits
desperation  0.06
record       0.06
ющей         0.06
primaryKey   0.06
ाखण          0.06
=target      0.06
.char        0.06
CPI          0.06
imagined     0.06
=>'          0.06
Activations Density 0.114%
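
The density figure can be read as the share of token positions on which the neuron fires at all. The page does not state how it computes the value, so the sketch below is only one plausible reading: it assumes "fires" means an activation strictly above a threshold of zero, and both the function name `activation_density` and the sample data are hypothetical, chosen for illustration rather than taken from this neuron's actual activations.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a position counts as active when its activation is strictly
    greater than `threshold` (zero by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 100,000 token positions, 114 of them active.
sample = [0.0] * 99_886 + [0.5] * 114
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.114% would mean the neuron is active on roughly one token position in every 880.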