Explanations
The neuron detects discourse markers and transitional words or phrases (e.g. “for example,” “while,” “if,” “only”) that signal shifts, contrasts, or examples in the text.
Negative Logits
tightening    -0.06
místo         -0.06
Т             -0.06
ihu           -0.05
ाइम           -0.05
дан           -0.05
人            -0.05
비스          -0.05
.magic        -0.05
両            -0.05
Positive Logits
.urlencoded   0.07
.bg           0.07
_CLEAR        0.06
существ       0.06
ops           0.06
tecn          0.06
transpose     0.06
něm           0.06
'],↵          0.06
878           0.06
Activations Density: 0.227%