Explanations
The neuron activates strongly on marketing-style endorsement language, i.e. first-person product sponsorship or promotional hype (words like “opportunity,” “have,” brand names, and verbs of praise).
Negative Logits
pute             -0.07
.ErrorMessage    -0.07
ervlet           -0.07
insiders         -0.06
r                -0.06
creature         -0.06
granny           -0.06
.filePath        -0.06
.protobuf        -0.06
рев              -0.06
Positive Logits
cerpt            0.07
.")↵             0.07
extended         0.07
korun            0.06
("               0.06
weiß             0.06
Auf              0.06
'))↵↵            0.06
momentum         0.06
dept             0.06
Activations Density 0.100%