INDEX
Explanations
narrations
The neuron fires on words and short phrases that express involuntary feelings or sensations (e.g. “couldn’t help but feel a sense of excitement/anticipation”).
Negative Logits
blur              -0.07
acompan           -0.07
foobar            -0.07
کر                -0.06
principalColumn   -0.06
Because           -0.06
อนไลน             -0.06
misogyn           -0.06
возникает         -0.06
anomal            -0.06
Positive Logits
entertained       0.06
CLE               0.06
parated           0.06
.netty            0.06
Options           0.06
captures          0.06
رت                0.06
Aleks             0.06
(dict             0.06
dominate          0.06
Activations Density 0.079%
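
As a rough illustration of how a density figure like the one above could be computed, the sketch below assumes that "activation density" means the fraction of tokens on which the neuron's activation exceeds zero. That definition, the function name `activation_density`, and the sample data are assumptions for illustration only; they are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition: a token counts as "active" when its activation
    is strictly greater than the threshold (zero by default).
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 100,000 tokens, of which 79 have a nonzero activation.
acts = [0.0] * 100_000
for i in range(79):
    acts[i * 1000] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")
```

A density this low would mean the neuron is silent on the vast majority of tokens, which fits a feature tied to a narrow class of phrases.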