Explanations
The neuron flags subjective or evaluative language—words that express emotion, opinion, or judgment.
Negative Logits
Token       Logit
(Profile    -0.06
आ           -0.06
erd         -0.06
enheim      -0.06
Vault       -0.06
oldValue    -0.06
workout     -0.06
Biz         -0.06
ISTORY      -0.06
Attend      -0.06
Positive Logits
Token       Logit
Т           0.07
anal        0.07
CONT        0.06
перест      0.06
串          0.06
belang      0.06
hurting     0.06
currencies  0.06
piş         0.06
pron        0.06
Activations Density 0.182%