INDEX
Explanations
strong negative emotional reactions such as horror or being appalled
emotional reactions of shock or disapproval
Negative Logits
opio       -0.77
ramid      -0.72
amins      -0.66
eworks     -0.64
shortened  -0.64
pmwiki     -0.63
iggurat    -0.63
rha        -0.63
vati       -0.62
impro      -0.61
Positive Logits
ingly      0.96
ĸļ         0.81
aback      0.77
exclaim    0.76
urous      0.75
sbm        0.72
ãĤ¦ãĤ¹     0.70
lys        0.69
disbelief  0.68
amaz       0.67
Activations Density 0.089%
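
Two of the positive-logit tokens (`ĸļ` and `ãĤ¦ãĤ¹`) look like raw byte-level BPE vocabulary strings rather than readable text. The sketch below assumes the tokenizer uses the GPT-2 style byte-to-unicode mapping, which this page does not confirm; `decode_token` is a hypothetical helper name, not part of any dashboard API. Under that assumption, it maps each displayed character back to its byte and decodes the result as UTF-8.

```python
def bytes_to_unicode() -> dict[int, str]:
    """GPT-2 style table mapping each byte (0-255) to a printable character."""
    # Bytes that are already printable keep their own code point.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Remaining bytes (control characters, space, etc.) are shifted to 256+.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: turn a vocabulary string back into readable text.

    Assumes GPT-2 style byte-level BPE. Byte sequences that are not valid
    UTF-8 on their own are rendered with the U+FFFD replacement character.
    """
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĤ¦ãĤ¹", "ĸļ", "aback"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

If the assumption holds, `ãĤ¦ãĤ¹` corresponds to the katakana string `ウス`, while `ĸļ` is a fragment of a multi-byte character and has no standalone reading. Plain ASCII tokens such as `aback` pass through unchanged.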