INDEX
Explanations
expressions related to emotional engagement and reactions
Negative Logits
Token        Logit
acher        -0.17
negot        -0.15
.Startup     -0.15
hec          -0.15
漫           -0.14
ór           -0.14
adamente     -0.14
znam         -0.14
Rak          -0.14
egers        -0.13
Positive Logits
Token        Logit
awe          0.33
speech       0.29
impressed    0.28
blown        0.28
spell        0.27
capt         0.25
Speech       0.25
fasc         0.25
wonder       0.24
Speech       0.24
Activations Density: 0.361%