INDEX
Explanations
emotional responses and expressions related to personal feelings
Negative Logits
Token          Logit
regor          -0.18
ALSE           -0.16
-ни            -0.15
Flem           -0.15
amespace       -0.15
MOTE           -0.15
oras           -0.14
discrepan      -0.14
ivement        -0.14
.raises        -0.14
Positive Logits
Token          Logit
aware          0.27
feel           0.23
aware          0.23
proud          0.21
uncomfortable  0.20
Aware          0.19
comfortable    0.18
sad            0.17
ako            0.17
gig            0.17
Activations Density: 0.036%