INDEX
Explanations
emotional reactions and responses
expressions of strong negative emotions and reactions
Negative Logits
Token               Logit
vernment            -0.74
vre                 -0.71
natureconservancy   -0.70
Flavoring           -0.69
imum                -0.68
ramids              -0.68
penet               -0.68
Quant               -0.67
UA                  -0.66
iatures             -0.66
Positive Logits
Token               Logit
saddened            1.08
disappointed        0.95
disillusion         0.93
frustrated          0.92
embarrassed         0.92
dismay              0.91
humiliated          0.87
angry               0.87
appalled            0.87
nervous             0.86
Activations Density 0.393%