INDEX
Explanations
words related to negative emotions, particularly focusing on the feeling of anger
expressions of anger
Negative Logits
Token        Logit
Operator     -0.69
Vide         -0.69
Wonders      -0.66
artifacts    -0.66
Places       -0.65
snug         -0.63
livest       -0.62
MER          -0.62
ramer        -0.61
amins        -0.61
Positive Logits
Token        Logit
fulness      1.03
ful          0.95
rained       0.94
FUL          0.88
ingly        0.86
wart         0.85
fully        0.82
lessness     0.82
lessly       0.82
anger        0.80
Activations Density 0.046%
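
For a sense of scale, a density of 0.046% corresponds to roughly 46 active token positions per 100,000. The sketch below shows one plausible way such a figure could be computed, assuming density means the fraction of token positions where the feature's activation is above zero. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical illustrations, not the dashboard's actual method or data.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of positions where the feature fires.
    The threshold of 0.0 is an illustrative choice, not taken from the source.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 46 active positions out of 100,000 tokens.
sample = [1.5] * 46 + [0.0] * 99_954
density = activation_density(sample)
print(f"{density:.3%}")
```

A feature this sparse fires on fewer than one token in two thousand, so any estimate of its density needs a large token sample to be stable.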