INDEX
Explanations
words related to negative emotions, particularly hatred and disgust
negative emotions and contempt
Negative Logits
Token        Logit
Ivy          -0.65
Ivy          -0.64
yore         -0.61
inadvert     -0.60
Epi          -0.59
Epi          -0.57
blurb        -0.57
Stride       -0.55
serif        -0.54
popsic       -0.54
Positive Logits
Token        Logit
hatred       0.77
ormais       0.55
inoxidable   0.49
odeur        0.45
hated        0.45
inoxid       0.45
älskar       0.44
shocked      0.44
élastique    0.43
hetto        0.43
Activations Density: 0.007%