INDEX
Explanations
expressions of sadness or related emotional themes
Negative Logits
Token       Logit
ermen       -0.16
ermo        -0.15
727         -0.15
chure       -0.14
else        -0.14
implify     -0.14
ilename     -0.14
endon       -0.14
strap       -0.14
ansson      -0.14
Positive Logits
Token       Logit
dest        0.30
omas        0.27
istic       0.25
hana        0.23
istically   0.23
дам         0.20
-faced      0.20
hu          0.19
ler         0.18
lier        0.18
Activations Density: 0.011%