INDEX
Explanations
expressions of sentiment and personal qualities
Negative Logits
Token         Logit
emales        -0.17
fury          -0.16
Fury          -0.15
guts          -0.15
alance        -0.14
agua          -0.14
rita          -0.14
jel           -0.14
ãģ¾ãĤĮ        -0.14
celebrated    -0.14
Positive Logits
Token         Logit
thrilling     0.17
thrilled      0.17
Pathfinder    0.17
thrill        0.16
indr          0.16
hourly        0.15
mute          0.15
rans          0.15
entrev        0.15
gay           0.14
Activations Density: 0.138%
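
A density figure like the one above is commonly read as the fraction of sampled token positions on which the feature has a nonzero activation. The sketch below shows that calculation under this assumption; the function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and are not taken from the dashboard or the tool that produced it.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which
    the feature fires at all. The source does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data, made up for illustration: 1000 positions, 2 of them active.
toy = [0.0] * 998 + [1.3, 0.4]
density = activation_density(toy)
print(f"{density:.3%}")
```

Under this reading, a density of 0.138% would mean the feature fires on roughly 1 to 2 of every 1000 sampled tokens.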