INDEX
Explanations
words related to description of appearance or situations
phrases indicating a superficial or first impression evaluation
Negative Logits
Token         Logit
anwhile       -0.66
elsen         -0.63
icer          -0.62
artney        -0.61
cellaneous    -0.60
umbn          -0.60
rench         -0.60
die           -0.60
gypt          -0.60
rus           -0.60
Positive Logits
Token         Logit
speaking      0.86
glance        0.77
it            0.70
terms         0.65
yes           0.65
anyway        0.65
this          0.62
speaking      0.60
blush         0.60
sounding      0.58
Activations Density: 0.119%
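
The density figure can be read as the share of token positions on which the feature is active. The sketch below illustrates that reading only; it assumes "active" means an activation strictly above zero, and the function name `activation_density` and the sample data are hypothetical, not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a feature counts as "active" when its activation is
    strictly greater than `threshold` (0.0 by default).
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 10,000 token positions, 12 of which are active.
acts = [0.0] * 10_000
for i in range(12):
    acts[i * 800] = 1.5  # arbitrary nonzero activation value

density = activation_density(acts)
print(f"{density:.3%}")
```

A density on the order of 0.1%, as reported here, would mean the feature fires on roughly one token position in a thousand.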