Explanations
words describing shyness or timid behavior
Negative Logits
rant    -0.15
olf     -0.15
FG      -0.15
adh     -0.15
aze     -0.14
ijn     -0.14
onya    -0.14
Pad     -0.14
ffee    -0.13
snag    -0.13
Positive Logits
itness  0.16
äºİ     0.16
ispers  0.15
äºİ     0.15
æĸ¼     0.15
NESS    0.15
à¹Ĩ     0.14
perty   0.14
zeitig  0.14
takson  0.14
Activations Density 0.007%