Explanations
descriptive adjectives, particularly the word "pretty"
Negative Logits
Token       Logit
ooth        -0.17
odelist     -0.17
asca        -0.16
δρα         -0.16
vary        -0.15
да          -0.14
нт          -0.14
principal   -0.14
eric        -0.14
hatt        -0.14
Positive Logits
Token       Logit
»           0.16
-таки       0.15
ayne        0.15
izm         0.14
ums         0.14
ve          0.13
andon       0.13
EFR         0.13
lish        0.13
angelo      0.13
Activations Density 0.014%