INDEX
Explanations
phrases expressing preference or inclination
Negative Logits
Token      Logit
754        -0.16
ady        -0.15
seau       -0.15
sein       -0.14
oui        -0.14
912        -0.14
finity     -0.14
oro        -0.14
VERR       -0.14
izu        -0.13
Positive Logits
Token        Logit
to           0.16
igr          0.15
(exports     0.15
èIJ          0.14
utenberg     0.14
assistance   0.14
ardon        0.13
ToAdd        0.13
ordin        0.13
igious       0.13
Activations Density: 0.014%