INDEX
Explanations
negative punctuation or symbols indicating strong disapproval
NEGATIVE LOGITS
ungan          -0.16
rish           -0.15
Mali           -0.15
.Metro         -0.14
XObject        -0.14
crossorigin    -0.14
noc            -0.14
.jet           -0.14
;br            -0.14
ousel          -0.13
POSITIVE LOGITS
Watches        0.27
watches        0.26
watch          0.23
-watch         0.22
Hy             0.21
wearer         0.21
watch          0.20
_watch         0.20
Basel          0.20
Hy             0.20
Activations Density 0.000%