INDEX
Explanations
words related to negative or unpleasant situations
the prefix "uns," indicating negation or absence
Negative Logits
Token       Logit
SHIP        -0.75
Grayson     -0.70
stanbul     -0.68
zzo         -0.68
Mercury     -0.67
OPLE        -0.67
*/(         -0.66
tsky        -0.66
Guardians   -0.66
anwhile     -0.63
Positive Logits
Token       Logit
uns         0.92
avour       0.90
rep         0.89
apon        0.81
oci         0.80
uitive      0.80
conv        0.79
heat        0.79
concess     0.78
alted       0.76
Activations Density: 0.006%