INDEX
Explanations
words that express a positive sentiment or quality
Negative Logits
Token        Logit
myſelf       -0.78
defences     -0.76
InitVars     -0.75
aback        -0.74
inflater     -0.74
Ostat        -0.70
liesslich    -0.70
Thine        -0.70
loisirs      -0.69
tyres        -0.68
Positive Logits
Token        Logit
wonderful    1.76
Wonderful    1.67
wonderful    1.64
Wonderful    1.62
marvelous    1.22
WONDER       1.19
wondrous     1.16
wonderfully  1.10
merveilleux  1.07
maravilloso  1.03
Activations Density: 0.055%
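To make the density figure concrete, here is a minimal sketch. It assumes that "density" means the fraction of sampled token positions on which the feature's activation is nonzero; the page does not define the term, so that reading is an assumption. The function name activation_density and the sample data are hypothetical, chosen only so the arithmetic lands on the reported value.

```python
def activation_density(activations):
    """Return the fraction of positions with a nonzero (positive) activation.

    Assumption: density = active positions / total positions sampled.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > 0)
    return active / len(activations)


# Hypothetical sample: 11 active positions out of 20,000 tokens.
sample = [0.0] * 19_989 + [1.3] * 11
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.055%"
```

Under this reading, a density of 0.055% would mean the feature fires on roughly 1 in 1,800 tokens, which fits a feature tied to a narrow family of words.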