Explanations
words that express beauty or describe aesthetically pleasing qualities
Negative Logits
Token        Logit
ad           -0.16
erty         -0.16
greatness    -0.15
ativ         -0.15
ette         -0.15
sexual       -0.14
isure        -0.14
ot           -0.14
uten         -0.13
illary       -0.13
Positive Logits
Token        Logit
lest         0.30
mente        0.20
ly           0.19
zza          0.17
.dense       0.16
emente       0.16
-looking     0.15
Äįem         0.15
ablish       0.15
azer         0.15
Activations Density 0.068%