INDEX
Explanations
words that express beauty or aesthetics
Negative Logits
beauty       -0.18
ativ         -0.17
beaut        -0.17
Beauty       -0.17
Beauty       -0.16
美           -0.16
Beaut        -0.15
isure        -0.15
ette         -0.15
greatness    -0.15
Positive Logits
lest         0.33
mente        0.20
ly           0.20
ness         0.18
-looking     0.18
zza          0.17
llll         0.15
ترین         0.15
.timing      0.15
(er          0.15
Activations Density 0.078%