Explanations
words related to attractiveness or desirability
references to attractiveness in various contexts
Negative Logits
Token       Logit
othe        -0.78
cedented    -0.78
bel         -0.78
ignt        -0.72
iche        -0.71
jer         -0.71
FIN         -0.69
cham        -0.69
REL         -0.67
rel         -0.67
Positive Logits
Token            Logit
lure             0.93
attractive       0.87
proposition      0.87
Flavoring        0.80
targets          0.79
prospects        0.74
attractiveness   0.73
lihood           0.73
attract          0.73
enticing         0.73
Activations Density 0.023%
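
The density figure above is commonly read as the fraction of token positions on which the feature fires. The page does not define it, so the sketch below is only an illustration under that assumption: the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and not taken from this dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature is active.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 10,000 token positions, 2 of which activate the feature.
sample = [0.0] * 9998 + [1.7, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.023% would correspond to the feature firing on roughly 23 out of every 100,000 token positions.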