INDEX
Explanations
words related to personal preferences or choices
instances of the word "favorites" along with related discussions about preferences
Negative Logits
ynthesis    -0.75
yre         -0.70
rogram      -0.65
ynt         -0.63
ploy        -0.63
Act         -0.62
yan         -0.61
orous       -0.60
EMENT       -0.59
aum         -0.59
Positive Logits
favorites   3.90
favourites  3.40
contenders  1.61
Favor       1.58
classics    1.55
staples     1.42
selections  1.34
originals   1.33
finalists   1.31
favorite    1.31
Activations Density 0.027%
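
The density figure above is not defined on this page. A common reading, assumed here, is the fraction of sampled tokens on which the feature fires at all. The sketch below illustrates that reading on made-up data; the function name `activation_density`, the `threshold` parameter, and the toy activations are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (above-threshold) activation. The dashboard may define it differently.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 3 active tokens out of 10,000 sampled tokens.
acts = [0.0] * 9997 + [1.2, 0.4, 3.1]
density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.030%
```

Under this reading, a density of 0.027% would correspond to roughly 27 active tokens per 100,000 sampled.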