INDEX
Explanations
phrases related to personal or individual choices and inclinations
references to individual or collective preferences
Negative Logits
bane          -0.75
kj            -0.73
Breaking      -0.71
CONCLUS       -0.69
angers        -0.68
amaz          -0.68
wordpress     -0.67
Continental   -0.67
ãĥĥãĥī        -0.67
bleacher      -0.67
Positive Logits
preferences   1.08
yip           0.97
preference    0.95
favoring      0.88
elig          0.81
favoured      0.77
eering        0.76
palate        0.75
selection     0.73
pane          0.73
Activations Density: 0.012%