Explanations
words related to personal preferences
references to personal preferences and choices
Negative Logits
bane          -0.79
mans          -0.76
wordpress     -0.75
mberg         -0.74
kj            -0.73
Interstitial  -0.72
icles         -0.72
xx            -0.70
ãĤ¤ãĥĪ        -0.70
angel         -0.68
Positive Logits
preferences   1.00
favoring      0.94
preference    0.87
eering        0.84
yip           0.84
pane          0.80
ļéĨĴ          0.79
elig          0.79
favoured      0.78
selection     0.73
Activations Density: 0.019%