Explanations
phrases related to wanting or desiring something
terms related to demand or desirability
Negative Logits
Token    Logit
STON     -0.70
ny       -0.70
utive    -0.67
è£       -0.67
mir      -0.65
Te       -0.63
brew     -0.62
dist     -0.62
doi      -0.62
eq       -0.62
Positive Logits
Token          Logit
anted          1.20
ants           0.76
terday         0.76
zsche          0.75
ishly          0.73
okemon         0.71
anamo          0.71
umar           0.70
explan         0.70
ratulations    0.69
Activations Density 0.012%