Explanations
expressions of desire or preference
phrases expressing a desire or wish
Negative Logits
furt       -0.84
orno       -0.82
ively      -0.79
ById       -0.70
abiding    -0.70
subur      -0.65
jah        -0.65
ishly      -0.63
packing    -0.62
utical     -0.62
Positive Logits
hear           1.17
see            1.16
emulate        0.99
clarify        0.95
revisit        0.94
receive        0.94
collaborate    0.93
contribute     0.93
incorporate    0.91
recreate       0.90
Activations Density: 0.086%