INDEX
Explanations
comparative terms expressing superiority
phrases indicating the concept of "best" or optimal choices
Negative Logits
idon        -0.74
onut        -0.73
Dru         -0.69
chy         -0.67
Reloaded    -0.66
Emer        -0.65
Smy         -0.62
Manson      -0.62
probing     -0.62
oidal       -0.62
Positive Logits
seller      1.14
iary        1.06
iaries      1.03
suited      1.00
sell        0.97
ow          0.94
imates      0.87
rade        0.86
ower        0.84
hest        0.79
Activations Density: 0.039%