INDEX
Explanations
the term "Best" in various contexts related to rankings or evaluations
Negative Logits
IGHTS      -0.74
gypt       -0.73
heter      -0.70
probing    -0.69
chy        -0.67
mble       -0.64
vor        -0.63
bryce      -0.62
pher       -0.61
gy         -0.61
Positive Logits
iary       1.14
seller     1.13
iaries     1.04
Practices  0.93
sell       0.89
selling    0.86
Selling    0.85
ow         0.81
Friend     0.78
owing      0.78
Activations Density: 0.021%