Explanations
references to scores and scoring in various contexts
Negative Logits
Token       Logit
ekil        -0.19
orp         -0.18
ait         -0.16
ector       -0.16
ìĤ¬íķŃ      -0.15
uu          -0.15
ENCHMARK    -0.15
lad         -0.15
oral        -0.15
ault        -0.15
Positive Logits
Token       Logit
card        0.21
cards       0.20
occo        0.19
boards      0.19
rupa        0.17
ycastle     0.17
-UA         0.16
board       0.16
agli        0.16
ุà¸ĩ        0.16
Activations Density 0.016%