INDEX
Explanations
names of baseball players and references to game scores
Negative Logits
бов       -0.17
lý        -0.16
izza      -0.15
_AF       -0.14
mis       -0.14
lich      -0.14
бот       -0.14
riteln    -0.14
ROTO      -0.14
pyx       -0.14
Positive Logits
ioned     0.16
ccione    0.15
Pom       0.15
eshire    0.15
Chew      0.14
INESS     0.14
ronic     0.14
odox      0.13
ado       0.13
ENCIL     0.13
Activations Density 0.000%