INDEX
Explanations
references to baseball teams and their historical contexts
Negative Logits
اتÛĮ    -0.16
appro    -0.16
att    -0.15
çε    -0.15
Venez    -0.14
eff    -0.14
ximo    -0.14
anno    -0.14
nuest    -0.14
áv    -0.14
Positive Logits
Ãł    0.22
itz    0.22
isme    0.21
è    0.19
ix    0.19
frau    0.18
hist    0.18
.ll    0.18
ll    0.18
ament    0.18
Activations Density 0.047%