INDEX
Explanations
words related to baseball such as 'bat'
references to bats
Negative Logits
Token       Logit
ãĥ´ãĤ¡      -0.97
xia         -0.77
FINE        -0.75
zanne       -0.71
Ô           -0.65
eous        -0.65
Publisher   -0.65
vre         -0.65
oppable     -0.63
éĸ          -0.63
Positive Logits
Token       Logit
tered       0.90
bat         0.80
manship     0.78
bat         0.77
chers       0.76
ista        0.75
oche        0.75
bats        0.74
hit         0.73
cher        0.72
Activations Density: 0.012%