Explanations
sports teams and their matchups
Negative Logits
inho     -0.17
onio     -0.17
oni      -0.15
ον       -0.15
illon    -0.15
eras     -0.15
inha     -0.15
utto     -0.14
hoe      -0.14
ittle    -0.14
Positive Logits
izu          0.16
away         0.15
peng         0.14
Bye          0.14
erva         0.14
against      0.14
ENCHMARK     0.14
directly     0.14
ca           0.13
arning       0.13
Activations Density: 0.009%