Explanations
proper names, specifically player names in a sports context
Negative Logits
éĢ         -0.15
Engineers  -0.14
Ginger     -0.14
adar       -0.14
ittel      -0.13
JUnit      -0.13
ipt        -0.13
kart       -0.13
taire      -0.13
agua       -0.13
Positive Logits
erot       0.15
çĴĥ        0.15
orsch      0.14
è°±        0.14
elig       0.14
burdens    0.14
jed        0.14
.hs        0.14
XR         0.14
Bias       0.13
Activations Density: 0.251%