INDEX
Explanations
references to competitive sports achievements
NEGATIVE LOGITS
Başkan    -0.16
Dot       -0.14
Freed     -0.13
ivet      -0.13
trap      -0.13
ometr     -0.13
ャ        -0.13
icast     -0.13
伴        -0.13
_tail     -0.13
POSITIVE LOGITS
win       0.32
win       0.32
winning   0.30
won       0.28
won       0.28
clin      0.27
Win       0.24
wins      0.24
.win      0.24
(win      0.23
Activations Density 0.071%