INDEX
Explanations
terms related to winning or victories
Negative Logits
olu      -0.16
cia      -0.15
leh      -0.14
AMAGE    -0.14
yll      -0.14
360      -0.14
pace     -0.14
luž      -0.13
Ñī       -0.13
ujet     -0.13
Positive Logits
nable    0.24
-win     0.20
ced      0.19
ess      0.18
throp    0.17
NF       0.16
-loss    0.16
/win     0.16
battles  0.16
now      0.15
Activations Density: 0.086%