INDEX
Explanations
references to awards and accolades
Negative Logits
Token        Logit
isay         -0.17
ults         -0.16
pone         -0.15
äll          -0.15
éis          -0.15
ickness      -0.14
kiye         -0.14
aji          -0.14
ÏĨÏħ         -0.13
ento         -0.13
Positive Logits
Token        Logit
winning      0.52
-winning     0.49
win          0.40
Winning      0.39
win          0.30
Win          0.28
WIN          0.28
winner       0.27
winners      0.26
WIN          0.26
Activations Density 0.011%