Explanations
phrases indicating sports performance and predictions
Negative Logits
onta     -0.18
inator   -0.18
otts     -0.16
awe      -0.15
ot       -0.15
ottes    -0.14
isman    -0.14
otten    -0.14
hil      -0.14
nóng     -0.14
Positive Logits
icher    0.15
Keys     0.15
SND      0.14
dac      0.14
их       0.13
ibel     0.13
AES      0.13
against  0.13
Kew      0.13
ikan     0.13
Activations Density 0.008%