Negative Logits
ummer     -0.10
Watt      -0.10
vine      -0.09
ela       -0.09
exels     -0.09
pong      -0.09
/power    -0.09
rap       -0.09
昭和      -0.08
ADM       -0.08
Positive Logits
behind        0.18
responsible   0.15
milit         0.14
ในการ         0.13
Behind        0.13
why           0.13
contrib       0.13
Contrib       0.12
etermin       0.11
driving       0.11
Activations Density: 0.055%