NEGATIVE LOGITS
Token       Logit
our         -1.84
(           -1.20
but         -1.13
like        -1.09
one         -1.03
we          -1.02
then        -1.01
āju         -0.97
any         -0.94
that        -0.94
POSITIVE LOGITS
Token       Logit
LÄ          1.26
Válasz      1.23
eventuell   1.16
⎼           1.15
Hogyan      1.11
ѝ           1.11
hakkında    1.08
Hozzá       1.06
flä         1.05
Cannot      1.05
Activations Density: 0.001%