NEGATIVE LOGITS
INTERESAR   -0.62
Vikipedi    -0.61
Оно         -0.56
Himself     -0.49
LEncoder    -0.48
cùng        -0.48
собі        -0.47
myself      -0.46
arith       -0.45
parlé       -0.45
POSITIVE LOGITS
the         0.96
how         0.93
whether     0.90
much        0.69
most        0.69
MUCH        0.68
much        0.66
многое      0.66
what        0.64
alot        0.63
Activations Density: 0.002%