Negative Logits
Token            Logit
тивным           0.83
unwillingness    0.76
olischen         0.73
ாள்              0.71
rait             0.70
を与             0.69
requires         0.68
を与える         0.68
жет              0.67
atschapp         0.67
Positive Logits
Token      Logit
have       1.44
want       1.28
are        1.26
have       1.21
cannot     1.20
believe    1.19
έχουν      1.18
têm        1.18
aren       1.17
tengo      1.16
Activations Density 0.199%