Negative Logits
unaware         0.53
unable          0.51
aware           0.49
ただし          0.49
ることができる  0.48
associated      0.46
preventing      0.46
incap           0.46
procedural      0.46
determining     0.45
Positive Logits
है          0.77
deserves    0.75
είναι       0.73
выглядит    0.71
is          0.69
është       0.69
简直        0.68
är          0.67
是一个      0.66
seems       0.66
Activations Density 0.001%