NEGATIVE LOGITS
解説  0.44
容易  0.42
ಅಂಶ  0.42
Lik  0.41
inappropri  0.41
explan  0.41
tendencies  0.41
різні  0.40
strutt  0.39
unsuitable  0.39
POSITIVE LOGITS
überhaupt  0.77
вообще  0.69
should  0.66
भला  0.59
わざ  0.59
bother  0.57
Should  0.57
needed  0.57
siquiera  0.54
اصلا  0.54
Activations Density: 0.008%