INDEX
NEGATIVE LOGITS
хочется  0.58
reszt  0.52
anggil  0.43
stessi  0.42
allerlei  0.42
तमाम  0.42
कायदा  0.42
Plenty  0.41
Ende  0.40
ردم  0.40
POSITIVE LOGITS
statement  0.90
correctly  0.89
statements  0.86
correct  0.84
CORRECT  0.83
doğrud  0.79
下列  0.77
正确  0.74
कथन  0.74
corretamente  0.73
Activations Density 0.032%