INDEX
NEGATIVE LOGITS
cittadini    0.50
viciss       0.47
ദ്ധതി        0.46
verursacht   0.46
waging       0.46
肪           0.46
verwenden    0.46
tentando     0.45
korišten     0.45
用於         0.44
POSITIVE LOGITS
6            0.76
7            0.76
8            0.75
3            0.72
5            0.72
1            0.71
2            0.71
4            0.65
two          0.57
three        0.57
Activations Density: 0.002%