Negative Logits
brig          -0.17
CCR           -0.16
arness        -0.16
addslashes    -0.15
kou           -0.14
ĤŃ            -0.14
abus          -0.14
راÙĤ          -0.14
408           -0.13
ulus          -0.13
Positive Logits
izing         0.19
izations      0.17
izes          0.17
ist           0.17
iz            0.17
Oak           0.16
ize           0.16
ising         0.16
TY            0.16
sted          0.16
Activations Density 0.013%