Negative Logits
pare         -0.07
pří          -0.06
Displays     -0.06
fundament    -0.06
biểu         -0.06
(pl          -0.06
taş          -0.06
free         -0.06
bfs          -0.06
预           -0.06
Positive Logits
shrinking    0.07
shrink       0.07
roman        0.06
Shutdown     0.06
涨           0.06
hudeb        0.06
wik          0.06
ных          0.06
adjustments  0.06
suffers      0.06
Activations Density: 0.004%