INDEX
Negative Logits
ленный         -0.08
.Check         -0.08
breakout       -0.08
tục            -0.08
stan           -0.07
Twist          -0.07
teb            -0.07
/Form          -0.07
bericht        -0.07
규             -0.07
Positive Logits
Neighborhood   0.09
abol           0.08
lieben         0.08
власти         0.08
Neighborhood   0.08
igh            0.08
neigh          0.08
Neighbour      0.07
neighborhoods  0.07
противоп       0.07
Activations Density 0.001%