INDEX
NEGATIVE LOGITS
viktigt      -0.08
природы      -0.08
textured     -0.07
tai          -0.07
olsun        -0.07
:u           -0.07
nautical     -0.07
valent       -0.07
spaced       -0.07
commanded    -0.07
POSITIVE LOGITS
unjust               0.11
disproportionately   0.11
khiến                0.10
allegedly            0.10
wrongly              0.10
unfair               0.10
alleges              0.10
竟                   0.09
导致                 0.09
incorrectly          0.09
Activations Density: 0.164%