INDEX
Negative Logits
hesitation    -0.08
dbg           -0.08
Disp          -0.07
तार           -0.07
homeowners    -0.07
'assurance    -0.07
dissolved     -0.07
outward       -0.07
anglers       -0.07
দ্র            -0.07
Positive Logits
Orwell          0.13
oppressive      0.11
dyst            0.10
ressent         0.09
authoritarian   0.09
习近平          0.08
生产            0.08
inux            0.08
oppression      0.08
conspiracy      0.08
Activations Density: 0.003%