Explanations
people's occupations or professions
statements about individuals who hold leadership or professional roles
Negative Logits
Russians      -0.72
routed        -0.71
humiliating   -0.70
favour        -0.70
degrading     -0.64
ulus          -0.64
unbeliev      -0.63
forcibly      -0.63
bandwagon     -0.62
ppings        -0.62
Positive Logits
blogging      0.83
rint          0.75
aspers        0.70
owship        0.69
interstitial  0.69
tml           0.68
director      0.68
lawy          0.67
inical        0.67
arcer         0.66
Activations Density 0.368%