INDEX
Explanations
instances of discrimination or unequal treatment based on personal characteristics
Pronouns followed by verbs
actors and their actions
Negative Logits
DockStyle        -0.72
ويكيميديا        -0.65
ghijklmnop       -0.65
MeasureSpec      -0.64
tslint           -0.64
labelledby       -0.64
ifrance          -0.61
Према            -0.61
utafitiHapana    -0.60
těte             -0.59
Positive Logits
dared            0.72
dares            0.61
too              0.60
allegedly        0.59
dare             0.58
Too              0.57
Too              0.56
perceived        0.53
too              0.53
supposedly       0.52
Activations Density 0.489%