Explanations
content related to equality and recruitment policies
Negative Logits
Token       Logit
itespace    -0.16
CJK         -0.15
aco         -0.15
ën          -0.14
rowth       -0.14
ounge       -0.14
IODevice    -0.14
पर          -0.14
plates      -0.14
addock      -0.13
Positive Logits
Token       Logit
gender      0.28
Gender      0.27
pay         0.25
Equality    0.24
Equal       0.24
equal       0.23
equality    0.23
Gender      0.23
Pay         0.22
gender      0.22
Activations Density: 0.007%