Explanations
themes related to social justice and equity issues
Negative Logits
Priv        -0.15
ware        -0.14
Ware        -0.14
kova        -0.14
Priv        -0.14
Nickname    -0.14
å¯          -0.13
inois       -0.13
Eins        -0.13
ÏĮγ         -0.13
Positive Logits
economic         0.26
opportunity      0.23
opportunities    0.22
escape           0.21
independent      0.21
dign             0.20
independence     0.20
ladder           0.20
Economic         0.20
independ         0.20
Activations Density: 0.192%