INDEX
Explanations
negative associations and societal expectations related to various topics including stigma, judgment, sexism, and body norms
Negative Logits
ptroller          -0.84
Launcher          -0.76
Laun              -0.69
TPPStreamerBot    -0.69
engine            -0.67
issions           -0.65
iago              -0.65
hire              -0.65
maxwell           -0.64
prototype         -0.63
Positive Logits
stereotypes       1.14
stigmat           1.10
stigma            1.10
stereotyp         1.10
notions           1.08
dehuman           1.05
prejudices        1.02
stereotype        1.02
biases            1.02
ingrained         1.01
Activations Density 0.423%
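
The page does not define the density figure. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and only illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means active positions divided by all sampled positions.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 4 of which fire.
sample = [0.0] * 996 + [0.7, 1.2, 0.3, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that reading, a density of 0.423% would mean the feature fires on roughly 4 of every 1000 sampled tokens.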