Explanations
words related to negative attitudes or prejudices, particularly relating to specific groups or characteristics
terms related to various forms of phobia and intolerance
Negative Logits
MER          -0.85
aver         -0.82
unes         -0.78
upper        -0.76
allo         -0.73
selection    -0.71
mel          -0.71
arist        -0.71
aple         -0.70
selected     -0.70
Positive Logits
ophobia      1.31
ophobic      1.22
obia         1.04
ï¸           1.00
ophob        0.99
obic         0.94
osate        0.84
prejudice    0.81
bigotry      0.81
homophobia   0.79
Activations Density 0.012%