INDEX
Explanations
phrases containing the prefix "non-" followed by a hyphen and a word
references to non-white individuals or groups
Negative Logits
fallacy     -0.88
downfall    -0.87
mock        -0.82
dilemma     -0.81
ridicule    -0.81
folly       -0.81
undo        -0.80
disguise    -0.80
dossier     -0.78
caveat      -0.77
Positive Logits
citizens    1.55
white       1.45
members     1.38
resident    1.38
humans      1.36
Americans   1.35
wh          1.35
immigrant   1.33
married     1.31
profits     1.31
Activations Density 0.067%
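
The density figure is usually read as the fraction of token positions on which the feature fires at all. A minimal sketch of that calculation follows, assuming "fires" means an activation strictly above zero; the function name, the threshold parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a position counts as active when its value is strictly
    greater than `threshold` (0.0 by default).
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 10,000 token positions, 7 of which activate the feature.
sample = [0.0] * 9993 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.1, 5.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

A density well below 1%, like the one reported here, would indicate a sparse feature that activates on only a small share of tokens.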