INDEX
Explanations
phrases related to white supremacy
references to white supremacy and related extremist groups
Negative Logits
Token         Logit
fixed         -0.72
Glass         -0.72
MAC           -0.72
inet          -0.71
Sleep         -0.69
zl            -0.67
borne         -0.67
Body          -0.66
Mount         -0.66
Lev           -0.66
Positive Logits
Token         Logit
supremacist   0.97
supremacists  0.94
suprem        0.92
guiActiveUn   0.86
Klux          0.83
manifesto     0.78
sympath       0.78
prejudice     0.76
swast         0.76
ideology      0.76
Activations Density: 0.026%