INDEX
Explanations
references to xenophobia or related terms
Negative Logits
DRAG       -0.76
MENTS      -0.74
captcha    -0.73
LOAD       -0.72
theless    -0.72
IRD        -0.72
LINE       -0.71
LEASE      -0.70
HEAD       -0.69
UGH        -0.69
Positive Logits
ophobic    1.60
ophobia    1.44
ophon      1.28
ophob      1.25
ocide      1.25
omorph     1.17
obia       1.09
etics      1.04
opolis     1.02
obl        1.00
Activations Density: 0.002%