INDEX
Explanations
references to anti-Semitism and related accusations
Negative Logits
Token            Logit
uger             -0.16
AssignableFrom   -0.15
iller            -0.15
insurgency       -0.14
vale             -0.14
strconv          -0.13
inand            -0.13
Epidemi          -0.13
WXYZ             -0.13
wend             -0.13
Positive Logits
Token            Logit
Israel           0.38
Palestine        0.35
BDS              0.35
Israeli          0.35
Palestinian      0.34
Zion             0.33
Zionist          0.32
Pale             0.32
Israel           0.31
CAMERA           0.30
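As a rough illustration of how logit lists like the two above can be produced, the sketch below projects a feature direction through an unembedding matrix and keeps the k most negative and k most positive tokens. This assumes the dashboard follows the common practice of scoring each vocabulary token by the dot product of the feature's direction with that token's unembedding column; the source does not confirm this. The function names, the toy vocabulary, and all numbers in the example are hypothetical and are not taken from this feature.

```python
def logit_effects(direction, unembed, vocab):
    """Score every token by projecting a feature direction through an unembedding matrix.

    direction: list of d floats (hypothetical feature direction)
    unembed:   d rows, each a list of V floats (hypothetical unembedding matrix)
    vocab:     list of V token strings
    """
    effects = []
    for j, token in enumerate(vocab):
        # Dot product of the direction with column j of the unembedding matrix.
        score = sum(direction[i] * unembed[i][j] for i in range(len(direction)))
        effects.append((token, score))
    return effects


def top_and_bottom(effects, k):
    """Return (k most negative, k most positive) as lists of (token, score) pairs."""
    ranked = sorted(effects, key=lambda pair: pair[1])
    return ranked[:k], ranked[::-1][:k]


# Toy data, purely illustrative.
toy_vocab = ["a", "b", "c", "d"]
toy_direction = [1.0, 0.5]
toy_unembed = [
    [0.2, -0.1, 0.0, 0.3],
    [0.1, -0.1, 0.2, 0.1],
]

negative, positive = top_and_bottom(
    logit_effects(toy_direction, toy_unembed, toy_vocab), k=1
)
```

With this toy data, the single most negative token is "b" and the single most positive token is "d".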
Activations Density: 0.099%
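A density of 0.099% means the feature fires on roughly one token in a thousand. The sketch below shows one way such a figure could be computed, assuming density is defined as the fraction of sampled tokens on which the feature's activation exceeds a threshold (zero here). The definition, the function name, and the sample data are assumptions for illustration and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation is strictly above the threshold."""
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 10,000 tokens, of which 10 activate the feature.
sample = [0.0] * 9990 + [1.5] * 10
density = activation_density(sample)
density_percent = density * 100  # expressed as a percentage, like the figure above
```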