Explanations
references to Israel, Jews, and Zionism
Negative Logits
@"/          -0.43
PEAT         -0.40
"${          -0.39
sizeCache    -0.38
="${         -0.36
("${         -0.36
exitRule     -0.36
simi         -0.36
Rust         -0.35
Zum          -0.34
Positive Logits
Israel       1.33
Israel       1.26
ISRAEL       1.19
Israeli      1.17
israel       1.13
Israël       1.05
Israeli      1.04
Israelis     1.04
Jewish       1.02
israel       1.02
Activations Density 0.276%