INDEX
Explanations
references to Jewish identity and culture
Negative Logits
tings       -0.17
aż          -0.15
Judaism     -0.15
ert         -0.15
щик         -0.14
isin        -0.14
catholic    -0.14
exact       -0.14
abr         -0.14
jewish      -0.14
Positive Logits
ness         0.29
-Christian   0.25
-owned       0.21
Feder        0.21
-major       0.20
-Owned       0.19
-American    0.18
ly           0.18
feder        0.18
-Muslim      0.18
Activations Density: 0.016%