Explanations
references to Jewish identity and culture
Negative Logits
resco   -0.16
topl    -0.16
syst    -0.15
SSIP    -0.14
erdale  -0.14
icopt   -0.14
dbl     -0.14
pard    -0.14
hooks   -0.14
outer   -0.14
Positive Logits
Factor  0.16
Band    0.15
och     0.15
uali    0.15
oram    0.15
tích    0.14
sm      0.14
self    0.14
tz      0.14
esh     0.14
Activations Density 0.030%