INDEX
Explanations
references to Jewish history and identity
Negative Logits
apan        -0.17
ntl         -0.16
uda         -0.16
_locals     -0.15
afka        -0.15
uby         -0.14
erken       -0.14
urre        -0.14
urret       -0.14
Snapshot    -0.14
Positive Logits
immigration    0.45
immigrants     0.42
immigr         0.41
Immigration    0.39
immigrant      0.36
migration      0.33
Imm            0.32
em             0.32
migration      0.30
Migration      0.30
Activations Density: 0.164%