INDEX
Explanations
references to immigration enforcement and related legal issues
Negative Logits
Token     Logit
EFR       -0.15
ltra      -0.15
esome     -0.15
yme       -0.14
ipl       -0.14
ctor      -0.14
åŁĭ       -0.14
ibling    -0.13
ç         -0.13
reb       -0.13
Positive Logits
Token          Logit
ICE            0.50
immigration    0.46
Immigration    0.46
ICE            0.41
Imm            0.39
deportation    0.37
Imm            0.36
immigrants     0.36
immigrant      0.36
imm            0.32
Activations Density: 0.013%