INDEX
Explanations
references to asylum seekers
Negative Logits
Token   Logit
ories   -0.75
UT      -0.72
IP      -0.72
IPM     -0.70
Shore   -0.70
ODE     -0.70
AST     -0.68
APS     -0.67
ival    -0.66
OD      -0.66
Positive Logits
Token       Logit
seekers     1.57
seekers     1.11
seeker      1.07
claimants   0.94
applicants  0.89
seeking     0.81
anamo       0.81
eker        0.79
detainees   0.77
hikers      0.77
Activations Density: 0.017%