Explanations
The neuron detects mentions of immigration-related terms (visas, refugees, applicants, immigrants) and specific country names.
Negative Logits
Token          Logit
blank          -0.07
bert           -0.07
اها            -0.06
Maintenance    -0.06
mash           -0.06
خلف            -0.06
.staff         -0.06
小             -0.06
_gray          -0.06
Merit          -0.06
Positive Logits
Token          Logit
-sizing        0.06
intrinsic      0.06
Missile        0.06
fprintf        0.06
jur            0.06
리에           0.06
Edison         0.06
辰             0.06
.confirm       0.06
Chem           0.06
Activations Density: 0.021%