Explanations
This neuron detects mentions of racial or ethnic group labels (e.g., “African American,” “Asian”) in census-style demographic listings.
Negative Logits
σιο        -0.07
玛         -0.07
filtro     -0.06
těž        -0.06
мног       -0.06
ьогодні    -0.06
($("#      -0.06
ixture     -0.06
ف          -0.06
uran       -0.06
Positive Logits
.ToString     0.07
番組          0.06
лог           0.06
_assigned     0.06
Agent         0.06
definitive    0.06
Decorator     0.06
Spacer        0.06
Banc          0.06
như           0.06
Activations Density: 0.001%