INDEX
Explanations
information about government, legislation, and political events
references to government entities or organizations
Negative Logits
decomp       -0.73
Negro        -0.73
paperback    -0.68
negro        -0.65
regul        -0.65
triv         -0.65
whiff        -0.65
puff         -0.64
readings     -0.64
reprodu      -0.64
Positive Logits
ï¸ı          1.10
SHIP         0.90
taboola      0.86
ï¸           0.85
ship         0.84
VICE         0.84
ATH          0.82
SHA          0.81
IVERS        0.80
STEM         0.79
Activations Density 0.471%