INDEX
Explanations
mentions of the word "American" in the text
references to American culture and identity
Negative Logits
Token     Logit
*/(       -0.75
heed      -0.73
orders    -0.68
RH        -0.67
razil     -0.67
NB        -0.66
aird      -0.64
_>        -0.63
KR        -0.62
order     -0.61
Positive Logits
Token     Logit
Airlines  1.20
Samoa     1.15
Idol      1.05
ICAN      0.94
Express   0.94
icus      0.90
Sniper    0.88
Indians   0.86
ized      0.85
Horror    0.85
Activations Density: 0.067%