INDEX
Explanations
references to wealth or wealthy individuals
references to wealth and wealthy individuals
Negative Logits
Token      Logit
uality     -0.80
Airl       -0.79
yrinth     -0.78
PRESS      -0.75
DOC        -0.74
shows      -0.72
uberty     -0.71
WAYS       -0.69
bsp        -0.68
pty        -0.67
Positive Logits
Token        Logit
earners      1.06
Asians       0.88
donors       0.86
landowners   0.82
richer       0.81
lier         0.80
wealthier    0.80
elites       0.79
citiz        0.79
Institution  0.79
Activations Density: 0.026%