INDEX
Explanations
names of locations or institutions, specifically related to news articles
occurrences of the word "Washington."
Negative Logits
gger      -0.74
order     -0.74
fm        -0.70
{*        -0.66
Boo       -0.65
lass      -0.64
loss      -0.63
hist      -0.63
Occ       -0.63
eworld    -0.63
Positive Logits
ASHINGTON     1.18
aukee         1.12
WASHINGTON    0.96
ashtra        0.95
DC            0.87
MENTS         0.85
NESS          0.84
ADA           0.83
STATE         0.82
nesday        0.80
Activations Density 0.007%