INDEX
Explanations
proper nouns specifically related to the city of Washington
mentions of Washington, D.C.
Negative Logits
gger        -0.80
Torrent     -0.73
Boo         -0.69
ongo        -0.68
order       -0.67
Redditor    -0.67
eworld      -0.67
complete    -0.65
hang        -0.64
lass        -0.64
Positive Logits
ASHINGTON    1.09
aukee        0.97
NESS         0.97
WASHINGTON   0.97
STATE        0.93
STATES       0.92
ashtra       0.88
GOODMAN      0.87
ADA          0.85
DC           0.84
Activations Density 0.006%
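
The density figure above can be read as the share of sampled tokens on which the feature fires. The following is a minimal sketch, assuming density means the percentage of tokens whose activation is above zero. The function name, the threshold parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature is active; the dashboard's exact definition is not stated.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 6 active tokens out of 100,000.
sample = [0.0] * 99_994 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.7]
density = activation_density(sample)
print(f"{density:.3f}%")
```

A density this low would mean the feature is silent on almost all text and fires only on a narrow set of tokens, which fits the explanations listed above.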