Explanations
proper nouns related to politics and current affairs
occurrences of the word "Washington."
Negative Logits
Token       Logit
order       -0.74
ordering    -0.69
gger        -0.69
lass        -0.63
compan      -0.62
otropic     -0.62
aster       -0.61
hang        -0.60
odic        -0.60
asters      -0.60
Positive Logits
Token       Logit
ASHINGTON   1.37
WASHINGTON  1.35
aukee       1.03
Washington  0.95
ukong       0.95
ashtra      0.93
DC          0.88
ITED        0.87
DERR        0.86
sburgh      0.86
Activations Density: 0.005%
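An activation density is usually read as the fraction of sampled tokens on which the feature fires, so 0.005% corresponds to about 5 tokens in every 100,000. The following is a minimal sketch under that reading. The function name `activation_density`, the zero threshold, and the synthetic activations are all assumptions for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds the threshold."""
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)

# Synthetic example: 5 active tokens out of 100,000.
acts = [0.0] * 99_995 + [1.2] * 5
density = activation_density(acts)
print(f"{density:.3%}")
```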