Explanations
references to locations within Washington, D.C.
references to Washington, D.C.
Negative Logits
Token         Logit
Bulg          -0.82
Ambro         -0.71
conversions   -0.70
Canaver       -0.68
terday        -0.66
Aph           -0.66
Penguin       -0.66
Boko          -0.65
Zoro          -0.65
cules         -0.65
Positive Logits
Token         Logit
isco          1.22
etermination  1.06
etermined     1.02
etermin       1.01
olph          0.99
uh            0.99
enton         0.99
immer         0.98
edu           0.97
aug           0.97
Activations Density: 0.015%