Explanations
references to different countries and their governmental entities, such as the United States and its abbreviations
references to the United States and its designation in various contexts
Negative Logits
fw          -0.79
antry       -0.67
STATS       -0.67
baugh       -0.65
govtrack    -0.64
ault        -0.62
theless     -0.62
accidents   -0.61
votes       -0.60
SERVICE     -0.60
Positive Logits
embassy     0.83
Embassy     0.82
rat         0.77
pav         0.72
Samoa       0.70
Indies      0.69
Rubin       0.68
oen         0.68
consulate   0.67
diplomatic  0.66
Activations Density: 0.170%