INDEX
Explanations
references to the United States or its abbreviation, USA, in the text
mentions of the United States
Negative Logits
Token           Logit
rog             -0.72
Wo              -0.69
experimental    -0.65
bridges         -0.65
observational   -0.64
hanging         -0.63
complex         -0.63
narr            -0.61
marks           -0.61
intersection    -0.61
Positive Logits
Token           Logit
USA             4.24
USA             2.13
usa             1.61
US              1.56
United          1.41
UK              1.39
Canada          1.30
America         1.26
Australia       1.26
SPA             1.17
Activations Density: 0.012%