INDEX
Explanations
references to the United States and its variations
Negative Logits
Tikang        -0.79
chtigkeit     -0.78
ClientSize    -0.72
BORROW        -0.72
gdx           -0.70
DbType        -0.67
évaluateur    -0.64
Mombasa       -0.63
jor           -0.63
Chomsky       -0.63
POSITIVE LOGITS
United        1.50
United        1.31
UNITED        1.24
UNITED        1.20
united        0.93
Unite         0.87
united        0.86
tieth         0.85
المتحدة       0.84
vált          0.82
Activations Density: 0.074%