INDEX
Explanations
mentions of the country "Saudi Arabia."
references to Saudi Arabia
Negative Logits
onym            -0.74
llo             -0.71
otin            -0.69
aminer          -0.68
ntil            -0.68
gotten          -0.66
HAEL            -0.66
GoldMagikarp    -0.66
ordable         -0.64
ially           -0.64
Positive Logits
Arabia          1.51
Arabian         1.25
Aram            0.94
doms            0.92
Abdullah        0.86
Saud            0.85
Salman          0.84
Abdul           0.84
Riy             0.81
Sheikh          0.80
Activations Density 0.026%
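
The density figure above is not defined on this page. A common reading, assumed here, is the fraction of sampled tokens on which the feature has a nonzero activation. The sketch below shows that calculation on made-up data; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and do not come from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all; the dashboard may use a different definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, of which 3 activate the feature.
sample = [0.0] * 9997 + [1.2, 0.4, 3.1]
density = activation_density(sample)
print(f"Activations Density {density:.3%}")
```

Under this reading, a density of 0.026% would mean the feature fires on roughly 26 of every 100,000 sampled tokens, which fits a feature tied to one specific topic.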