INDEX
Explanations
names of political leaders and figures, particularly related to Iran
references to political leaders and their authority
Negative Logits
Token      Logit
ment       -0.82
Oregon     -0.72
horn       -0.69
leg        -0.69
mented     -0.68
Dover      -0.68
Enchant    -0.66
Ops        -0.66
Sussex     -0.65
Delaware   -0.65
Positive Logits
Token        Logit
enei         1.23
pour         0.98
nis          0.94
ollah        0.84
Seym         0.82
dissidents   0.81
Kham         0.78
rpm          0.76
idium        0.76
xual         0.72
Activations Density: 0.056%