INDEX
Explanations
references to military actions and international conflicts
Negative Logits
à¸ķ      -0.15
sublic   -0.15
sett     -0.15
289      -0.15
isen     -0.15
Mao      -0.14
auty     -0.14
amnesty  -0.14
pale     -0.14
anka     -0.14
Positive Logits
Sole           0.27
Iranians       0.25
Iranian        0.25
Horm           0.24
Strait         0.24
Tehran         0.22
Iran           0.20
Revolutionary  0.19
IR             0.18
Pompeo         0.18
Activations Density 0.049%