Explanations
references to Iran and its political context
Negative Logits
Token            Logit
ering            -0.18
gle              -0.17
aign             -0.16
les              -0.15
gy               -0.15
geh              -0.15
ायत              -0.15
cing             -0.15
going            -0.15
بوابة            -0.14
Positive Logits
Token            Logit
ian              0.27
Revolutionary    0.22
ophobia          0.20
ians             0.20
ious             0.19
(IR              0.19
anian            0.19
Tehran           0.19
inan             0.19
IAN              0.18
Activations Density 0.014%