INDEX
Explanations
references to the country Afghanistan
mentions of Afghanistan and related geopolitical contexts
Negative Logits
Token       Logit
Hunt        -0.81
vous        -0.77
ateur       -0.76
aturally    -0.72
tower       -0.70
abet        -0.69
ozyg        -0.69
LOS         -0.68
ection      -0.68
erd         -0.67
Positive Logits
Token         Logit
Afghanistan   1.18
istan         1.12
Kabul         1.09
ghan          1.02
Albania       1.00
Afghans       0.99
Afgh          0.98
Afghan        0.98
Taliban       0.93
Pakistan      0.92
Activations Density: 0.006%