Explanations
mentions of geographical locations, specifically Afghanistan
mentions of Afghanistan and related terms
Negative Logits
creen         -0.81
yss           -0.79
vous          -0.74
Hunt          -0.72
Collider      -0.72
constitu      -0.70
*/(           -0.67
ometimes      -0.66
ynt           -0.66
philos        -0.65

Positive Logits
istan          1.19
ghan           1.06
Afghanistan    1.04
Afghan         1.03
Kabul          1.02
Taliban        1.00
Albania        0.96
Afgh           0.96
Afghans        0.95
istani         0.93
Activations Density 0.041%
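
As a rough illustration of how a density figure like this can be read, the sketch below treats activation density as the fraction of sampled token positions on which the feature fires. That definition is an assumption, not something this page states, and the function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature is
    active at all; the page itself does not define the term.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, 4 of which fire.
toy = [0.0] * 9996 + [0.7, 1.3, 0.2, 2.1]
density = activation_density(toy)
print(f"{density:.3%}")
```

Under this reading, a density of 0.041% would mean the feature is active on roughly 4 of every 10,000 tokens, which fits a feature tied to a narrow topic.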