Explanations
references to the country "Pakistan."
occurrences of the word "Pakistan."
Negative Logits
mble      -0.74
llo       -0.69
Lys       -0.67
LU        -0.67
Merrill   -0.63
llan      -0.62
umbnails  -0.62
ĵ         -0.61
Bucc      -0.61
ansk      -0.60
Positive Logits
istani    1.51
awar      0.99
istan     0.99
abad      0.94
pour      0.85
adesh     0.84
Sharif    0.82
is        0.80
Taliban   0.79
Pakistan  0.78
Activations Density 0.036%