INDEX
Explanations
mentions or references to the country India
instances of the word "India."
Negative Logits
Token       Value
yright      -0.97
umbnails    -0.91
htaking     -0.82
olitan      -0.81
esters      -0.81
umbn        -0.80
cies        -0.79
ynt         -0.77
iliary      -0.77
ients       -0.76
Positive Logits
Token       Value
India       1.12
Pradesh     1.09
India       1.08
Sharma      1.06
Pakistan    1.01
Pakistan    1.01
Hindus      1.00
Modi        0.98
Kashmir     0.94
Punjab      0.92
Activations Density: 0.017%