INDEX
Explanations
occurrences of the word "India" in various contexts
Head Attr Weights
0:0.02
1:0.29
2:0.02
3:0.18
4:0.02
5:0.09
6:0.02
7:0.14
8:0.09
9:0.01
10:0.03
11:0.02
Negative Logits
glim        -2.98
depending   -2.79
oresc       -2.64
spe         -2.62
bage        -2.38
illet       -2.35
vous        -2.33
ヴァ        -2.29
AGES        -2.28
pron        -2.22
Positive Logits
Secondly    3.25
Mongolia    2.53
secondly    2.39
DPRK        2.35
Mysterious  2.30
Pakistan    2.28
Malays      2.21
Secondly    2.14
Mald        2.14
Nepal       2.14
Activations Density: 0.003%