Explanations
references to India and Indian identity
Negative Logits
Token       Logit
zard        -0.49
wsp         -0.47
tests       -0.46
zanne       -0.45
zow         -0.44
zat         -0.44
ventes      -0.42
zwungen     -0.42
atorium     -0.41
lucent      -0.41
Positive Logits
Token       Logit
India       1.05
India       0.98
india       0.90
india       0.88
Indian      0.85
INDIA       0.85
Indien      0.85
Indian      0.84
indian      0.80
indian      0.78
Activations Density: 0.008%
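
The density figure can be read as the share of token positions on which this feature fires at all. The sketch below is a minimal illustration under that assumption; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens on which the feature
    is active (activation > threshold). The dashboard may define it differently.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: the feature fires on 8 of 100,000 token positions.
acts = [0.0] * 99_992 + [1.3] * 8
density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.008%
```

A density this low is consistent with a narrow feature that responds only to a small family of tokens, such as the India-related tokens listed under Positive Logits.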