INDEX
Explanations
references to Indian culture and cuisine
India and Indian ethnicity
Negative Logits
belea       -0.35
oare        -0.31
rekening    -0.30
redenen     -0.30
proef       -0.30
actie       -0.28
baute       -0.28
佳          -0.28
aktivieren  -0.27
bleven      -0.27
POSITIVE LOGITS
Indian      0.94
India       0.93
Indian      0.93
India       0.91
Indien      0.90
ſelf        0.87
INDIA       0.84
Jefus       0.84
印度        0.84
indian      0.81
Activations Density 0.309%