Explanations
references to historical figures and establishments in community or business contexts
Negative Logits
Token       Logit
Deniz       -0.18
Manitoba    -0.16
shine       -0.16
çĸĨ         -0.15
rael        -0.15
Alps        -0.15
Sahara      -0.15
apos        -0.15
Tibet       -0.15
Alberta     -0.15
Positive Logits
Token       Logit
Dutch       0.28
Malay       0.25
Malays      0.24
Stra        0.23
Chinese     0.22
Java        0.22
Mal         0.21
Stamford    0.20
Chinese     0.20
Mal         0.20
Activations Density: 0.031%