Explanations
proper nouns indicating a person's birthplace
references to people’s places of birth
Negative Logits
phis        -0.98
phasis      -0.94
awaru       -0.86
qqa         -0.80
olicy       -0.80
raviolet    -0.77
eredith     -0.76
uyomi       -0.76
ivably      -0.74
erguson     -0.74
Positive Logits
ness        0.85
lings       0.80
nesses      0.79
born        0.74
ling        0.70
aways       0.69
iste        0.68
stellar     0.66
abroad      0.65
éļ          0.64
Activations Density 0.023%